A personal desktop voice transcription app powered by Whisper, built with Svelte 5 and Tauri 2.
All transcription runs locally on your Mac using the Whisper large-v3-turbo model with Metal GPU acceleration: no cloud, no API calls, privacy first. Built entirely with Claude Code, with human review.
macOS only: built exclusively for Apple Silicon Macs, with no plans to support other platforms.
Features
- Local transcription — Whisper large-v3-turbo (GGML, quantized Q5_0) via Metal GPU, ~547MB model
- Dual-track recording — microphone + system audio (ScreenCaptureKit) with independent controls
- Global shortcut — `Cmd+Shift+R` to start/stop recording from anywhere
- System tray — lives in the menu bar with a live recording timer, no dock icon
- Chunked processing — 5-minute intervals with partial transcripts during recording
- Session recovery — automatic crash recovery with WAV chunks and manifest files
- Clipboard integration — transcription results copied to clipboard automatically
- Audio device selection — choose specific mic and system audio devices with live VU meters
- Configurable output — transcriptions saved as text files to a folder of your choice
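
To make the chunked-processing feature concrete, here is a minimal sketch of how a recording could be split into 5-minute chunks. It is an illustration only, not the app's actual code: the function name `chunk_ranges` is hypothetical, and it assumes chunking is done on mono samples at the 16 kHz rate that Whisper expects.

```rust
use std::ops::Range;

/// Split a recording of `total_samples` mono samples into consecutive
/// chunks of at most `chunk_secs` seconds each. The last chunk may be shorter.
/// Hypothetical helper for illustration; not taken from the project source.
fn chunk_ranges(total_samples: usize, sample_rate: usize, chunk_secs: usize) -> Vec<Range<usize>> {
    let chunk_len = sample_rate * chunk_secs;
    assert!(chunk_len > 0, "chunk length must be non-zero");

    let mut ranges = Vec::new();
    let mut start = 0;
    while start < total_samples {
        // Clamp the final chunk to the end of the recording.
        let end = (start + chunk_len).min(total_samples);
        ranges.push(start..end);
        start = end;
    }
    ranges
}
```

For example, `chunk_ranges(16_000 * 12 * 60, 16_000, 5 * 60)` splits a 12-minute recording into three ranges covering 5, 5, and 2 minutes. Each completed range could then be written out as a WAV chunk and transcribed while recording continues.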
Stack
| Layer | Technology |
|---|---|
| Frontend | Svelte 5 (runes), SvelteKit, TypeScript |
| Backend | Tauri 2 (Rust) |
| Transcription | whisper-rs (whisper.cpp bindings) with Metal GPU |
| Audio capture | cpal (ScreenCaptureKit fork for system audio) |
| Audio processing | rubato (resampling to 16kHz), hound (WAV I/O) |
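
As a rough illustration of the resampling step in the table above, the sketch below converts mono audio to a new sample rate using plain linear interpolation. This is a deliberately simplified stand-in written against the standard library only; it is not rubato's API, and rubato's resamplers are more sophisticated than this. The function name `resample_linear` is hypothetical.

```rust
/// Resample mono `input` from `from_hz` to `to_hz` using linear interpolation.
/// Simplified stand-in for a real resampler; hypothetical helper, not project code.
fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    // Output length scales with the ratio of the two rates (rounded down).
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    // Distance travelled in the input for each output sample.
    let step = from_hz as f64 / to_hz as f64;
    let last = input.len() - 1;

    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = pos.floor() as usize;
            let frac = (pos - idx as f64) as f32;
            // Clamp indices so the final samples never read past the end.
            let a = input[idx.min(last)];
            let b = input[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}
```

Calling `resample_linear(&samples, 48_000, 16_000)` on one second of 48 kHz audio yields 16,000 samples. Linear interpolation does no anti-alias filtering, which is one reason a dedicated resampling library is the better choice in practice.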
Privacy
Privacy is a core value of this project. Koko Whisper is designed to work entirely offline — your audio and transcriptions never leave your machine.
- All audio is captured and processed locally
- Transcription runs on-device via Whisper large-v3-turbo with Metal GPU acceleration
- Transcriptions are saved as plain text files to a local folder
- No analytics, no tracking, no accounts, no sign-up