Stop paying for speech-to-text. Local, open-source.

TypeWhisper is an open-source, on-device speech-to-text application for macOS and Windows. It converts speech to text without transmitting audio to remote servers, so dictation stays private and works offline. The software targets professionals, developers, writers, educators, and accessibility users who need reliable, secure, subscription-free dictation in native applications.
Unlike cloud-based alternatives, TypeWhisper runs entirely on the user's device using locally executed AI models. It requires no internet connection during transcription, no account creation, and no telemetry or data collection. Its GPLv3 license guarantees transparency and freedom to inspect, modify, and redistribute the source code.
TypeWhisper operates through a three-step workflow. First, the user triggers recording with a configurable global keyboard shortcut that works in every application. Second, the selected on-device AI engine captures and processes the speech in real time, with an optional streaming preview (available with WhisperKit). Third, the transcribed text is inserted automatically into the currently focused text input field, completing the dictation cycle without manual copy-paste.
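
The three steps above can be pictured as one small function with three pluggable stages. This is only an illustrative sketch, not TypeWhisper's actual code: the function name `dictate_once` and its three callables are hypothetical, and the stand-ins at the bottom replace the real microphone, engine, and text field.

```python
from typing import Callable, List


def dictate_once(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    insert_text: Callable[[str], None],
) -> str:
    """Run one dictation cycle: capture audio, transcribe it, insert the text."""
    audio = record()                   # step 1: started by the global shortcut
    text = transcribe(audio).strip()   # step 2: on-device engine, no network
    if text:                           # nothing to insert for silence
        insert_text(text)              # step 3: goes to the focused text field
    return text


# Stand-ins for the real stages (all hypothetical).
typed: List[str] = []
result = dictate_once(
    record=lambda: b"\x00\x01",
    transcribe=lambda audio: " hello world ",
    insert_text=typed.append,
)
```

Keeping the stages as plain callables means each one can be swapped or tested separately, for example by replacing the engine without touching the insertion step.
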
Beyond live dictation, TypeWhisper supports file-based transcription: users can drag audio or video files directly into the application window to generate transcripts with precise timestamps. The local HTTP API exposes endpoints on localhost for programmatic access, enabling custom integrations without external dependencies. Engine selection, model management (manual download for WhisperKit and Parakeet; automatic for Apple Speech), and profile configuration are handled through the application’s native interface.
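
As a rough illustration of programmatic access, the sketch below builds a file-upload request aimed at the local API. The port `8978`, the path `/v1/transcribe`, and the form field name `file` are assumptions made for this example only; none of them is specified in this description, so check the application's own API settings for the real values.

```python
import uuid
import urllib.request
from urllib.parse import urlparse

# Hypothetical values: the real port and path are not given in this document.
BASE_URL = "http://127.0.0.1:8978"
TRANSCRIBE_PATH = "/v1/transcribe"

LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def build_transcribe_request(
    audio: bytes, filename: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST carrying one audio file."""
    host = urlparse(base_url).hostname
    if host not in LOOPBACK_HOSTS:
        # Guard against accidentally sending audio off the machine.
        raise ValueError(f"refusing non-local host: {host!r}")

    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # The field name "file" is an assumption.
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        base_url + TRANSCRIBE_PATH,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Passing the result to `urllib.request.urlopen` would send it. The response format is not described here, so any parsing code would need to follow the actual API output.
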
TypeWhisper serves practical use cases including hands-free documentation, accessibility support for users with mobility or visual impairments, multilingual note-taking, captioning for educational or internal video content, and developer tooling via its local API. Its per-app profile system allows tailored behavior: for example, using WhisperKit with Japanese translation in a messaging app while defaulting to Apple Speech for English in a word processor. File transcription supports content creators generating subtitles for podcasts or training videos. Because all processing occurs locally, it suits environments where data residency and confidentiality are mandatory, such as legal, healthcare, or government settings.
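
The per-app profile idea can be modeled as a lookup with a fallback. The sketch below is a hypothetical data model, not TypeWhisper's configuration format: the `Profile` fields, the app identifiers, and the language codes are all assumptions chosen to mirror the messaging-app and word-processor example above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Profile:
    engine: str
    language: Optional[str] = None              # None = not specified
    translation_language: Optional[str] = None  # None = no translation


# Fallback used when the focused app has no dedicated profile.
DEFAULT_PROFILE = Profile(engine="Apple Speech", language="en")

# Keys are made-up app identifiers, used only for illustration.
PROFILES: Dict[str, Profile] = {
    "com.example.messenger": Profile(engine="WhisperKit", translation_language="ja"),
    "com.example.wordprocessor": Profile(engine="Apple Speech", language="en"),
}


def profile_for(app_id: str) -> Profile:
    """Return the profile for the focused app, or the default profile."""
    return PROFILES.get(app_id, DEFAULT_PROFILE)
```

Calling `profile_for` with an unknown identifier returns `DEFAULT_PROFILE`, which keeps dictation usable in apps that were never configured.
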
| Feature | WhisperKit (Versatile) | Parakeet TDT v3 (Fast) | Apple Speech (Zero Setup) |
|---|---|---|---|
| Languages | 99+ | 25 European | ~40 |
| Streaming preview | Yes | No | No |
| Translation support | 20 languages | 20 languages | 20 languages |
| Speed | Fast | Up to 5× faster | Fast |
| Model size options | Tiny to Large v3 | 1.1B parameters | System-managed |
| Model download method | Manual in-app | Manual in-app | Automatic by macOS |
| Platform availability | macOS, Windows | macOS, Windows | macOS 26+ only |
| Best suited for | Multilingual use, translation, streaming | High-speed European-language transcription | Quick setup on compatible macOS systems |
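
One way to read the table is as a filter over engines. The following sketch encodes only the streaming, download, and platform rows as stated above. The `Engine` class and the `candidates` function are hypothetical helpers written for this illustration, and "macOS 26+" is interpreted as a minimum major version of 26.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Engine:
    name: str
    streaming_preview: bool
    manual_download: bool
    on_windows: bool
    min_macos: Optional[int]  # None = no minimum version stated in the table


ENGINES = (
    Engine("WhisperKit", True, True, True, None),
    Engine("Parakeet TDT v3", False, True, True, None),
    Engine("Apple Speech", False, False, False, 26),
)


def candidates(
    os_name: str,
    macos_major: Optional[int] = None,
    need_streaming: bool = False,
    allow_manual_download: bool = True,
) -> List[str]:
    """List engine names that satisfy the given constraints, in table order."""
    names = []
    for e in ENGINES:
        if os_name == "Windows" and not e.on_windows:
            continue
        if os_name == "macOS" and e.min_macos is not None:
            # An unknown macOS version is treated as too old for this engine.
            if macos_major is None or macos_major < e.min_macos:
                continue
        if need_streaming and not e.streaming_preview:
            continue
        if e.manual_download and not allow_manual_download:
            continue
        names.append(e.name)
    return names
```

For instance, `candidates("macOS", macos_major=26, need_streaming=True)` narrows the list to the one engine the table marks as having a streaming preview.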