
About Voca

Introduction to Voca

Voca is a desktop application designed for developers and technical professionals who require accurate, context-aware voice-to-text transcription. It addresses common limitations in general-purpose speech recognition tools—particularly their inability to correctly interpret technical terminology such as 'React', 'navbar', or 'props'. Built with developer workflows in mind, Voca integrates real-time AI correction directly into the typing environment, minimizing context switching and preserving focus during coding, documentation, or communication tasks.

The application processes speech locally or via user-provided API keys, ensuring privacy and data control. It supports both Windows and macOS and is distributed under the MIT open-source license. Voca is not a cloud-only service; it enables on-device operation where possible and gives users full ownership of their speech processing pipeline.

Key Takeaways

Real-time AI correction using Gemini 2.0 Flash to fix misrecognized words and technical terms (e.g., 'reakt' → 'React', 'nahbar' → 'navbar')
Dual STT engine support: Deepgram Nova-3 and Groq Whisper, with automatic language detection across 35+ languages
Context-aware translation into 100+ languages, with tone modes (Developer, Personal) that preserve technical English terms
Numeric add-on converts spoken numbers (e.g., 'twenty-five') into digits (e.g., '25')
Planning add-on structures dictated instructions into formatted lists (e.g., step-by-step tasks)
Fully open source (MIT license); audio processed only through user-controlled API keys—no third-party data collection or storage
Global keyboard shortcut triggers recording, AI correction, and auto-pasting at the cursor position without app switching
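
To make the numeric add-on from the list above concrete, here is a minimal sketch of converting spoken numbers to digits. It is not Voca's implementation: the function name `spoken_numbers_to_digits` and the word tables are assumptions made for illustration, and the sketch only covers English numbers from zero to ninety-nine.

```python
import re

# Hypothetical word tables; a real add-on would cover far more forms.
UNITS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

VALUES = {word: i for i, word in enumerate(UNITS)}
VALUES.update({word: 20 + 10 * i for i, word in enumerate(TENS)})

_ONES = "|".join(UNITS[1:10])                               # one..nine
_TENS = "|".join(TENS)
_SINGLE = "|".join(sorted(UNITS, key=len, reverse=True))    # longest first

# Either a tens word with an optional unit ("twenty-five", "twenty five"),
# or a single word from zero to nineteen. \b keeps "one" in "someone" intact.
PATTERN = re.compile(
    rf"\b(?:(?P<tens>{_TENS})(?:[-\s](?P<ones>{_ONES}))?|(?P<single>{_SINGLE}))\b",
    re.IGNORECASE,
)


def _to_digits(match: re.Match) -> str:
    """Turn one matched number phrase into its digit string."""
    if match.group("tens"):
        value = VALUES[match.group("tens").lower()]
        if match.group("ones"):
            value += VALUES[match.group("ones").lower()]
        return str(value)
    return str(VALUES[match.group("single").lower()])


def spoken_numbers_to_digits(text: str) -> str:
    """Replace spoken numbers (0-99) in text with digits."""
    return PATTERN.sub(_to_digits, text)


print(spoken_numbers_to_digits("set the timeout to twenty-five seconds"))
```

A purely rule-based pass like this also rewrites phrases such as "no one" to "no 1", which is one reason a context-aware correction step is useful alongside it.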

How Voca Works

Voca operates as a lightweight desktop agent. When activated via a configurable global shortcut, it begins listening and sends audio to the selected STT engine (Deepgram Nova-3 or Groq Whisper). The resulting transcript is passed through an AI layer powered by Gemini 2.0 Flash, which corrects grammar, syntax, and domain-specific vocabulary—especially technical jargon relevant to software development. The corrected text is then automatically pasted at the current cursor location in any application.
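
The flow described above (record, transcribe, correct, paste) can be sketched as a small pipeline of interchangeable stages. This is an illustrative outline under stated assumptions, not Voca's code: `DictationPipeline`, `glossary_correct`, and `GLOSSARY` are hypothetical names, the transcription stage is a stub rather than a call to Deepgram or Groq, and a fixed glossary stands in for the Gemini-based correction layer.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical glossary standing in for the AI correction layer.
GLOSSARY = {"reakt": "React", "nahbar": "navbar"}


def glossary_correct(text: str) -> str:
    """Replace known mis-transcriptions with the intended technical term."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in GLOSSARY) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: GLOSSARY[m.group(1).lower()], text)


@dataclass
class DictationPipeline:
    """Chains the three stages; each stage is a plain callable."""
    transcribe: Callable[[bytes], str]   # audio bytes -> raw transcript
    correct: Callable[[str], str]        # raw transcript -> corrected text
    paste: Callable[[str], None]         # delivers text to the cursor

    def run(self, audio: bytes) -> str:
        raw = self.transcribe(audio)
        fixed = self.correct(raw)
        self.paste(fixed)
        return fixed


# Usage with stubs: no microphone, network call, or clipboard is involved.
pasted = []
pipeline = DictationPipeline(
    transcribe=lambda audio: "add a reakt component to the nahbar",
    correct=glossary_correct,
    paste=pasted.append,
)
result = pipeline.run(b"")
print(result)
```

Keeping each stage behind a callable mirrors the idea of a selectable STT engine: swapping one engine for another changes only the `transcribe` argument.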

Users can optionally enable translation before pasting, selecting from formal, casual, or Developer tone modes. In Developer mode, technical terms remain in English while the surrounding content is translated. Additional add-ons handle numeric conversion and list generation. Audio is sent only to the STT provider the user selects, using the user's own API keys; it is not stored on or routed through Voca's servers.
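
One common way to keep technical terms untouched during translation is to mask them with placeholders, translate, and then restore them. The sketch below illustrates that general technique and is not a description of how Voca implements Developer mode. The names `translate_preserving_terms` and `TECH_TERMS` are hypothetical, and `str.upper` is used as a visible stand-in for a real translation call.

```python
import re
from typing import Callable, List, Sequence

# Hypothetical term list; a real tool would likely use a larger glossary.
TECH_TERMS = ["React", "navbar", "props"]


def translate_preserving_terms(
    text: str,
    translate: Callable[[str], str],
    terms: Sequence[str] = TECH_TERMS,
) -> str:
    """Translate text while leaving the listed technical terms unchanged."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b"
    )
    kept: List[str] = []

    def mask(match: re.Match) -> str:
        # Remember the original term and leave an indexed placeholder.
        kept.append(match.group(0))
        return f"__TERM{len(kept) - 1}__"

    masked = pattern.sub(mask, text)
    translated = translate(masked)
    # Put each original term back where its placeholder ended up.
    return re.sub(r"__TERM(\d+)__", lambda m: kept[int(m.group(1))], translated)


# str.upper is only a stand-in for a translator, so the effect is easy to see.
print(translate_preserving_terms("pass props to the navbar", str.upper))
```

This approach assumes the translator passes placeholders through unchanged, which a real translation model may not always do.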

Core Benefits and Applications

Voca streamlines repetitive text input tasks for developers, including writing code comments, drafting documentation, composing technical emails or Slack messages, and creating issue tickets or PR descriptions. Its ability to maintain technical accuracy reduces manual editing time and improves consistency in written output. The translation capability supports multilingual teams, enabling real-time composition of localized documentation or user-facing content. Because it works entirely within the user’s existing toolchain—without requiring copy-paste or tab switching—it integrates seamlessly into IDEs, editors, browsers, and collaboration platforms. The open-source nature and local-first architecture also make it suitable for regulated or security-sensitive environments where data residency and transparency are required.

Plan	Monthly Cost	Included Credits	Max Recording Size	Features
Pro	$3	$3	10 MB	All STT engines, translation, tone modes, numeric & planning add-ons
Max	$10	$10	25 MB	Same as Pro, with higher file size limit

Both plans include access to all features—no tiered feature restrictions.
