Speak to type, command, and control your computer. Yak turns your voice into polished text and actions with AI editing across 100+ languages.
Yak is a desktop voice interface that converts your speech into text and commands across any application on your Mac. Unlike basic speech-to-text tools, Yak uses multimodal AI to automatically polish your voice input — removing filler words, fixing self-corrections, and formatting numbers and symbols correctly. You speak naturally, and Yak handles the cleanup. It also adapts its output style based on which app you're using, switching between casual iMessage tone and formal email formatting without you lifting a finger.
AI Auto-Edit for Clean Output
Yak doesn't just transcribe — it refines. When you speak with natural pauses, hesitations, and corrections, Yak's AI automatically removes filler words like "um" and "uh," fixes self-corrections on the fly, and properly formats numbers, symbols, and special characters. You get polished text ready to send, without manual editing.
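As a rough illustration of what filler-word removal means, here is a minimal rule-based sketch. It is not Yak's method: the source says Yak uses AI for this cleanup, and the filler list and the function name `strip_fillers` below are assumptions made for the example.

```python
import re

# Hypothetical filler list for illustration only; Yak's real cleanup is
# described as AI-driven, not a fixed word list.
FILLERS = ("um", "uh", "er", "ah")

# Match a whole filler word, an optional trailing comma or period,
# and any whitespace that follows it.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(FILLERS) + r")\b[,.]?\s*",
    flags=re.IGNORECASE,
)


def strip_fillers(text: str) -> str:
    """Remove standalone filler words and collapse leftover whitespace."""
    cleaned = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(strip_fillers("So um I think uh we should ship it"))
```

A fixed list like this cannot handle self-corrections or number formatting, which is presumably why the product relies on a language model instead.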
Adaptive Styles Per Application
The tool detects which app is active and adjusts its output accordingly. Casual messaging in iMessage stays casual. Email in Mail.app gets proper formatting. Documents in Notion follow document conventions. You can also create custom styles and fine-tune which style each application uses.
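The per-app behavior described above amounts to a lookup from the active application to a style, with user overrides taking precedence. The sketch below shows that idea only; the style names, the app keys, and `resolve_style` are hypothetical and do not reflect Yak's actual configuration format.

```python
from typing import Dict, Optional

# Hypothetical defaults mirroring the examples in the text.
DEFAULT_STYLE = "neutral"
BUILTIN_STYLES: Dict[str, str] = {
    "iMessage": "casual",
    "Mail.app": "formal",
    "Notion": "document",
}


def resolve_style(active_app: str,
                  overrides: Optional[Dict[str, str]] = None) -> str:
    """Pick a style for the active app; user overrides win over defaults."""
    if overrides and active_app in overrides:
        return overrides[active_app]
    return BUILTIN_STYLES.get(active_app, DEFAULT_STYLE)


print(resolve_style("Mail.app"))                          # built-in mapping
print(resolve_style("Mail.app", {"Mail.app": "casual"}))  # user override
print(resolve_style("Terminal"))                          # falls back to default
```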
AI Command Mode for Transformations
Select any text in any app, then speak a command to transform it. Make it shorter, longer, change the tone, translate to another language, or rewrite entirely. This works as your personal AI editor that lives inside every application you use.
Personal Dictionary
Yak learns the names, terms, and expressions that matter to you. It auto-learns from your usage, or you can add entries manually. This vocabulary carries across all apps and contexts, reducing recognition errors for domain-specific terms.
Privacy-First Architecture
Your voice data is never stored, trained on, or reviewed. Audio is sent to Vertex AI for processing and discarded immediately afterward, with zero retention. History is stored only locally on your device. Yak states that it is compliant with SOC 2, ISO 27001, and HIPAA standards.
Yak targets Mac users who spend significant time typing and want to speed up their workflow through voice. The homepage specifically appeals to developers doing "vibe coding," but the tool works for anyone who writes emails, messages, documents, or code.
Yak offers two tiers. The Free plan costs $0 forever and includes 100 cloud AI requests per week, standard processing, custom dictionary, BYOK mode, 100+ languages, proxy relay access, and support for up to 3 devices with a 5-minute maximum per recording. Yak Pro is $12 per month when billed annually ($144/year) or $15 per month for monthly billing. Pro includes unlimited cloud AI, priority processing, and priority support. A 30-day Pro trial is available with no credit card required.
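The Pro pricing can be checked with a few lines of arithmetic. The figures come from the paragraph above; the variable names are illustrative.

```python
# Prices as quoted above, in US dollars per month.
ANNUAL_PLAN_MONTHLY_RATE = 12
MONTHLY_PLAN_RATE = 15
MONTHS_PER_YEAR = 12

annual_plan_total = ANNUAL_PLAN_MONTHLY_RATE * MONTHS_PER_YEAR   # 144
monthly_plan_total = MONTHLY_PLAN_RATE * MONTHS_PER_YEAR         # 180
yearly_savings = monthly_plan_total - annual_plan_total          # 36
savings_fraction = yearly_savings / monthly_plan_total           # 0.2

print(f"Annual billing:  ${annual_plan_total}/year")
print(f"Monthly billing: ${monthly_plan_total}/year")
print(f"Savings: ${yearly_savings}/year ({savings_fraction:.0%})")
```

Annual billing therefore works out to $36 less per year than paying month to month, a 20% discount.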
Yak requires macOS 12 (Monterey) or newer and works on both Apple Silicon (M1-M4) and Intel Macs. Download the installer from the website, install the app, and press Space to start recording. The browser demo lets you try basic voice-to-email functionality before downloading. Windows and Linux versions are in development.
The homepage claims speaking is 4x faster than typing, citing 220 words per minute with Yak versus 45 words per minute on a keyboard. Taken at face value, those two figures give a ratio closer to 4.9x. The claimed saving is roughly one day per week for heavy users, though your actual results will depend on your speaking speed, accent, and workflow.
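To see what those numbers imply, the sketch below computes the speed ratio and the weekly time saved for a given amount of typing. It assumes the quoted rates hold for all of a user's text and ignores thinking and review time, so treat it as an upper bound; the function names are illustrative.

```python
# Rates quoted on the homepage, in words per minute.
SPEAK_WPM = 220
TYPE_WPM = 45


def speedup() -> float:
    """Ratio of dictation speed to typing speed."""
    return SPEAK_WPM / TYPE_WPM


def hours_saved(typing_hours_per_week: float) -> float:
    """Hours saved per week if all typed text were dictated instead."""
    words = typing_hours_per_week * 60 * TYPE_WPM
    speaking_hours = words / SPEAK_WPM / 60
    return typing_hours_per_week - speaking_hours


print(f"Speedup: {speedup():.2f}x")
print(f"Saved at 10 h/week of typing: {hours_saved(10):.2f} h")
```

Under these assumptions, someone who types about 10 hours a week would save just under 8 hours, which is roughly where the "one day per week" figure lands.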
Removes filler words, fixes self-corrections, and formats numbers and symbols automatically — get polished text from natural speech without manual cleanup.
Automatically switches output style based on the active app — casual for iMessage, formatted for email, structured for documents — with custom style support.
Select any text and speak to transform it — shorten, lengthen, change tone, translate, or rewrite entirely across any application.
Learns your names, jargon, and terms automatically or manually — carries accurate recognition across all apps and contexts.
Supports over 100 languages with automatic detection, handling various accents, dialects, and speaking speeds.
Zero data retention — audio processed and immediately discarded, never stored or trained on, with local-only history storage.