Chrome extension for AI voice typing and speech to text on any site, with smart cleanup, 25+ languages, and multiple output modes.
Whisper AI is a Chrome extension that brings AI-powered speech-to-text directly into your browser, letting you dictate into any text field on any website without switching tabs or copying output from a separate app. It sits on top of OpenAI's Whisper models and adds an LLM-based cleanup layer, so what lands in your input is a polished transcript rather than raw dictation.
The extension adds a floating microphone button to your browser. Focus any text input — a Gmail compose window, a CMS editor, a form field — and start speaking. When you stop, Whisper AI transcribes your audio, runs it through a cleanup pass, and inserts the result directly where your cursor is.
Cleanup handles the things that make raw speech-to-text hard to use: run-on sentences without punctuation, filler words like "um" and "you know", and the occasional mishearing that automatic speech recognition produces. The goal is to preserve what you said without adding or inventing anything.
Beyond straight transcription, the extension offers a set of output modes. Translate converts your spoken input into another language. Email mode shapes your dictation into a structured, professional message. Grammar mode fixes the grammar of text you speak aloud. You can also write a custom prompt once and save it as a mode, so repeat workflows, such as writing in a specific tone or formatting notes a certain way, need no setup after the first time.
The extension is aimed at people who type a lot and want a faster input method, as well as anyone for whom extended typing is uncomfortable or slow.
Content creators and writers use it to get first drafts down quickly without losing the thread of an idea. Remote workers handling high volumes of email find the Email mode useful for dictating structured messages that don't require heavy editing afterward. People in meetings use it to capture notes and action items in real time without pulling attention away from the conversation.
Bilingual professionals and international teams get practical value from the translation mode: they can dictate in one language and output in another without adding a separate tool to the workflow. With support for 25+ languages, transcription and voice notes are not limited to English.
Setup takes a few minutes: install the extension from the Chrome Web Store, sign in with Google, and pin the extension for quick access. Settings sync across devices, so switching machines doesn't require reconfiguration.
Dictation works two ways. You can click the floating mic button that appears near focused inputs, or use the default keyboard shortcut (Ctrl+Shift+L) to toggle recording on and off. Hold-to-talk is also available — hold the shortcut while speaking and release to stop — which some users find more natural for short bursts of input.
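The difference between the two shortcut behaviors can be sketched as a small state machine. This is an illustrative model only, not the extension's code: the `ShortcutRecorder` class and its `hold_to_talk` flag are hypothetical names, and the sketch assumes hold-to-talk is selected as a setting, which the listing does not confirm.

```python
class ShortcutRecorder:
    """Hypothetical model of the shortcut; not the extension's real code."""

    def __init__(self, hold_to_talk: bool = False) -> None:
        # Assumption: the mode is chosen by a setting.
        self.hold_to_talk = hold_to_talk
        self.recording = False

    def key_down(self) -> None:
        if self.hold_to_talk:
            # Hold-to-talk: recording runs while the shortcut is held.
            self.recording = True
        else:
            # Toggle: each press flips recording on or off.
            self.recording = not self.recording

    def key_up(self) -> None:
        if self.hold_to_talk:
            # Releasing the shortcut stops a hold-to-talk recording.
            self.recording = False
        # In toggle mode, releasing the keys changes nothing.


toggle = ShortcutRecorder()
toggle.key_down()
toggle.key_up()
print(toggle.recording)  # still recording until the next press

hold = ShortcutRecorder(hold_to_talk=True)
hold.key_down()
hold.key_up()
print(hold.recording)  # stopped on release
```

In toggle mode a second press is needed to stop, while hold-to-talk stops as soon as the keys are released, which is why it suits short bursts of input.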
Per-site controls let you decide where the floating mic appears. If you only want it active in Gmail and Notion, you can disable it everywhere else. This keeps the extension from interfering with sites where you don't need it.
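A per-site control of this kind amounts to a hostname allowlist check. The sketch below is a minimal illustration under stated assumptions: the function name `mic_enabled`, the allowlist format, and the rule that subdomains inherit a parent domain's setting are all hypothetical, since the listing does not describe how the extension stores or matches sites.

```python
from urllib.parse import urlsplit


def mic_enabled(url: str, allowed_hosts: set[str]) -> bool:
    """Return True if the floating mic should appear on this page.

    Hypothetical rule: a page matches when its hostname equals an
    allowed host or is a subdomain of one.
    """
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)


# Example allowlist: only Gmail and Notion, as in the scenario above.
allowed = {"mail.google.com", "notion.so"}

print(mic_enabled("https://mail.google.com/mail/u/0/", allowed))
print(mic_enabled("https://www.notion.so/my-page", allowed))
print(mic_enabled("https://example.com/contact", allowed))
```

Matching on the parsed hostname rather than on a substring of the URL avoids enabling the mic on lookalike domains that merely contain an allowed name.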
Audio is sent to Whisper AI's servers only for transcription and optional cleanup, then returned as text. The company states it does not sell audio data or use it to train models. The extension connects to your account using short-lived authorization codes and revocable API keys, which limits exposure if credentials are ever compromised. Output modes are enforced server-side, meaning your custom prompt instructions aren't exposed in the browser client.
Whisper AI uses a freemium, credit-based model. New accounts receive 100 free credits on sign-up — approximately 17 minutes of transcription — with no payment card required at the start. This is enough to evaluate the core workflow before committing to a paid plan.
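The free-tier figures give a rough way to estimate credit consumption. The sketch below assumes credits are charged linearly with audio length at the rate implied by "100 credits for approximately 17 minutes"; the listing does not state the actual billing rule, so the function names and the resulting numbers are estimates only.

```python
import math

FREE_CREDITS = 100   # from the listing
FREE_MINUTES = 17    # approximate figure from the listing


def estimate_credits(minutes_of_audio: float) -> int:
    """Estimate credits for a recording, assuming linear pricing."""
    if minutes_of_audio < 0:
        raise ValueError("minutes_of_audio must be non-negative")
    # Multiply before dividing to limit floating-point rounding error.
    return math.ceil(minutes_of_audio * FREE_CREDITS / FREE_MINUTES)


def estimate_minutes(credits: float) -> float:
    """Estimate minutes of transcription a credit balance covers."""
    if credits < 0:
        raise ValueError("credits must be non-negative")
    return credits * FREE_MINUTES / FREE_CREDITS


print(estimate_credits(60))   # a one-hour meeting
print(estimate_minutes(100))  # the free allowance
```

Under this assumption a one-hour meeting would cost roughly three and a half times the free allowance, which illustrates why daily long-form use calls for a paid tier.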
For regular users, the credit model means usage has a direct cost. Heavy users, such as people transcribing long meetings or drafting large volumes of content every day, will use up credits quickly and will need a paid tier to sustain the workflow. Pricing details are available on the product's pricing page.
Dictate into any focused input field across your browser without copy-paste loops.
Automatically fixes punctuation, removes disfluencies, and corrects ASR errors without changing meaning.
Switch languages naturally for bilingual work and international team collaboration.
Toggle recording with Ctrl+Shift+L or hold the shortcut while speaking and release to stop.
Switch between Transcribe, Translate, Email draft, and Grammar fix modes with one click.
Enable or disable the floating mic on a per-site basis so it only appears where you want it.