I Dictate 4,000 Words a Day Now: The Voice to Text Workflow That Actually Works

I write about 4,000 words per day across articles, emails, briefs, and notes. Until last fall I typed all of it. A fast typist sustains around 60 to 80 words per minute. Sustained dictation with the right setup runs 130 to 160 words per minute, with the AI doing the cleanup behind the scenes. By my estimate I am saving about two hours per day. Six months in, that estimate looks conservative. Voice to text is finally good enough that typing first drafts is the slow path.
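A back-of-the-envelope sketch of the raw speed difference, using only the rates quoted above. The function names are illustrative. The sketch compares raw words-per-minute throughput only and does not model pauses, corrections, or thinking time, so treat its output as a floor on the saving rather than the whole picture.

```python
def minutes_for(words: int, wpm: float) -> float:
    """Minutes needed to produce `words` at a sustained rate of `wpm`."""
    return words / wpm


def raw_saving_range(words, typing_wpm, dictation_wpm):
    """Return (smallest, largest) minutes saved on raw throughput alone.

    `typing_wpm` and `dictation_wpm` are (slow, fast) pairs.
    """
    # Smallest saving: fastest typist compared with slowest dictation.
    smallest = minutes_for(words, max(typing_wpm)) - minutes_for(words, min(dictation_wpm))
    # Largest saving: slowest typist compared with fastest dictation.
    largest = minutes_for(words, min(typing_wpm)) - minutes_for(words, max(dictation_wpm))
    return smallest, largest


if __name__ == "__main__":
    low, high = raw_saving_range(4000, (60, 80), (130, 160))
    print(f"Raw throughput saving: {low:.0f} to {high:.0f} minutes per day")
```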

The piece that changed in the last 18 months was accuracy. OpenAI's Whisper model hit 96 to 98 percent word accuracy on conversational audio in late 2024. Current models such as Whisper Large v3 and Google's Universal Speech Model are pushing 98 to 99 percent. Those last few percentage points are what made the workflow viable. At 95 percent accuracy, you spend 30 percent of your time fixing errors, which kills any speed advantage. At 98 percent, the cleanup is fast and the speed advantage holds.
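To see why a few points of accuracy matter, here is a small sketch that turns an accuracy percentage into a daily error count. It assumes accuracy is simply the share of words transcribed correctly, which is a simplification of how word error rate is usually measured; the function name is illustrative.

```python
def expected_errors(words: int, accuracy_pct: float) -> int:
    """Approximate number of wrong words in `words` at a given word accuracy."""
    return round(words * (100 - accuracy_pct) / 100)


if __name__ == "__main__":
    for pct in (95, 98, 99):
        print(f"{pct}% accuracy: about {expected_errors(4000, pct)} errors in 4,000 words")
```

At 4,000 words per day, moving from 95 to 98 percent cuts the expected corrections from about 200 to about 80.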

The current best stack runs about $40 per month total. Step one is the capture layer. I use Wispr Flow at $15 per month for a system-wide microphone shortcut on macOS. Hold a hotkey, talk, release the hotkey, and the transcription appears wherever your cursor is. Windows users have similar options through Talon, Dragon, and Otter. The key feature is that the shortcut works system-wide, not in one specific app. You should be able to dictate into email, your editor, Slack, or a web form using the same shortcut.

Step two is the AI cleanup layer. Raw dictation has filler words, repeated phrases, and awkward sentence structure. The fix is to pipe the raw text through a language model that knows your style. Most of the modern dictation apps do this automatically now. Wispr Flow uses GPT-4o by default. Superwhisper at $9 per month uses Claude or local models. The output reads like edited prose instead of stream of consciousness.
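To make the cleanup step concrete, here is a deliberately crude stand-in written with regular expressions. It is not how Wispr Flow or Superwhisper work, since those use a language model; the filler list and function name are assumptions chosen for illustration. It only shows the kind of transformation the cleanup layer performs: dropping fillers and collapsing repeated words.

```python
import re

# Illustrative filler list; a real cleanup model handles far more than this.
FILLERS = ("um", "uh", "you know", "i mean")

_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    re.IGNORECASE,
)
# Matches a word immediately repeated one or more times, e.g. "the the".
_REPEAT_RE = re.compile(r"\b(\w+)(\s+\1\b)+", re.IGNORECASE)


def clean_transcript(raw: str) -> str:
    """Remove filler phrases and stuttered repeats from raw dictation."""
    text = _FILLER_RE.sub("", raw)
    text = _REPEAT_RE.sub(r"\1", text)
    # Collapse any leftover runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(clean_transcript("um so I I think the the plan is uh good"))
```

A rule-based pass like this also collapses legitimate repeats such as "had had" and cannot restructure sentences, which is why the apps hand the job to a language model.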

Step three is the microphone. This is where most people skimp and lose about 5 percentage points of accuracy. A built-in laptop mic gets you to 92 to 94 percent in a quiet room. A Shokz OpenComm 2 at $180 gets you to 97 percent in any environment. A Shure MV7 at $280 sitting on your desk gets you to 98 percent and works as a podcast mic too. The Shokz wins for mobile work because it is a bone conduction headset whose boom mic rejects ambient noise. The Shure wins for stationary work because the audio quality is studio grade.

The workflow that works for long-form writing is different from the workflow for emails. For emails, I dictate the full message in one pass and let the AI clean it up. About 9 out of 10 emails go out without further edits. For articles, I dictate the rough draft as if I am explaining it to a friend. The AI cleans up the structure but I still edit it. Dictation handles the first 70 percent of the work. Editing handles the last 30 percent.

The mental shift took about two weeks. Typing teaches you to think in keystrokes. Dictation forces you to think in sentences. Early in the transition, my dictated drafts were short and choppy. After 200 hours of practice, I am thinking in full paragraphs before I press the button. The skill compounds. The same way typing speed grew over years of writing, dictation speed grows with reps.

A few practical notes from running this for six months. Have a glass of water nearby. Your throat gets dry faster than your fingers get tired. Use punctuation commands deliberately. Saying "period," "comma," and "new paragraph" out loud keeps the AI from guessing. Train the model on your jargon by feeding it 10 to 20 examples of your typical content. Wispr Flow and Superwhisper both support custom dictionaries. Industry terms, proper nouns, and brand names go in there.
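The idea behind a custom dictionary can be sketched in a few lines. This is an illustration of the concept, not the format either app uses; the misheard spellings on the left are hypothetical examples of what a transcriber might produce for the names on the right.

```python
import re

# Hypothetical misrecognitions mapped to the spelling you actually want.
CUSTOM_TERMS = {
    "whisper flow": "Wispr Flow",
    "super whisper": "Superwhisper",
    "lumina media": "Lumina Media",
}


def apply_dictionary(text: str, terms: dict = CUSTOM_TERMS) -> str:
    """Replace known misrecognitions with their preferred spelling."""
    # Longest keys first so multi-word entries win over shorter overlaps.
    keys = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in terms.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


if __name__ == "__main__":
    print(apply_dictionary("I use whisper flow and Super Whisper daily"))
```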

Privacy is worth thinking about. Cloud-based services send your audio, and the text that comes out of it, to third-party providers such as OpenAI or Anthropic. If you are dictating client information, medical notes, or anything legally sensitive, run a local model instead. whisper.cpp on an M2 or M3 Mac transcribes 30 minutes of audio in about 2 minutes locally with no cloud round trip. The accuracy is slightly lower but the data never leaves your machine. For Lumina Media client work, this is the setup I would use.
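A minimal sketch of local transcription, with one substitution stated plainly: it uses the open-source openai-whisper Python package rather than whisper.cpp, because its Python interface is easy to show in a few lines. It assumes the package and ffmpeg are installed; the function names and the "base" model default are illustrative. The small helper also works out the speed implied by the figures above.

```python
from pathlib import Path


def realtime_factor(audio_minutes: float, processing_minutes: float) -> float:
    """Minutes of audio transcribed per minute of processing time."""
    return audio_minutes / processing_minutes


def transcribe_locally(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file on this machine; the audio is not uploaded."""
    # Imported lazily so realtime_factor works without the package installed.
    import whisper  # pip install openai-whisper (also requires ffmpeg)

    if not Path(audio_path).is_file():
        raise FileNotFoundError(audio_path)
    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path)
    return result["text"].strip()


if __name__ == "__main__":
    # 30 minutes of audio in about 2 minutes of processing.
    print(realtime_factor(30, 2))
```

Thirty minutes of audio in two minutes works out to roughly fifteen times faster than real time, which is fast enough that local transcription does not slow the workflow down.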

Try this for two weeks. Buy the microphone. Subscribe to one of the apps. Replace your typing for 80 percent of your writing. Track the time. The first week will feel slower than typing because you are adjusting. By the second week, you will not want to go back. The technology finally caught up.
