Why Diction

Most of what you do on your phone now involves an AI on the other end. You ask it a question. You hand it a draft to fix. You feed it a meeting note to clean up. The model is fast. The bottleneck is you, fumbling on a glass keyboard built for thumbs.

Diction exists to fix that one bottleneck.

The bridge

The phone is where you meet the AI. That isn't going to change soon. The model lives in a data center or on a chip in your pocket, but the meeting point is the rectangle in your hand.

A meeting point needs a bridge. The bridge has to be fast, accurate, and available in every app you already use. Voice is the obvious channel — natural, expressive, faster than typing for almost everyone. But voice isn't always usable. You're in a meeting. You're on the bus. The room is loud. The room is too quiet. So the bridge has to handle the moments when you can't speak too.

Diction is that bridge. Voice first, because voice is the strongest channel. Keys when you need them, because reality has edges.

It runs as the keyboard. Not a separate app you have to switch into. The keyboard is the one piece of iOS that sits between you and every text field you ever touch, in any app, on any screen. Replace the keyboard and you have replaced the input layer of the phone.

Voice as the primary channel

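Talking is faster than typing. Most people read at around 250 words a minute, speak at about 150, and type at maybe 40. For getting words out of your head and onto the screen, the arithmetic is one-sided: speech is nearly four times faster than typing.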
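To make the arithmetic concrete, here is a small sketch using the speaking and typing rates quoted above. The 100-word message length is a hypothetical figure picked for illustration, not a measurement from Diction.

```python
# Rates quoted in the text, in words per minute.
SPEAKING_WPM = 150
TYPING_WPM = 40

# Hypothetical message length, chosen only for illustration.
MESSAGE_WORDS = 100


def seconds_to_produce(words: int, wpm: int) -> float:
    """Seconds needed to produce `words` words at `wpm` words per minute."""
    return words * 60 / wpm


spoken = seconds_to_produce(MESSAGE_WORDS, SPEAKING_WPM)
typed = seconds_to_produce(MESSAGE_WORDS, TYPING_WPM)
speedup = SPEAKING_WPM / TYPING_WPM

print(f"spoken: {spoken:.0f} s, typed: {typed:.0f} s, speedup: {speedup:.2f}x")
# prints "spoken: 40 s, typed: 150 s, speedup: 3.75x"
```

At these rates a message that takes two and a half minutes to thumb out takes forty seconds to say.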

It is also more honest. When you type, you edit as you go. You hedge. You rewrite the first sentence three times before you commit. When you talk, you say what you actually meant, and a good model picks the right words around it. You end up with cleaner text and a faster path to the thing you wanted to send.

Diction leans into that. Tap, talk, and the text appears. The transcription is fast enough that you don't lose the thought. The cleanup is smart enough that "so um basically the meeting went well and uh they agreed to the timeline" becomes "The meeting went well. They agreed to the timeline." You stay in the conversation; the keyboard does the spelling.
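For a feel of what cleanup involves, here is a deliberately naive, rule-based sketch. It is not how Diction does it: the filler list and the `strip_fillers` helper are hypothetical, and a few regular expressions stand in for what a model would handle.

```python
import re

# Hypothetical filler list, chosen for illustration only.
FILLERS = ("um", "uh", "basically")


def strip_fillers(raw: str) -> str:
    """Drop standalone fillers and a leading "so", then capitalize and punctuate."""
    pattern = r"\b(?:" + "|".join(FILLERS) + r")\b"
    text = re.sub(pattern, "", raw, flags=re.IGNORECASE)
    # Collapse the gaps the removed words leave behind.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove a leading "so" used as a discourse marker.
    text = re.sub(r"^so\s+", "", text, flags=re.IGNORECASE)
    if not text:
        return ""
    text = text[0].upper() + text[1:]
    return text if text.endswith((".", "!", "?")) else text + "."


print(strip_fillers(
    "so um basically the meeting went well and uh they agreed to the timeline"
))
# prints "The meeting went well and they agreed to the timeline."
```

The rules get rid of the filler but leave one run-on sentence. Splitting it into "The meeting went well. They agreed to the timeline." takes an understanding of where one thought ends and the next begins, which is the part a word list cannot supply.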

The keyboard channel still matters

Voice fails sometimes. Picture dictating in a quiet meeting, across a table from a colleague, or on a packed train with three people pressed against you. You need a way to type a single line without holding the phone up to your mouth. A bridge that only works in ideal conditions is not a bridge.

That is why Diction stays a keyboard. The replacement for QWERTY is not a separate dictation app you launch with a hotword. It is the same input surface you already use, with a button that does the talking. When voice works, you tap and talk. When it doesn't, you fall back to the typed channel without leaving the field. Same app, same flow, no switching.

Focused on one job

Most dictation tools spread thin. Mac, Windows, Android, iOS, a browser extension, a desktop overlay, a CLI. Each platform gets a fraction of the engineering attention, and it shows. The iOS app is usually the worst of the bunch because iOS is the hardest surface to build for.

Diction picks one job. iOS keyboard. That's it. Everything in the codebase serves that single surface. Memory ceilings, custom audio filters, mic warmup, app-switching survival, text-field quirks across hundreds of apps. The depth shows up where it counts — in the moment you tap the mic and expect text.

You can read more about why we don't spread in Focus Over Spread. The short version: focused tools beat broad ones, every time.

What you actually get

A keyboard that does one thing well, in three places.

  • On your device. Voice recognition runs locally on the iPhone. No network, no account, no telemetry. Free.
  • On your own server. Run the open-source server on hardware you control. Diction connects to it. Your data never leaves your network. Free.
  • On Diction One, the hosted version. Lower latency, the best models, zero setup. Transcriptions encrypted with AES-256-GCM before they leave the server. Subscription.

Pick the one that fits the moment. Switch between them in two taps. The bridge is the same either way.


The rest of these docs explain how the bridge is built and how to set it up.

If you want the engineering thesis, read How Diction Is Built. If you want to know who's behind it, read About the Author. If you want to start using it, the App Store is one tap away.

Download for iOS