Apple Foundation Models: what indie Mac developers can do with on-device AI
A short tour of the framework, what it is good at, what it is not, and the consequences for small apps that want AI features without a cloud bill.
For about three years, every indie developer who wanted to ship “AI features” had to make the same uncomfortable decision. Either pay an OpenAI or Anthropic bill that scales linearly with users, or skip AI features entirely. The math for a one-time-purchase app was brutal. A few cents per user per month destroys the margin on a $9.99 lifetime app within months.
Apple’s Foundation Models framework, announced at WWDC 2025 and shipping in macOS 26 (Tahoe), changes the math. The framework gives developers programmatic access to the same on-device language model that powers Apple Intelligence features on the user’s Mac. The model runs locally. Your app does not pay for inference. The user does not need an API key. The text never leaves the device.
This post is a practical tour of what indie developers can actually do with the framework, what it is not good at, and what the architectural shift means for a focused Mac app.
What you get
The framework exposes a LanguageModelSession you can prompt with text. The model responds with text, and optionally with a structured object you specify with the @Generable macro. The structured-output mode is the more useful one for most apps because it lets you say “give me back a value conforming to this Swift type” instead of trying to parse free-form text.
A typical flow looks like this:
- Create a session with system instructions describing the model’s task.
- Send a user message.
- Read the structured response.
- Use the response to drive your app.
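The flow above can be sketched in a few lines of Swift. The session and respond(to:) calls are the framework's API; the instructions string and function name are illustrative:

```swift
import FoundationModels

// Create a session with system instructions describing the model's task,
// send the user's text, and read back the plain-text response.
func cleanTitle(from note: String) async throws -> String {
    let session = LanguageModelSession(
        instructions: "You turn informal notes into short, clean task titles."
    )
    // respond(to:) is async and can throw, e.g. if the model is unavailable.
    let response = try await session.respond(to: note)
    return response.content
}
```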
The whole thing runs in milliseconds for short prompts and a few seconds for long ones. There is no network call. There is no API key. There is no rate limit beyond what the device itself can sustain.
What the model is actually good at
The on-device model is small compared to a state-of-the-art cloud model. It is not GPT-4. Treating it like one will lead to disappointment. Where it shines is a specific class of tasks:
- Classification. “Is this string a date phrase or not?” “Which of these five categories does this task belong to?” These are tasks the on-device model handles reliably and quickly.
- Structured extraction. Pulling specific fields from a free-text input. “What time of day does this sentence reference?” “What is the verb in this sentence?” The structured-output mode of the framework is built for this.
- Short text rewriting. Converting an informal note into a clean title, summarizing a paragraph in one sentence, fixing the grammar in a draft. The model is good at small, contained text transformations.
- Tone shifts. Making a draft warmer, more concise, or more professional. Same constraint: short inputs, contained outputs.
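Classification and extraction both map naturally onto the structured-output mode. As a hedged sketch, a date-phrase classifier might look like this; the DatePhrase type and its fields are hypothetical, while @Generable, @Guide, and respond(to:generating:) come from the framework:

```swift
import FoundationModels

// @Generable tells the framework to produce a value of this exact type;
// @Guide gives the model per-field guidance. The schema itself is made up
// for illustration.
@Generable
struct DatePhrase {
    @Guide(description: "true if the text refers to a specific time or date")
    var isDateReference: Bool

    @Guide(description: "the time of day mentioned, e.g. 'morning', or empty if none")
    var timeOfDay: String
}

func classify(_ text: String) async throws -> DatePhrase {
    let session = LanguageModelSession(
        instructions: "Classify short task phrases for a todo app."
    )
    let response = try await session.respond(to: text, generating: DatePhrase.self)
    return response.content  // a typed DatePhrase, no JSON parsing needed
}
```

Because the response arrives as a typed Swift value, there is no string parsing step where the model's output can silently break your app.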
This is most of what a productivity app actually needs. Notice what is not on the list: long-form generation, complex reasoning, world knowledge questions, code generation. The on-device model can do those, but not as well as a cloud model. If those are core to your product, you still need a cloud model.
What the model is not good at
Three classes of task where you should reach for something else:
- Long context. The on-device model has a smaller context window than a cloud model. Feeding it a 50-page document and asking for analysis will not go well. Feed it the relevant excerpt instead.
- Open-ended creative writing. It can do short creative writing, but you will notice the difference compared to a frontier cloud model. If your app is a writing assistant for novelists, this is probably not your model.
- Tasks where the user expects state-of-the-art quality. If your users will compare your output to ChatGPT and judge accordingly, you will lose. The model is excellent for invisible utility, less so for tasks where the AI is the visible product.
The right framing is: use the on-device model to make the app smarter in the background, not to be the visible product.
What this means for indie pricing
The most interesting consequence of the on-device model is the pricing implication. For most of the last three years, the standard advice for indie developers shipping AI features was “you must charge a subscription, because inference costs are real and recurring.” That advice was correct.
It is no longer correct for apps where on-device intelligence is sufficient. The whole reason a one-time-purchase app could not ship AI was the recurring cost. If the recurring cost is zero, that constraint disappears. You can ship AI features in a one-time-purchase app and not go broke.
This is a big deal for the small wave of indie Mac apps trying to revive the one-time-purchase model. We wrote about the broader trend, and the on-device AI piece is one of the actual technical reasons it works in 2026.
How to start
The framework is part of the standard Apple SDK on macOS 26. There is no separate download. There is no API key. There is no account to create. Add import FoundationModels to a Swift file, create a LanguageModelSession, send a prompt, read the response.
The model is available on Apple Silicon Macs that meet the system requirements for Apple Intelligence. Older Intel Macs do not get the framework, so your app needs a fallback strategy if you want to support them. For most indie Mac apps shipping today, an availability check and a graceful degradation path are enough. The user without the model gets the regex-based version of the feature; the user with the model gets the smarter version.
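A minimal sketch of that availability check, assuming a hypothetical DateParser protocol with regex-backed and model-backed implementations (the SystemLanguageModel availability API is the framework's; everything else here is illustrative):

```swift
import Foundation
import FoundationModels

// Hypothetical app-side abstraction: both parsers satisfy the same protocol,
// so the rest of the app never needs to know which one it got.
protocol DateParser { func parse(_ text: String) -> Date? }
struct RegexDateParser: DateParser { func parse(_ text: String) -> Date? { nil /* regex path */ } }
struct ModelBackedDateParser: DateParser { func parse(_ text: String) -> Date? { nil /* model path */ } }

// Pick the parser once, at launch, based on model availability.
func makeDateParser() -> DateParser {
    switch SystemLanguageModel.default.availability {
    case .available:
        return ModelBackedDateParser()
    case .unavailable(let reason):
        // e.g. device not eligible, Apple Intelligence turned off,
        // or the model assets not yet downloaded
        print("On-device model unavailable: \(reason)")
        return RegexDateParser()
    }
}
```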
What this looks like in TodoBar
TodoBar uses the framework as a fallback for natural-language date parsing. The fast path is regular expressions, which catch around 90% of date phrases in under a millisecond. When the regex path fails, the on-device model takes a shot, with a typical latency of about 50 milliseconds, and returns a structured classification of what the user meant. We described the full pipeline in the date parsing post.
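The shape of that two-tier pipeline looks roughly like this. The regex, the ParsedDate schema, and the prompt are illustrative stand-ins, not TodoBar's actual code:

```swift
import Foundation
import FoundationModels

// Hypothetical schema for the model's structured answer.
@Generable
struct ParsedDate {
    @Guide(description: "number of hours from now, or 0 if not a relative time")
    var hoursFromNow: Int
}

func parseDueDate(_ phrase: String) async -> Date? {
    // Fast path: an explicit pattern like "in 2 hours" resolves instantly.
    if let match = phrase.firstMatch(of: /in (\d+) hours?/),
       let hours = Int(match.1) {
        return Calendar.current.date(byAdding: .hour, value: hours, to: .now)
    }
    // Slow path: let the on-device model classify phrases the regex missed,
    // like "in a couple hours".
    let session = LanguageModelSession(
        instructions: "Extract relative times from short task phrases."
    )
    guard let result = try? await session.respond(to: phrase, generating: ParsedDate.self),
          result.content.hoursFromNow > 0 else { return nil }
    return Calendar.current.date(byAdding: .hour, value: result.content.hoursFromNow, to: .now)
}
```

The guard means a model failure degrades to "no date detected" rather than a crash or a wrong date, which is the behavior you want from an invisible fallback.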
The model is invisible to the user. They do not know it is there. They just notice that “in a couple hours” works the same way “in 2 hours” does. That is what good on-device AI feels like.
It is also why a $9.99 one-time purchase app can ship a feature that would have required a subscription a year ago. The math finally works.
TodoBar is a friendly menu bar todo list for macOS. Plain-English due dates, global hotkey, iCloud sync. Pay once, yours forever.
Get TodoBar on the App Store