Seven brains. One assistant. May 2026 update.

Inside Memo AI.

Memo AI runs seven model tiers, each tuned for a specific job. Pick the right one for your task, or let the Auto router classify your question and pick for you. The cascades were reordered in May 2026 so that frontier-class models (DeepSeek V3.2, Hermes 3 405B, Ring 2.6 1T) lead and fast fallbacks follow.

Smart
2,000/day · 15-step cascade · best-first
The everyday powerhouse. Reordered in May 2026 so frontier-class models lead: DeepSeek V3.2 (685B), DeepSeek V3.1, Hermes 3 (405B), and Ring 2.6 (1T) come before the fast-tier fallbacks. All large-model entries; no low-tier noise.
Engine
DeepSeek V3.2 (685B) → V3.1 → Hermes 3 (405B) → Ring 2.6 (1T) → Llama 4 Maverick → GPT-OSS 120B → Nemotron 120B → Qwen 235B → 7 more
Knowledge
April 2025
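
How a best-first cascade like the one above might behave can be sketched in a few lines of Python. This is an illustrative sketch, not Memo AI's actual implementation: the function `run_cascade`, the exception `AllModelsFailed`, and the stub callables `unavailable` and `echo` are all hypothetical names, and the assumption is that a step is skipped whenever its call raises an error.

```python
from typing import Callable, Sequence


class AllModelsFailed(Exception):
    """Raised when every step in the cascade has failed (hypothetical name)."""


def run_cascade(
    prompt: str,
    steps: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) step in order; return the first success.

    Assumption: a step counts as failed when its callable raises.
    """
    errors = []
    for name, call in steps:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure moves on to the next step
            errors.append(f"{name}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Stub model clients for illustration only; real clients would call an API.
def unavailable(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def echo(prompt: str) -> str:
    return f"answer to: {prompt}"


# First three steps of the Smart cascade, wired to the stubs above.
SMART_CASCADE = [
    ("DeepSeek V3.2", unavailable),
    ("DeepSeek V3.1", unavailable),
    ("Hermes 3 (405B)", echo),
]

if __name__ == "__main__":
    model, reply = run_cascade("hi", SMART_CASCADE)
    print(model, "->", reply)
```

Ordering the list best-first means the strongest model answers whenever it is available, and the later entries are reached only on failure.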
Reasoner
1,000/day · 10-step cascade · chain-of-thought
For problems that need real thought. Complex code, multi-step maths, legal reasoning, strategy. DeepSeek V3.2 leads, and reasoning-tuned models (Nemotron Omni Reasoning, Arcee Trinity Thinking) come before generalists. The thinking process is shown in a collapsible panel.
Engine
DeepSeek V3.2 → V3.1 → Ring 2.6 (1T) → Hermes 3 → Nemotron Omni Reasoning → Trinity Thinking → Qwen 235B → 3 more
Knowledge
December 2024
Live
2,000/day · Gemini grounded · 11 keys
Real-time web search via Google grounding. Current news, weather, FX rates, container tracking, sports scores. 11 Gemini keys rotate across 3 models — 2.5 Flash, 2.5 Flash Lite, 3 Flash Preview — with Groq + Tavily emergency fallback. Sources cited inline.
Engine
Gemini 2.5 Flash + Google Search → Flash Lite → Gemini 3 Flash Preview → Groq GPT-OSS 120B + Tavily web search
Knowledge
Real-time
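
The key rotation and emergency fallback described for Live could look roughly like the sketch below. It assumes simple round-robin rotation (the page does not say how keys are chosen), and `KeyRotator`, `grounded_answer`, and the `call` and `emergency` callables are hypothetical names introduced for illustration; the model names are display names from the card, not API identifiers.

```python
from itertools import cycle
from typing import Callable, Sequence


class KeyRotator:
    """Hands out API keys in round-robin order (assumed rotation policy)."""

    def __init__(self, keys: Sequence[str]) -> None:
        if not keys:
            raise ValueError("at least one key is required")
        self._keys = cycle(keys)

    def next_key(self) -> str:
        return next(self._keys)


# Display names taken from the Live card.
GEMINI_MODELS = [
    "Gemini 2.5 Flash",
    "Gemini 2.5 Flash Lite",
    "Gemini 3 Flash Preview",
]


def grounded_answer(
    question: str,
    rotator: KeyRotator,
    models: Sequence[str],
    call: Callable[[str, str, str], str],
    emergency: Callable[[str], str],
) -> str:
    """Try each model with the next key; use the emergency path if all fail.

    `call(model, key, question)` stands in for a grounded search request;
    `emergency(question)` stands in for the Groq + Tavily fallback.
    """
    for model in models:
        key = rotator.next_key()
        try:
            return call(model, key, question)
        except Exception:
            continue  # quota or outage: move to the next model and key
    return emergency(question)
```

Spreading requests over several keys keeps any single key's quota from becoming the bottleneck, which is presumably why 11 keys are used.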
Fast
5,000/day · 41ms first token
Lightning fast. Cerebras Llama 3.1 8B on WSE-3 hardware runs at 2,000 tokens per second. Groq GPT-OSS 20B serves its first token in 41ms. Quick lookups, one-liner rewrites. With a 5,000/day quota, practically unlimited for everyday use.
Engine
Cerebras Llama 3.1 8B (2,000 tok/s) → Groq GPT-OSS 20B (41ms) → Llama 3.1 8B Instant → 3 more
Knowledge
April 2025
Coder
2,000/day · 11-step · code-tuned
Built for programming. Clean TypeScript, Python, SQL, HTML/CSS. Spots bugs, writes tests, refactors legacy code. DeepSeek V3.2 + Qwen3 Coder lead. Long code blocks open in a Canvas side panel. Python blocks have a Run button that executes them in your browser.
Engine
DeepSeek V3.2 → V3.1 → Qwen3 Coder → Ring 2.6 → GLM-4.5 Air → GPT-OSS 120B → Qwen 235B → 4 more
Knowledge
April 2025
Vision
1,000/day · auto-activates · multimodal
Reads photos, screenshots, receipts, diagrams, charts. Switches on automatically when you attach an image. Also powers receipt OCR in the expenses module. Paste an image and say 'change to navy', and FLUX.2 klein applies the edit by following your instruction.
Engine
Llama 4 Maverick → Gemini 2.5 Flash → Flash Lite → Llama 4 Scout (Groq) → Nemotron Omni (text+image+audio+video) → 4 more
Knowledge
August 2024
Auto
Free · classifier-routed · 6 destinations
When you don't pick a model, Memo AI's Auto router classifies your question (regex first, with Cerebras Llama 3.1 8B as a fallback for ambiguous cases) and routes it to the right tier: Smart for general questions, Reasoner for hard maths, Live for current events, Fast for quick lookups, Coder for code, Vision for images.
Engine
intent classifier → Smart / Reasoner / Live / Fast / Coder / Vision cascade
Knowledge
varies
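
The "regex first, model fallback" routing described above can be sketched as follows. The patterns, the `classify` function, and the `llm_fallback` parameter are hypothetical and chosen only for illustration; the actual rules are not published on this page. The sketch assumes a question is "ambiguous" when zero or several patterns match, and that an attached image always routes to Vision.

```python
import re
from typing import Callable, Optional

# Illustrative patterns only; the real router's rules are not documented here.
RULES = [
    ("Coder", re.compile(r"\b(python|typescript|sql|bug|refactor|function)\b", re.I)),
    ("Live", re.compile(r"\b(today|latest|news|weather|score|exchange rate)\b", re.I)),
    ("Reasoner", re.compile(r"\b(prove|derive|step by step|integral)\b", re.I)),
]


def classify(
    question: str,
    has_image: bool = False,
    llm_fallback: Optional[Callable[[str], str]] = None,
) -> str:
    """Return a tier name for the question.

    Exactly one regex match decides the tier directly. Otherwise the
    question is treated as ambiguous and handed to `llm_fallback`
    (standing in for the small classifier model); with no fallback
    configured, the general-purpose Smart tier is used.
    """
    if has_image:
        return "Vision"
    matches = [tier for tier, pattern in RULES if pattern.search(question)]
    if len(matches) == 1:
        return matches[0]
    if llm_fallback is not None:
        return llm_fallback(question)
    return "Smart"


if __name__ == "__main__":
    print(classify("fix this python bug"))
    print(classify("what's the weather today"))
    print(classify("hello there"))
```

Running cheap regex checks first means the classifier model is called only for the questions the patterns cannot settle, which keeps routing fast for the common cases.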