Seven brains. One assistant. May 2026 update.

Inside Memo AI.

Memo AI runs seven model tiers, each tuned for a specific job. Pick the right one for your task, or let the Auto router classify your question and pick for you. Cascades were reordered in May 2026 so that frontier-class models (DeepSeek V3.2, Hermes 3 405B, Ring 2.6 1T) lead and fast fallbacks follow.

Smart

2,000/day · 15-step cascade · best-first

The everyday powerhouse. Reordered in May 2026 so frontier-class models lead: DeepSeek V3.2 (685B), DeepSeek V3.1, Hermes 3 (405B), and Ring 2.6 (1T) come before the faster fallbacks. Every entry is a large model; no low-tier noise.

Engine

DeepSeek V3.2 (685B) → V3.1 → Hermes 3 (405B) → Ring 2.6 (1T) → Llama 4 Maverick → GPT-OSS 120B → Nemotron 120B → Qwen 235B → 7 more

Knowledge

April 2025
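
How a best-first cascade behaves can be sketched in a few lines of Python. This is a minimal illustration, not Memo AI's implementation: the `Provider` type, `run_cascade`, and the two stand-in providers are hypothetical names, and no real model API is called. Each step is tried in order, and the first one that answers wins.

```python
from typing import Callable, Sequence

# Hypothetical type: a provider takes a prompt and returns text, or raises on failure.
Provider = Callable[[str], str]


class CascadeExhausted(RuntimeError):
    """Raised when every step in the cascade has failed."""


def run_cascade(prompt: str, steps: Sequence[tuple[str, Provider]]) -> tuple[str, str]:
    """Try each (name, provider) in order; return (name, reply) from the first success."""
    errors: list[str] = []
    for name, provider in steps:
        try:
            return name, provider(prompt)
        except Exception as exc:  # illustrative; real code would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise CascadeExhausted("; ".join(errors))


# Stand-in providers for demonstration only.
def unavailable(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def echo(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    steps = [
        ("DeepSeek V3.2", unavailable),
        ("DeepSeek V3.1", unavailable),
        ("Hermes 3", echo),
    ]
    print(run_cascade("hello", steps))
```

In this sketch the order of `steps` is the whole policy: putting the strongest model first means a weaker model only answers when everything ahead of it has failed.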

Reasoner

1,000/day · 10-step cascade · chain-of-thought

For problems that need real thought: complex code, multi-step maths, legal reasoning, strategy. DeepSeek V3.2 leads, with reasoning-tuned models (Nemotron Omni Reasoning, Arcee Trinity Thinking) ahead of the generalist fallbacks. The thinking process is shown in a collapsible panel.

Engine

DeepSeek V3.2 → V3.1 → Ring 2.6 (1T) → Hermes 3 → Nemotron Omni Reasoning → Trinity Thinking → Qwen 235B → 3 more

Knowledge

December 2024

Live

2,000/day · Gemini grounded · 11 keys

Real-time web search via Google grounding. Current news, weather, FX rates, container tracking, sports scores. 11 Gemini keys rotate across 3 models — 2.5 Flash, 2.5 Flash Lite, 3 Flash Preview — with Groq + Tavily emergency fallback. Sources cited inline.

Engine

Gemini 2.5 Flash + Google Search → Flash Lite → Gemini 3 Flash Preview → Groq GPT-OSS 120B + Tavily web search

Knowledge

Real-time
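
Key rotation with an emergency fallback might look roughly like the sketch below. It assumes the keys are cycled within each model before moving to the next model, which the page does not confirm. The key values are placeholders, and `attempts`, `grounded_answer`, `call`, and `fallback` are hypothetical names rather than any real SDK's API.

```python
from typing import Callable, Iterator

# Model names from the Live tier; key values are placeholders, not real credentials.
MODELS = ["Gemini 2.5 Flash", "Gemini 2.5 Flash Lite", "Gemini 3 Flash Preview"]
KEYS = [f"key-{i}" for i in range(1, 12)]  # 11 placeholder keys


def attempts(models: list[str], keys: list[str], start: int = 0) -> Iterator[tuple[str, str]]:
    """Yield (model, key) pairs: every key for the first model, then the next model.

    `start` offsets the key order so consecutive requests can begin on different keys.
    """
    n = len(keys)
    for model in models:
        for offset in range(n):
            yield model, keys[(start + offset) % n]


def grounded_answer(
    prompt: str,
    call: Callable[[str, str, str], str],  # hypothetical: (model, key, prompt) -> answer
    fallback: Callable[[str], str],        # hypothetical emergency path
    start: int = 0,
) -> str:
    """Return the first successful grounded answer, else use the fallback."""
    for model, key in attempts(MODELS, KEYS, start):
        try:
            return call(model, key, prompt)
        except Exception:  # illustrative; e.g. a rate-limit or quota error
            continue
    return fallback(prompt)
```

With 11 keys and 3 models this gives up to 33 attempts before the fallback runs. Incrementing `start` per request is one simple way to spread load across keys.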

Fast

5,000/day · 41ms first token

Lightning fast. Cerebras Llama 3.1 8B on WSE-3 hardware runs at 2,000 tokens per second. Groq GPT-OSS 20B serves its first token in 41ms. Built for quick lookups and one-liner rewrites. At 5,000/day, practically unlimited.

Engine

Cerebras Llama 3.1 8B (2,000 tok/s) → Groq GPT-OSS 20B (41ms) → Llama 3.1 8B Instant → 3 more

Knowledge

April 2025

Coder

2,000/day · 11-step cascade · code-tuned

Built for programming. Clean TypeScript, Python, SQL, HTML/CSS. Spots bugs, writes tests, refactors legacy code. DeepSeek V3.2 + Qwen3 Coder lead. Long code blocks open in a Canvas side panel. Python blocks have a Run button that executes them in your browser.

Engine

DeepSeek V3.2 → V3.1 → Qwen3 Coder → Ring 2.6 → GLM-4.5 Air → GPT-OSS 120B → Qwen 235B → 4 more

Knowledge

April 2025

Vision

1,000/day · auto-activates · multimodal

Reads photos, screenshots, receipts, diagrams, charts. Switches on automatically when you attach an image. Also powers receipt OCR in the expenses module. Paste an image and say 'change to navy', and FLUX.2 klein applies the edit by following your instruction.

Engine

Llama 4 Maverick → Gemini 2.5 Flash → Flash Lite → Llama-4-Scout (Groq) → Nemotron Omni (text+image+audio+video) → 4 more

Knowledge

August 2024

Auto

Free · classifier-routed · 6 destinations

When you don't pick a model, Memo AI's auto-router classifies your question (regex first, with Cerebras Llama 3.1 8B as the fallback for ambiguous cases) and routes it to the right tier: Smart for general questions, Reasoner for hard maths, Live for current events, Fast for quick lookups, Coder for code, Vision for images.

Engine

intent classifier → Smart / Reasoner / Live / Fast / Coder / Vision cascade

Knowledge

varies
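
The regex-first, model-second idea can be sketched as follows. The patterns, the `classify` function, and the `llm_fallback` callable are all illustrative assumptions; the actual rules and prompts are not given on this page. Here a question counts as ambiguous when zero or several patterns match.

```python
import re
from typing import Callable, Optional

TIERS = {"Smart", "Reasoner", "Live", "Fast", "Coder", "Vision"}

# Illustrative patterns only; the real rules are not published on this page.
RULES = [
    ("Coder", re.compile(r"\b(python|typescript|sql|css|html|function|bug|refactor)\b", re.I)),
    ("Live", re.compile(r"\b(today|latest|news|weather|score|exchange rate|tracking)\b", re.I)),
    ("Reasoner", re.compile(r"\b(prove|derive|step by step|integral|equation|strategy)\b", re.I)),
]


def classify(
    question: str,
    has_image: bool = False,
    llm_fallback: Optional[Callable[[str], str]] = None,  # hypothetical small-model call
) -> str:
    """Return a tier name: cheap regex rules first, model fallback only when unsure."""
    if has_image:
        return "Vision"  # an attached image always activates Vision
    matches = [tier for tier, pattern in RULES if pattern.search(question)]
    if len(matches) == 1:
        return matches[0]  # exactly one rule fired: unambiguous
    if llm_fallback is not None:
        guess = llm_fallback(question)
        if guess in TIERS:
            return guess  # accept only known tier names
    return "Smart"  # safe default for general questions
```

For example, `classify("What's the weather in Lagos?")` returns `"Live"` without calling any model, while a question that matches both the Coder and Live patterns is handed to `llm_fallback`. Validating the fallback's answer against `TIERS` keeps a malformed model reply from routing to a tier that does not exist.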
