Models

Browse available models routable through Multi-Router. Compare capabilities, pricing, and specs.

Current Models

49 of 49

Anthropicactive

Claude Haiku 4.5

anthropic/claude-haiku-4-5

The fastest model with near-frontier intelligence. Best for low-latency, high-volume workloads.

$1.00 input/$5.00 outputper 1M tokens

textvisiontool useextended thinking

Anthropicactive

Claude Opus 4.6

anthropic/claude-opus-4-6

The most intelligent model for building agents and coding. Exceptional performance in reasoning, complex analysis, and multi-step tasks.

Latencymoderate

$5.00 input/$25.00 outputper 1M tokens

textvisiontool useextended thinkingadaptive thinking

Anthropicactive

Claude Sonnet 4.6

anthropic/claude-sonnet-4-6

The best combination of speed and intelligence. Ideal for high-throughput tasks requiring strong reasoning.

$3.00 input/$15.00 outputper 1M tokens

textvisiontool useextended thinkingadaptive thinking

Gemini 2.5 Flash

google/gemini-2.5-flash

Best price-performance model for large scale processing, low-latency, high-volume tasks with thinking.

Max Output65.5K

$0.30 input/$2.50 outputper 1M tokens

textvisiontool usereasoning

Gemini 2.5 Flash-Lite

google/gemini-2.5-flash-lite

Fastest and most budget-friendly multimodal model. Ideal for classification, extraction, and low-latency workloads.

Max Output65.5K

$0.10 input/$0.40 outputper 1M tokens

textvisiontool usereasoning

Gemini 2.5 Pro

google/gemini-2.5-pro

State-of-the-art thinking model for complex problems in code, math, STEM, and long-context analysis.

Max Output65.5K

Latencymoderate

$1.25 input/$10.00 outputper 1M tokens

textvisiontool usereasoning

Gemini 3 Flash

google/gemini-3-flash-preview

Frontier-class reasoning and multimodal understanding at a fraction of the cost of larger models.

Max Output65.5K

$0.50 input/$3.00 outputper 1M tokens

textvisiontool usereasoning

Gemini 3.1 Flash-Lite

google/gemini-3.1-flash-lite-preview

Most cost-efficient multimodal model. Optimized for high-frequency lightweight tasks like translation and data extraction.

Max Output65.5K

$0.25 input/$1.50 outputper 1M tokens

textvisiontool usereasoning

Gemini 3.1 Pro

google/gemini-3.1-pro-preview

Advanced intelligence with better thinking, improved token efficiency, and factual consistency. Optimized for software engineering and agentic workflows.

Max Output65.5K

Latencymoderate

$2.00 input/$12.00 outputper 1M tokens

textvisiontool usereasoning

MiniMax M2.5

minimax/minimax-m2-5

General-purpose model with tool use and 197K context. Free tier.

$0.00 input/$0.00 outputper 1M tokens

Nemotron 3 Nano 30B A3B

nvidia/nemotron-3-nano-30b-a3b

Compact MoE model for efficient text generation with 256K context. Free tier.

$0.00 input/$0.00 outputper 1M tokens

Nemotron 3 Super

nvidia/nemotron-3-super

MoE model (120B total, 12B active) optimized for text tasks with 262K context. Free tier.

$0.00 input/$0.00 outputper 1M tokens

Nemotron Nano 12B 2 VL

nvidia/nemotron-nano-12b-2-vl

Compact vision-language model with 128K context. Free tier.

$0.00 input/$0.00 outputper 1M tokens

Nemotron Nano 9B V2

nvidia/nemotron-nano-9b-v2

Compact reasoning model with 128K context. Free tier.

$0.00 input/$0.00 outputper 1M tokens

GPT-4.1

Excels at instruction following and tool calling with broad knowledge across domains. 1M token context.

Max Output32.8K

$2.00 input/$8.00 outputper 1M tokens

textvisiontool use

GPT-5

Strong coding, reasoning, and agentic capabilities across domains with built-in reasoning tokens.

Latencymoderate

$1.25 input/$10.00 outputper 1M tokens

textvisiontool usereasoning

GPT-5 Mini

openai/gpt-5-mini

Near-frontier intelligence for cost-sensitive, low-latency, high-volume workloads.

$0.25 input/$2.00 outputper 1M tokens

textvisiontool use

GPT-5 Nano

openai/gpt-5-nano

Fastest and cheapest GPT-5 variant. Optimized for summarization and classification tasks.

$0.05 input/$0.40 outputper 1M tokens

textvisiontool use

GPT-5.4

Most capable frontier model. Exceptional at complex professional work with configurable reasoning depth.

Latencymoderate

$2.50 input/$15.00 outputper 1M tokens

textvisiontool usereasoning

GPT-5.4 Pro

openai/gpt-5.4-pro

Premium tier using extended compute for harder problems. Best for research-grade reasoning tasks.

Latencymoderate

$30.00 input/$180.00 outputper 1M tokens

textvisiontool usereasoning

o3

Sets a new standard for math, science, coding, and visual reasoning tasks with highest reasoning capability.

Latencymoderate

$2.00 input/$8.00 outputper 1M tokens

textvisiontool usereasoning

o3-pro

Extended compute version of o3 for harder problems. Best for research-grade reasoning tasks.

Latencymoderate

$20.00 input/$80.00 outputper 1M tokens

textvisiontool usereasoning

o4-mini

Fast, cost-efficient reasoning model optimized for coding and visual tasks.

$1.10 input/$4.40 outputper 1M tokens

textvisiontool usereasoning

Qwen3 Coder 480B A35B

qwen/qwen3-coder-480b-a35b

MoE coding model (480B total, 35B active) with tool use and 262K context. Free tier.

Latencymoderate

$0.00 input/$0.00 outputper 1M tokens

Qwen3 Next 80B A3B Instruct

qwen/qwen3-next-80b-a3b-instruct

MoE instruction-tuned model with tool use and 262K context. Free tier.

$0.00 input/$0.00 outputper 1M tokens

Step 3.5 Flash

stepfun/step-3-5-flash

MoE model (196B total, 11B active) with strong reasoning and 256K context. Free tier.

$0.00 input/$0.00 outputper 1M tokens

GLM 4.5 Air

z-ai/glm-4-5-air

Reasoning model with 131K context from Zhipu AI. Free tier.

$0.00 input/$0.00 outputper 1M tokens

Legacy Models

Still available but consider migrating to current models for improved performance.

Claude 3 Haiku

anthropic/claude-3-haiku

200K ctx4.1K out$0.25 / $1.25

Claude Opus 4

anthropic/claude-opus-4

200K ctx32K out$15.00 / $75.00

Claude Opus 4.1

anthropic/claude-opus-4-1

200K ctx32K out$15.00 / $75.00

Claude Opus 4.5

anthropic/claude-opus-4-5

200K ctx64K out$5.00 / $25.00

Claude Sonnet 4

anthropic/claude-sonnet-4

200K ctx64K out$3.00 / $15.00

Claude Sonnet 4.5

anthropic/claude-sonnet-4-5

200K ctx64K out$3.00 / $15.00

Gemini 2.0 Flash

google/gemini-2.0-flash

1M ctx8.2K out$0.10 / $0.40

GPT-3.5 Turbo

openai/gpt-3.5-turbo

16.4K ctx4.1K out$0.50 / $1.50

GPT-4

8.2K ctx8.2K out$30.00 / $60.00

GPT-4 Turbo

openai/gpt-4-turbo

128K ctx4.1K out$10.00 / $30.00

GPT-4.1 Mini

openai/gpt-4.1-mini

1M ctx32.8K out$0.40 / $1.60

GPT-4.1 Nano

openai/gpt-4.1-nano

1M ctx32.8K out$0.10 / $0.40

GPT-4o

128K ctx16.4K out$2.50 / $10.00

GPT-4o Mini

openai/gpt-4o-mini

128K ctx16.4K out$0.15 / $0.60

GPT-5 Pro

openai/gpt-5-pro

400K ctx272K out$15.00 / $120.00

GPT-5.1

400K ctx128K out$1.25 / $10.00

GPT-5.2

400K ctx128K out$1.75 / $14.00

GPT-5.2 Pro

openai/gpt-5.2-pro

400K ctx128K out$21.00 / $168.00

o1

200K ctx100K out$15.00 / $60.00

o1-mini

128K ctx65.5K out$1.10 / $4.40

o1-pro

200K ctx100K out$150.00 / $600.00

o3-mini

200K ctx100K out$1.10 / $4.40