Xiaomi MiMo
MiMo-V2-Flash
The English MiMo blog introduces MiMo-V2-Flash as a globally available open-weight foundation language model. Xiaomi describes it as a Mixture-of-Experts model with 309B total parameters and 15B active parameters, hybrid attention, a hybrid thinking mode, a 256K context window, inference at 150 tokens per second and low API pricing. The model is available through Hugging Face, the MiMo API Platform and MiMo AI Studio.
Quick answers
At a glance
- Overview: Xiaomi's open-weight fast MoE model for reasoning, coding, agentic workflows and 256K-context tasks.
- Best fit: Teams evaluating a low-cost open-weight Chinese model for coding agents, long-context tasks and API/self-hosted comparison.
- Trust: 4/4 sources verified, recently checked · 2026-05-17
- Coverage: 100/100
Editorial verdict
Best for
Teams evaluating a low-cost open-weight Chinese model for coding agents, long-context tasks and API/self-hosted comparison.
Avoid if
You plan to rely on release benchmarks as the only acceptance criterion for production workloads.
Why it matters
MiMo-V2-Flash gives the MiMo family a concrete, English-documented open-weight reference point, with published pricing, context length, test results and deployment details.
Pricing
$0.1 input / $0.3 output per 1M tokens, as listed in the English release blog
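To turn the listed rates into a budget figure, the arithmetic can be sketched as below. This is a minimal illustration, assuming cost scales linearly with token counts at the published $0.1 / $0.3 per-1M-token rates; the helper name estimate_cost_usd and the example token counts are hypothetical, and actual billing (caching, minimums, regional pricing) may differ from this simple model.

```python
from decimal import Decimal

# Rates as listed in the English release blog (USD per 1M tokens).
INPUT_USD_PER_MTOK = Decimal("0.1")
OUTPUT_USD_PER_MTOK = Decimal("0.3")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: linear cost estimate at the listed rates.

    Assumption: no discounts, caching, or minimum charges are applied.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    )
    return total / TOKENS_PER_MTOK


if __name__ == "__main__":
    # Illustrative request: 200,000 prompt tokens, 20,000 completion tokens.
    # 200,000 * 0.1 / 1M + 20,000 * 0.3 / 1M = 0.02 + 0.006 USD
    print(estimate_cost_usd(200_000, 20_000))
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding drift when summed across many calls.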
Payment
API Platform billing, AI Studio, Hugging Face self-hosting
Commercial use
Commercial use should follow the current product, API, model license and billing terms.
Privacy
Review prompt, file, media upload, retention and training-use terms before sensitive workloads.
Use-case fit
Coding agents
Strong: The release calls out Claude Code, Cursor and Cline-style coding workflows.
Low-cost API comparison
Strong: Use the published $0.1/$0.3 per-1M-token reference to compare against DeepSeek, Qwen and Kimi pricing.
Self-hosted open-weight evaluation
Medium: Hugging Face weights and SGLang inference support make it relevant for local deployment experiments.
Global user checklist
Model names, quotas, release status, regional access and commercial terms can change quickly; recheck official sources before procurement or production use.
Pros
- 309B total parameters with 15B active parameters
- 256K context, hybrid thinking mode and coding-agent positioning
- MIT-licensed weights and Day 0 SGLang inference contribution
Cons
- Release claims should be rechecked against independent tests before procurement
- Hosted API availability can differ from open-source self-hosting
Decision paths
deepseek-v4-api
qwen
kimi-k2-api
Sources
official · en · verified 2026-05-17
Official release with model architecture, pricing, tests, access paths and open-source notes.
other · en · verified 2026-05-17
Model weight entry linked from the release.
docs · en · verified 2026-05-17
Official hosted API access path linked from the release.
other · en · verified 2026-05-17
Official chat and studio access path linked from the release.