StepFun
Step 3.7 Flash
Step 3.7 Flash is StepFun's flagship multimodal reasoning model. The docs describe it as a 196B-parameter / 11B-activation sparse MoE model with 256K context, native image and video understanding, high-throughput reasoning, tool calling, complex reasoning and three reasoning-effort levels. The page also exposes an OpenAI-compatible Chat Completions API, pricing at $0.04 input cache hit / $0.20 cache miss / $1.15 output per 1M tokens, and framework support for Claude Code, OpenClaw, Hermes Agent, Cline, Roo Code, Kilo Code and Open Code.
Quick answers
At a glance
- Overview
- StepFun's flagship multimodal reasoning model for real-world agent, coding and multimodal workflows.
- Best fit
- Teams evaluating a flagship Chinese multimodal reasoning model for agentic coding, tool use, deep research and vision/video-aware workflows.
- Trust
- 4/4 sources verified, recently checked · 2026-05-29
- Coverage
- 100/100
Editorial verdict
Best for
Teams evaluating a flagship Chinese multimodal reasoning model for agentic coding, tool use, deep research and vision/video-aware workflows.
Avoid if
Avoid it if you only need a cheap text-only model or a consumer chat product.
Why it matters
Step 3.7 Flash is the current headline model on StepFun's platform homepage, so it deserves a dedicated profile instead of being buried inside the broader platform entry.
Pricing
Input $0.04 cache hit / $0.20 cache miss; output $1.15 per 1M tokens
Payment
Account balance, Free credit first, Paid balance
Commercial use
Commercial use should follow the current product, API, model license and billing terms.
Privacy
Review prompt, file, media upload, retention and training-use terms before sensitive workloads.
Use-case fit
Agentic coding workflows
StrongUse it when code generation, file edits, terminal actions and tool orchestration need one multimodal model.
Vision and video reasoning
StrongThe model handles images and video natively, so it fits UI inspection, chart reading and multimodal analysis.
Deep research and planning
MediumReasoning effort controls make it suitable for longer planning and complex analysis tasks.
Global user checklist
Model names, quotas, release status, regional access and commercial terms can change quickly; recheck official sources before procurement or production use.
Pros
- - Native image and video understanding inside the same model
- - High-throughput reasoning and reliable tool calling
- - 256K context with low/medium/high reasoning effort controls
- - Documented support for mainstream coding and agent frameworks
Cons
- - Premium output pricing can add up on long generation runs
- - Regional signup and quota behavior still need direct account checks
- - Homepage and docs are moving quickly, so model naming and access should be rechecked
Decision paths
stepfun-open-platform
step-plan
qwen-cloud-token-plan
minimax-api
zhipu-glm
Sources
official · en · verified 2026-05-29
Confirms the homepage highlight for Step 3.7 Flash and the developer platform entry point.
docs · en · verified 2026-05-29
Documents model architecture, context length, reasoning effort, pricing, API shape and framework support.
docs · en · verified 2026-05-29
Confirms the multimodal quickstart for images, video, local files and reasoning-effort control.
pricing · en · verified 2026-05-29
Documents model pricing and account limits.