AI API Speed Benchmarks: 10 Models Tested for Latency
Published 2026-05-21
Results
| Model | TTFT | tok/s | $/M |
|---|---|---|---|
| Step-3.5-Flash | 120ms | 80 | $0.15 |
| DeepSeek V4 Flash | 180ms | 60 | $0.25 |
| Qwen3-8B | 150ms | 70 | $0.01 |
All tests via Global API, streaming enabled.
See AI Tool Reviewer for quality comparisons. Code & Cost for pricing.