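Compare and track the latest AI model performance rankings
AI Leaderboard
A snapshot of model evaluations and pricing, fetched from Artificial Analysis at build time.
#1 Claude Opus 4.8 (Adaptive Reasoning, Max Effort) · Intelligence 61.4
Source: artificialanalysis.ai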
| Rank | Model | Creator | Intelligence | Coding | Math | Blended price ($/1M tokens) | Tokens/s | TTFT (s) |
|---|---|---|---|---|---|---|---|---|
| 1 (Best, New) | Claude Opus 4.8 (Adaptive Reasoning, Max Effort) `claude-opus-4-8` | Anthropic | 61.4 | 56.7 | — | $10.938 | 59.8 | 12.48 |
| 2 | GPT-5.5 (xhigh) `gpt-5-5` | OpenAI | 60.2 | 59.1 | — | $11.25 | 69.5 | 37.98 |
| 3 | GPT-5.5 (high) `gpt-5-5-high` | OpenAI | 58.9 | 58.5 | — | $11.25 | 69.9 | 17.38 |
| 4 | Claude Opus 4.7 (Adaptive Reasoning, Max Effort) `claude-opus-4-7` | Anthropic | 57.3 | 52.5 | — | $10.938 | 59.1 | 21.03 |
| 5 | Gemini 3.1 Pro Preview `gemini-3-1-pro-preview` | Google | 57.2 | 55.5 | — | $4.5 | 135.9 | 22.92 |
| 6 | GPT-5.4 (xhigh) `gpt-5-4` | OpenAI | 56.8 | 57.2 | — | $5.625 | 81.8 | 163.99 |
| 7 | GPT-5.5 (medium) `gpt-5-5-medium` | OpenAI | 56.7 | 56.2 | — | $11.25 | 67.5 | 4.84 |
| 8 | Qwen3.7 Max `qwen3-7-max` | Alibaba | 56.6 | 50.1 | — | $3.75 | 204.6 | 1.68 |
| 9 | Gemini 3.5 Flash (high) `gemini-3-5-flash` | Google | 55.3 | 45 | — | $3.375 | 226.3 | 13.2 |
| 10 | Gemini 3.5 Flash (medium) `gemini-3-5-flash-medium` | Google | 54.8 | 43.9 | — | $3.375 | 212.6 | 11.43 |
| 11 | Kimi K2.6 `kimi-k2-6` | Kimi | 53.9 | 47.1 | — | $1.712 | 39.9 | 1.31 |
| 12 | MiMo-V2.5-Pro `mimo-v2-5-pro` | Xiaomi | 53.8 | 45.5 | — | $0.544 | 53.2 | 1.93 |
| 13 | GPT-5.3 Codex (xhigh) `gpt-5-3-codex` | OpenAI | 53.6 | 53.1 | — | $4.813 | 83.9 | 58.22 |
| 14 | Grok 4.3 (high) `grok-4-3` | xAI | 53.2 | 41 | — | $1.563 | 130.1 | 23.5 |
| 15 | Claude Opus 4.6 (Adaptive Reasoning, Max Effort) `claude-opus-4-6-adaptive` | Anthropic | 52.9 | 48.1 | — | $10.938 | 49.8 | 11.03 |
| 16 | Muse Spark `muse-spark` | Meta | 52.2 | 47.5 | — | $0 | 0 | 0 |
| 17 | Claude Opus 4.7 (Non-reasoning, High Effort) `claude-opus-4-7-non-reasoning` | Anthropic | 51.8 | 53.1 | — | $10.938 | 49.5 | 1.1 |
| 18 | Qwen3.6 Max Preview `qwen3-6-max` | Alibaba | 51.8 | 44.9 | — | $2.925 | 40.4 | 1.92 |
| 19 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) `claude-sonnet-4-6-adaptive` | Anthropic | 51.7 | 50.9 | — | $6.563 | 63.9 | 70.8 |
| 20 | DeepSeek V4 Pro (Reasoning, Max Effort) `deepseek-v4-pro` | DeepSeek | 51.5 | 47.5 | — | $0.544 | 53.7 | 1.19 |
| 21 | GLM-5.1 (Reasoning) `glm-5-1` | Z AI | 51.4 | 43.4 | — | $2.15 | 55.8 | 1.09 |
| 22 | GPT-5.2 (xhigh) `gpt-5-2` | OpenAI | 51.3 | 48.7 | 99 | $4.813 | 80.3 | 97.55 |
| 23 | GPT-5.5 (low) `gpt-5-5-low` | OpenAI | 50.8 | 52.1 | — | $11.25 | 64.1 | 1.82 |
| 24 | Qwen3.6 Plus `qwen3-6-plus` | Alibaba | 50 | 42.9 | — | $1.125 | 52.8 | 1.87 |
| 25 | DeepSeek V4 Pro (Reasoning, High Effort) `deepseek-v4-pro-high` | DeepSeek | 49.8 | 43.2 | — | $0.544 | 55.2 | 1.17 |
Sorted by the Artificial Analysis Intelligence Index.
Showing 25 models.
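
As a rough illustration of how the ranking could be reproduced or re-sorted, here is a minimal Python sketch using a handful of rows copied from the table. The `Row` type, the function names, and the "intelligence points per dollar" metric are hypothetical and not part of the leaderboard. Treating a $0 price as missing data is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Row:
    """One leaderboard entry (hypothetical structure, values copied from the table)."""
    model: str
    intelligence: float
    blended_usd_per_1m: float
    tokens_per_s: float


ROWS = [
    Row("GPT-5.5 (xhigh)", 60.2, 11.25, 69.5),
    Row("MiMo-V2.5-Pro", 53.8, 0.544, 53.2),
    Row("Claude Opus 4.8 (Adaptive Reasoning, Max Effort)", 61.4, 10.938, 59.8),
    Row("Muse Spark", 52.2, 0.0, 0.0),
    Row("Qwen3.7 Max", 56.6, 3.75, 204.6),
    Row("Gemini 3.1 Pro Preview", 57.2, 4.5, 135.9),
]


def rank_by_intelligence(rows: List[Row]) -> List[Row]:
    """Sort descending by the Intelligence column, as the leaderboard does."""
    return sorted(rows, key=lambda r: r.intelligence, reverse=True)


def points_per_dollar(row: Row) -> Optional[float]:
    """Illustrative metric: Intelligence divided by blended $/1M tokens.

    Returns None when the price is zero, which this sketch assumes
    means the price is unavailable rather than free.
    """
    if row.blended_usd_per_1m <= 0:
        return None
    return row.intelligence / row.blended_usd_per_1m


def rank_by_value(rows: List[Row]) -> List[Row]:
    """Sort descending by points per dollar, skipping rows with no usable price."""
    scored = [(r, points_per_dollar(r)) for r in rows]
    priced = [(r, s) for r, s in scored if s is not None]
    return [r for r, _ in sorted(priced, key=lambda p: p[1], reverse=True)]


if __name__ == "__main__":
    for rank, row in enumerate(rank_by_intelligence(ROWS), start=1):
        print(rank, row.model, row.intelligence)
```

Sorting by a different column only requires changing the `key` function. Both sorts here are stable, so rows with equal scores keep their input order.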