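Compare and track the latest AI model performance rankings
AI Leaderboard
A snapshot of model evaluations and pricing, fetched from Artificial Analysis at build time.
#1 GPT-5.4 (xhigh) · Intelligence 57.2
Source: artificialanalysis.ai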
| Rank | Model | Creator | Intelligence | Coding | Math | Blended $/1M tokens | Tok/s | TTFT (s) |
|---|---|---|---|---|---|---|---|---|
| 1 (Best) | GPT-5.4 (xhigh) gpt-5-4 | OpenAI | 57.2 | 57.3 | — | $5.625 | 72.7 | 178.11 |
| 2 (Best) | Gemini 3.1 Pro Preview gemini-3-1-pro-preview | Google | 57.2 | 55.5 | — | $4.5 | 122.3 | 24.46 |
| 3 | GPT-5.3 Codex (xhigh) gpt-5-3-codex | OpenAI | 54 | 53.1 | — | $4.813 | 82 | 59.3 |
| 4 | Claude Opus 4.6 (Adaptive Reasoning, Max Effort) claude-opus-4-6-adaptive | Anthropic | 53 | 48.1 | — | $10 | 51.2 | 10.52 |
| 5 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) claude-sonnet-4-6-adaptive | Anthropic | 51.7 | 50.9 | — | $6 | 51.4 | 32.71 |
| 6 | GPT-5.2 (xhigh) gpt-5-2 | OpenAI | 51.3 | 48.7 | 99 | $4.813 | 75.4 | 58.52 |
| 7 | GLM-5 (Reasoning) glm-5 | Z AI | 49.8 | 44.2 | — | $1.55 | 69.6 | 0.81 |
| 8 | Claude Opus 4.5 (Reasoning) claude-opus-4-5-thinking | Anthropic | 49.7 | 47.8 | 91.3 | $10 | 53.5 | 10.81 |
| 9 (New) | MiniMax-M2.7 minimax-m2-7 | MiniMax | 49.6 | 41.9 | — | $0.525 | 41.3 | 1.75 |
| 10 | MiMo-V2-Pro mimo-v2-pro | Xiaomi | 49.2 | 41.4 | — | $1.5 | 0 | 0 |
| 11 | GPT-5.2 Codex (xhigh) gpt-5-2-codex | OpenAI | 49 | 43 | — | $4.813 | 114.1 | 3.42 |
| 12 | Grok 4.20 Beta 0309 (Reasoning) grok-4-20 | xAI | 48.5 | 42.2 | — | $3 | 246.2 | 10.8 |
| 13 | Gemini 3 Pro Preview (high) gemini-3-pro | Google | 48.4 | 46.5 | 95.7 | $4.5 | 134.3 | 25.2 |
| 14 | GPT-5.4 mini (xhigh) gpt-5-4-mini | OpenAI | 48.1 | 51.5 | — | $1.688 | 184.5 | 3.9 |
| 15 | GPT-5.1 (high) gpt-5-1 | OpenAI | 47.7 | 44.7 | 94 | $3.438 | 120.5 | 14.92 |
| 16 | Kimi K2.5 (Reasoning) kimi-k2-5 | Kimi | 46.8 | 39.5 | — | $1.2 | 32.6 | 1.3 |
| 17 | GLM-5-Turbo glm-5-turbo | Z AI | 46.8 | 36.8 | — | $0 | 0 | 0 |
| 18 | GPT-5.2 (medium) gpt-5-2-medium | OpenAI | 46.6 | 44.2 | 96.7 | $4.813 | 0 | 0 |
| 19 | Claude Opus 4.6 (Non-reasoning, High Effort) claude-opus-4-6 | Anthropic | 46.5 | 47.6 | — | $10 | 46.6 | 1.77 |
| 20 | Gemini 3 Flash Preview (Reasoning) gemini-3-flash-reasoning | Google | 46.4 | 42.6 | 97 | $1.125 | 190.6 | 6.08 |
| 21 | Qwen3.5 397B A17B (Reasoning) qwen3-5-397b-a17b | Alibaba | 45 | 41.3 | — | $1.35 | 62.4 | 1.4 |
| 22 | GPT-5 (high) gpt-5 | OpenAI | 44.6 | 36 | 94.3 | $3.438 | 102.7 | 76.33 |
| 23 | GPT-5 Codex (high) gpt-5-codex | OpenAI | 44.6 | 38.9 | 98.7 | $3.438 | 179.3 | 7.41 |
| 24 | GPT-5.4 nano (xhigh) gpt-5-4-nano | OpenAI | 44.4 | 43.9 | — | $0.463 | 206.4 | 2.38 |
| 25 | Claude Sonnet 4.6 (Non-reasoning, High Effort) claude-sonnet-4-6 | Anthropic | 44.4 | 46.4 | — | $6 | 45.8 | 0.83 |
Sorted by the Artificial Analysis Intelligence Index.
Showing 25 models.
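The ordering can be sketched in a few lines of Python. This is only an illustration, not the page's actual build code: the `ROWS` list is a hand-copied subset of the table, and `rank_by_intelligence` is a hypothetical helper name. The page does not say how equal scores are ordered (GPT-5.4 and Gemini 3.1 Pro Preview both show 57.2), so this sketch simply assumes ties keep their input order, which Python's stable sort guarantees.

```python
# Illustrative subset of the table: (model, intelligence index).
# The input order here is arbitrary except that GPT-5.4 precedes Gemini 3.1 Pro Preview.
ROWS = [
    ("GPT-5.3 Codex (xhigh)", 54.0),
    ("GPT-5.4 (xhigh)", 57.2),
    ("Claude Opus 4.6 (Adaptive Reasoning, Max Effort)", 53.0),
    ("Gemini 3.1 Pro Preview", 57.2),
]


def rank_by_intelligence(rows):
    """Return (rank, model, score) tuples, highest score first.

    sorted() is stable even with reverse=True, so rows with equal scores
    keep their relative input order (an assumption about tie handling,
    not something the source states).
    """
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(i + 1, name, score) for i, (name, score) in enumerate(ordered)]


ranking = rank_by_intelligence(ROWS)
for rank, name, score in ranking:
    print(rank, name, score)
```

Because the displayed scores are rounded to one decimal place, a sort on these values alone may not reproduce the page's order for tied rows.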