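Generated from a single JSON data source (bench dimensions): leaderboards for capability, cost-effectiveness, reasoning, coding, stability, and speed.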
| Rank | Model | Provider | API style | Overall | Reasoning | Coding | Stability | Speed | Value score | Input price | Output price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| #1 | claude-3-5-sonnet-20241022 | | claude | 92.0 | 94.0 | 91.0 | 90.0 | 73.0 | 10.4 | $3.000 | $15.000 |
| #2 | gpt-4.1 | | openai-responses | 91.0 | 92.0 | 90.0 | 91.0 | 77.0 | 18.4 | $2.000 | $8.000 |
| #3 | gemini-2.5-pro | | gemini | 90.0 | 93.0 | 88.0 | 86.0 | 76.0 | 13.1 | $3.500 | $10.500 |
| #4 | deepseek-reasoner | | openai-chat | 86.0 | 91.0 | 84.0 | 82.0 | 74.0 | 63.7 | $0.550 | $2.190 |
| #5 | qwen-max | | openai-chat | 85.0 | 86.0 | 84.0 | 85.0 | 82.0 | 21.6 | $1.600 | $6.400 |
| #6 | gpt-4.1-mini | | openai-responses | 84.0 | 82.0 | 83.0 | 88.0 | 92.0 | 84.8 | $0.400 | $1.600 |
| #7 | gemini-2.5-flash | | gemini | 83.0 | 81.0 | 80.0 | 85.0 | 91.0 | 120.9 | $0.350 | $1.050 |
| #8 | claude-3-5-haiku-20241022 | | claude | 80.0 | 78.0 | 77.0 | 87.0 | 90.0 | 33.8 | $0.800 | $4.000 |
| #9 | deepseek-chat | | openai-chat | 79.0 | 76.0 | 78.0 | 83.0 | 89.0 | 117.0 | $0.270 | $1.100 |
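
As a rough illustration of how several leaderboards could be derived from one JSON source, the sketch below sorts the same records by a different field for each board. The JSON schema, the field names (`overall`, `value`, and so on), and the helpers `build_leaderboard` and `build_all` are assumptions made for this example, not the project's actual layout; the sample scores are copied from three rows of the table above.

```python
import json

# Hypothetical schema: these field names are assumptions for illustration only.
BENCH_JSON = """
[
  {"model": "gpt-4.1", "overall": 91.0, "reasoning": 92.0, "coding": 90.0,
   "stability": 91.0, "speed": 77.0, "value": 18.4},
  {"model": "gemini-2.5-flash", "overall": 83.0, "reasoning": 81.0, "coding": 80.0,
   "stability": 85.0, "speed": 91.0, "value": 120.9},
  {"model": "deepseek-chat", "overall": 79.0, "reasoning": 76.0, "coding": 78.0,
   "stability": 83.0, "speed": 89.0, "value": 117.0}
]
"""

# Leaderboard name -> JSON field used as the sort key (assumed mapping).
DIMENSIONS = {
    "capability": "overall",
    "cost_effectiveness": "value",
    "reasoning": "reasoning",
    "coding": "coding",
    "stability": "stability",
    "speed": "speed",
}


def build_leaderboard(models, field):
    """Rank models by one field, highest score first."""
    ranked = sorted(models, key=lambda m: m[field], reverse=True)
    return [(f"#{i}", m["model"], m[field]) for i, m in enumerate(ranked, start=1)]


def build_all(models):
    """Build every leaderboard from the same list of records."""
    return {name: build_leaderboard(models, field) for name, field in DIMENSIONS.items()}


models = json.loads(BENCH_JSON)
boards = build_all(models)

for name, rows in boards.items():
    print(name, rows)
```

Keeping the dimension-to-field mapping in one dictionary means a new leaderboard only needs one extra entry, while the underlying data stays in a single file.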