Model Leaderboard

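Generated from a single JSON data source (bench dimensions): leaderboards for capability, value for money, reasoning, coding, stability, and speed.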

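| Rank | Model | Provider | API style | Overall | Reasoning | Coding | Stability | Speed | Value score | Input price | Output price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| #1 | claude-3-5-sonnet-20241022 | anthropic | claude | 92.0 | 94.0 | 91.0 | 90.0 | 73.0 | 10.4 | $3.000 | $15.000 |
| #2 | gpt-4.1 | openai | openai-responses | 91.0 | 92.0 | 90.0 | 91.0 | 77.0 | 18.4 | $2.000 | $8.000 |
| #3 | gemini-2.5-pro | google | gemini | 90.0 | 93.0 | 88.0 | 86.0 | 76.0 | 13.1 | $3.500 | $10.500 |
| #4 | deepseek-reasoner | deepseek | openai-chat | 86.0 | 91.0 | 84.0 | 82.0 | 74.0 | 63.7 | $0.550 | $2.190 |
| #5 | qwen-max | qwen | openai-chat | 85.0 | 86.0 | 84.0 | 85.0 | 82.0 | 21.6 | $1.600 | $6.400 |
| #6 | gpt-4.1-mini | openai | openai-responses | 84.0 | 82.0 | 83.0 | 88.0 | 92.0 | 84.8 | $0.400 | $1.600 |
| #7 | gemini-2.5-flash | google | gemini | 83.0 | 81.0 | 80.0 | 85.0 | 91.0 | 120.9 | $0.350 | $1.050 |
| #8 | claude-3-5-haiku-20241022 | anthropic | claude | 80.0 | 78.0 | 77.0 | 87.0 | 90.0 | 33.8 | $0.800 | $4.000 |
| #9 | deepseek-chat | deepseek | openai-chat | 79.0 | 76.0 | 78.0 | 83.0 | 89.0 | 117.0 | $0.270 | $1.100 |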
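
The page says all six rankings come from one JSON source. A minimal sketch of how that could work is shown here: load the records once, then sort the same list by each dimension. The JSON layout, the field names (`model`, `overall`, `value`, and so on), and the function `build_leaderboards` are assumptions made for illustration. The page does not show its actual schema or how the value score is computed, so the value score is treated as a stored field. The four sample records copy their numbers from the leaderboard.

```python
import json

# Hypothetical JSON layout: the field names are assumptions, not the page's real schema.
# The numbers are copied from four rows of the leaderboard.
BENCH_JSON = """
[
  {"model": "claude-3-5-sonnet-20241022", "overall": 92.0, "reasoning": 94.0,
   "coding": 91.0, "stability": 90.0, "speed": 73.0, "value": 10.4},
  {"model": "gpt-4.1", "overall": 91.0, "reasoning": 92.0,
   "coding": 90.0, "stability": 91.0, "speed": 77.0, "value": 18.4},
  {"model": "gpt-4.1-mini", "overall": 84.0, "reasoning": 82.0,
   "coding": 83.0, "stability": 88.0, "speed": 92.0, "value": 84.8},
  {"model": "gemini-2.5-flash", "overall": 83.0, "reasoning": 81.0,
   "coding": 80.0, "stability": 85.0, "speed": 91.0, "value": 120.9}
]
"""

# One leaderboard per dimension named in the page description.
DIMENSIONS = ("overall", "value", "reasoning", "coding", "stability", "speed")


def build_leaderboards(raw_json):
    """Return {dimension: [model names, best first]} from one JSON document."""
    models = json.loads(raw_json)
    boards = {}
    for dim in DIMENSIONS:
        # Higher is better for every score, so sort in descending order.
        ranked = sorted(models, key=lambda m: m[dim], reverse=True)
        boards[dim] = [m["model"] for m in ranked]
    return boards


boards = build_leaderboards(BENCH_JSON)
for dim, names in boards.items():
    print(f"{dim}: {', '.join(names)}")
```

Keeping a single list of records and deriving each ranking with a sort means a new dimension only needs a new field and one more entry in `DIMENSIONS`.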