Cutting-Edge Chinese 'Reasoning' Model Rivals OpenAI o1
Tuesday, January 21, 2025, 22:40, by Slashdot
The releases immediately caught the attention of the AI community because most existing open-weights models -- which can often be run and fine-tuned on local hardware -- have lagged behind proprietary models like OpenAI's o1 on so-called reasoning benchmarks. Having these capabilities available in an MIT-licensed model that anyone can study, modify, or use commercially potentially marks a shift in what's possible with publicly available AI models. 'They are SO much fun to run, watching them think is hilarious,' independent AI researcher Simon Willison told Ars in a text message. Willison tested one of the smaller models and described his experience in a post on his blog: 'Each response starts with a... pseudo-XML tag containing the chain of thought used to help generate the response.' He noted that even for simple prompts, the model produces extensive internal reasoning before giving its output. Although the benchmarks have yet to be independently verified, DeepSeek reports that R1 outperformed OpenAI's o1 on AIME (a mathematical reasoning test), MATH-500 (a collection of word problems), and SWE-bench Verified (a programming assessment tool). TechCrunch notes that three Chinese labs -- DeepSeek, Alibaba, and Moonshot AI's Kimi -- have released models that match o1's capabilities. Read more of this story at Slashdot.
https://slashdot.org/story/25/01/21/2138247/cutting-edge-chinese-reasoning-model-rivals-openai-o1?ut...