
Alibaba says its new AI model rivals DeepSeek’s R1, OpenAI’s o1

Friday, March 7, 2025, 02:43, by ComputerWorld
Alibaba Cloud on Thursday launched QwQ-32B, a compact reasoning model built on its latest large language model (LLM), Qwen2.5-32B. The company says the model, with only 32 billion parameters, delivers performance comparable to that of other large cutting-edge models, including Chinese rival DeepSeek’s R1 and OpenAI’s o1.

According to a release from Alibaba, “the performance of QwQ-32B highlights the power of reinforcement learning (RL), the core technique behind the model, when applied to a robust foundation model like Qwen2.5-32B, which is pre-trained on extensive world knowledge. By leveraging continuous RL scaling, QwQ-32B demonstrates significant improvements in mathematical reasoning and coding proficiency.”

AWS defines RL as “a machine learning technique that trains software to make decisions to achieve the most optimal results and mimics the trial-and-error learning process that humans use to achieve their goals. Software actions that work towards your goal are reinforced, while actions that detract from the goal are ignored.” 
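
To make the trial-and-error idea in that definition concrete, here is a minimal sketch of an epsilon-greedy multi-armed bandit, one of the simplest RL settings. It is a toy illustration only and says nothing about how QwQ-32B was actually trained; the function name `run_bandit`, the reward probabilities, and the parameters `steps`, `epsilon`, and `seed` are all assumptions made for this example.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy bandit (hypothetical helper, not from the article).

    reward_probs[i] is the assumed chance that action i pays a reward of 1.
    Returns the learned value estimate and the pull count for each action.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms    # how often each action was tried
    values = [0.0] * n_arms  # running average reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action (the "trial" part).
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the action with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: values[a])

        # The environment pays 1 with the action's probability, else 0.
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0

        # Incremental mean: rewarded actions see their estimate rise,
        # unrewarded ones see it fall, so they get chosen less often.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]

    return values, counts


values, counts = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(values)), key=lambda a: values[a])
print(best_arm, [round(v, 2) for v in values], counts)
```

In this sketch an unrewarded action is not literally ignored; its estimated value simply drops, so the greedy step selects it less often over time.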
https://www.infoworld.com/article/3840580/alibaba-says-its-new-ai-model-rivals-deepseekss-r-1-openai...

