Intel, Ampere show running LLMs on CPUs isn't as crazy as it sounds

Wednesday 1 May 2024, 13:24, by TheRegister

If you lower your expectations, of course. Think more Llama2-7B, less GPT-4
Popular generative AI chatbots and services like ChatGPT or Gemini mostly run on GPUs or other dedicated accelerators, but as smaller models are more widely deployed in the enterprise, CPU-makers Intel and Ampere are suggesting their wares can do the job too – and their arguments aren't entirely without merit.…

Read more on TheRegister