News

This gain is made possible by TNG’s Assembly-of-Experts (AoE) method — a technique for building LLMs by selectively merging the weight tensors ...
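The snippet above describes merging weight tensors from parent models. As a rough illustration only (this is not TNG's actual AoE implementation, and the function, tensor names, and merge ratio below are all hypothetical), selective per-tensor merging can be sketched as a linear interpolation applied to a chosen subset of tensors:

```python
# Hedged sketch of selective weight-tensor merging between two parent models.
# NOT TNG's actual Assembly-of-Experts code; plain lists stand in for tensors.

def merge_state_dicts(parent_a, parent_b, select, alpha=0.5):
    """Merge two model state dicts tensor by tensor.

    Tensors whose names satisfy `select` are linearly interpolated
    (weight `alpha` on parent_a, `1 - alpha` on parent_b); all other
    tensors are taken unchanged from parent_a.
    """
    merged = {}
    for name, tensor_a in parent_a.items():
        tensor_b = parent_b[name]
        if select(name):
            merged[name] = [alpha * a + (1 - alpha) * b
                            for a, b in zip(tensor_a, tensor_b)]
        else:
            merged[name] = list(tensor_a)
    return merged

# Toy example: merge only the "expert" tensors, keep the rest from parent_a.
a = {"expert.w": [1.0, 2.0], "router.w": [0.0, 0.0]}
b = {"expert.w": [3.0, 4.0], "router.w": [9.0, 9.0]}
out = merge_state_dicts(a, b, select=lambda n: n.startswith("expert"))
```

Here `out["expert.w"]` is the midpoint `[2.0, 3.0]`, while `out["router.w"]` stays `[0.0, 0.0]` from the first parent, illustrating how only selected tensors are blended.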
The updated version of DeepSeek-R1 tied for first place with Google’s Gemini-2.5 and Anthropic’s Claude Opus 4 on the WebDev Arena leaderboard, which evaluates large language models (LLMs) on ...
German firm TNG has released DeepSeek-TNG R1T2 Chimera, an open-source variant twice as fast as its parent model thanks to a ...
DeepSeek quietly updates its R1 reasoning model
Chinese AI startup DeepSeek has released an update to its R1 reasoning model. The new version, named R1-0528, was published on developer platform ...
Say hello to DeepSeek-TNG R1T2 Chimera, a large language model built by German firm TNG Consulting, using three different ...
Chinese AI upstart MiniMax released a new large language model, joining a slew of domestic peers aiming to surpass DeepSeek in the field of reasoning AI.
Benchmark results cited by DeepSeek show that R1-0528 now surpasses Alibaba's Qwen 3 and matches the performance of OpenAI's and Google's best models.
A new report says that Huawei's CloudMatrix 384 outperforms Nvidia processors running DeepSeek R1, which is to be expected given the energy use involved.