Highlights:
- DeepSeek, operated by Hangzhou DeepSeek Artificial Intelligence Co. Ltd. and Beijing DeepSeek Artificial Intelligence Co. Ltd., made headlines last week after releasing its two primary reasoning models—DeepSeek-R1-Zero and DeepSeek-R1—on Hugging Face.
- DeepSeek’s breakthrough suggests that cutting-edge AI can be developed at a fraction of the typical cost.
Hugging Face Inc. is working to recreate DeepSeek’s R1 “reasoning model.” Researchers from Hugging Face want to reverse-engineer this new AI model from the Chinese startup.
Their effort follows R1’s surprising achievement of matching the performance of top AI models developed by U.S. firms—despite being built at a significantly lower cost. The Open-R1 project aims to create a fully open-source version of R1 and make all its components accessible to the AI community.
Elie Bakouch, a Hugging Face engineer leading the initiative, noted that while DeepSeek calls R1 open source because it carries no usage restrictions, it does not meet the standard definition of open-source software, since key components and the training data used to develop the model have not been publicly released.
Due to this lack of transparency, Bakouch explained, R1 remains a “black box” much like proprietary models such as OpenAI’s GPT series, preventing the AI community from fully understanding, improving, or building upon it.
DeepSeek, operated by Hangzhou DeepSeek Artificial Intelligence Co. Ltd. and Beijing DeepSeek Artificial Intelligence Co. Ltd., made headlines last week after releasing its two primary reasoning models—DeepSeek-R1-Zero and DeepSeek-R1—on Hugging Face. Alongside the release, the company also published a paper on arxiv.com detailing the models’ development process.
The R1 model has generated significant excitement for its ability to compete with advanced large language models like OpenAI’s GPT-4o and Anthropic PBC’s Claude, despite being developed at a total cost of just USD 5.6 million. In contrast, U.S. tech giants such as OpenAI, Google LLC, and Meta Platforms Inc. have invested billions in building their own models.
DeepSeek’s breakthrough suggests that cutting-edge AI can be developed at a fraction of the typical cost. This revelation sent shockwaves through financial markets earlier this week, causing stocks of major AI-focused U.S. companies to plummet. Nvidia Corp. saw its stock drop by 15%, Broadcom Inc. fell 16%, and Taiwan Semiconductor Manufacturing Corp. declined 14%.
Meanwhile, DeepSeek’s iOS chatbot app, which offers free access to the R1 model, skyrocketed to the top of the Apple App Store’s productivity category, becoming the No. 1 app seemingly overnight.
DeepSeek claims it developed R1 using fewer and significantly less advanced graphics processing units than those used to train models like GPT-4o and Llama 3. This has sparked debate over whether the multibillion-dollar investments in AI development are truly necessary. In several benchmark tests, R1 has demonstrated performance at par with—or even surpassing—OpenAI’s o1 reasoning model.
Reasoning models stand out for their ability to “fact-check” responses before generating them, reducing the “hallucinations” that often affect traditional large language models. While this additional verification step slightly slows response times, it makes reasoning models far more reliable in fields such as physics, mathematics, and other sciences.
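The verify-before-answer behavior described above can be illustrated abstractly. In the sketch below, `propose` and `check` are hypothetical stand-ins for a model’s draft-generation and self-verification passes — this is a toy illustration of the general idea, not DeepSeek’s actual implementation.

```python
# Minimal sketch of a reasoning model's generate-then-verify loop.
# `propose` and `check` are hypothetical stand-ins, not real model calls.

def propose(question: str, attempt: int) -> str:
    """Draft an answer; later attempts explore alternative reasoning paths."""
    drafts = {0: "5", 1: "4"}           # toy drafts for the question "2 + 2"
    return drafts.get(attempt, "4")

def check(question: str, answer: str) -> bool:
    """Re-derive the result and compare -- the 'fact-check' pass."""
    return answer == str(eval(question))  # toy arithmetic verifier

def answer_with_verification(question: str, max_attempts: int = 3) -> str:
    """Only emit a draft that passes verification; retry otherwise."""
    draft = ""
    for attempt in range(max_attempts):
        draft = propose(question, attempt)
        if check(question, draft):       # verified -> safe to return
            return draft
    return draft                         # fall back to the last draft

print(answer_with_verification("2 + 2"))  # first draft fails, second passes
```

The extra verification attempts are exactly what trades a little latency for reliability: each failed draft costs another pass before an answer is emitted.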
Hugging Face aims to replicate R1 to benefit the AI research community and plans to do so within just a few weeks. The company will utilize its dedicated research server, the “Science Cluster,” which runs on 768 Nvidia H100 GPUs, to reverse-engineer R1—analyzing its training data and underlying components.
The Open-R1 project is also calling on the broader AI research community to help reconstruct DeepSeek’s training datasets. The initiative has already attracted significant interest, with its GitHub page amassing over 100,000 stars within just three days of launch.
Bakouch emphasized that the project isn’t a zero-sum game but rather the beginning of something that could greatly benefit the broader AI industry. He expressed hope that their efforts would lay the foundation for a new generation of even more advanced open-source reasoning models. Successfully replicating R1, he explained, would give the entire AI community the opportunity to analyze its inner workings and build upon its capabilities.
“Open-source development immediately benefits everyone, including the frontier labs and the model providers, as they can all use the same innovations,” he added.