Highlights:
- DeepSeek-V3, an open-source LLM launched in December, serves as the foundation for DeepSeek-R1, the reasoning model that brought the Chinese AI lab into the spotlight earlier this year.
- Sources indicate that the latest DeepSeek-V3 version outperforms the original in programming tasks.
DeepSeek has released an enhanced version of its DeepSeek-V3 large language model under a newly adopted open-source license.
Software developer and blogger Simon Willison was the first to report the update, as DeepSeek made no official announcement. The new model's readme file, which code repositories typically use for explanatory notes, is currently empty.
Released in December, DeepSeek-V3 is an open-source LLM that serves as the foundation for DeepSeek-R1, the reasoning model that brought the Chinese AI lab into the spotlight earlier this year. While DeepSeek-V3 is a general-purpose model rather than one specifically optimized for reasoning, it is capable of solving some math problems and generating code.
Previously, the LLM was available under a custom open-source license. With the recent release, DeepSeek has transitioned to the widely adopted MIT License, allowing developers to use and modify the updated model for commercial projects with virtually no restrictions.
Notably, the latest DeepSeek-V3 release seems to offer improved capabilities and greater hardware efficiency compared to the original.
Most advanced LLMs require data-center-grade GPUs to operate. However, Awni Hannun, a research scientist at Apple Inc.'s machine learning research group, successfully ran the new DeepSeek-V3 release on a Mac Studio, where it generated output at approximately 20 tokens per second.
The Mac Studio used for testing had a high-end configuration, priced at USD 9,499. Running DeepSeek-V3 on the device required applying four-bit quantization, an optimization technique for LLMs that reduces memory usage and latency at the cost of some output accuracy.
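For illustration, the snippet below sketches how group-wise four-bit quantization works in general terms: each group of weights is mapped to 16 integer levels plus a per-group scale and offset, cutting memory roughly fourfold while introducing a small rounding error. This is a standalone NumPy toy, not DeepSeek's or MLX's actual implementation, and the group size is an arbitrary assumption.

```python
# Toy sketch of group-wise 4-bit weight quantization (illustrative only).
import numpy as np

def quantize_4bit(weights: np.ndarray, group_size: int = 64):
    """Map float weights to 4-bit integers (0..15), one scale/offset per group."""
    w = weights.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    scale = (w.max(axis=1, keepdims=True) - w_min) / 15.0  # 2**4 - 1 levels
    q = np.clip(np.round((w - w_min) / scale), 0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize_4bit(q, scale, w_min, shape):
    """Recover approximate float weights; the rounding error is the accuracy cost."""
    return (q.astype(np.float32) * scale + w_min).reshape(shape)

weights = np.random.randn(4096, 4096).astype(np.float32)
q, scale, w_min = quantize_4bit(weights)
restored = dequantize_4bit(q, scale, w_min, weights.shape)
print("max abs error:", np.abs(weights - restored).max())  # small but nonzero
```

Storing 4-bit codes instead of 32-bit floats is what makes a 671-billion-parameter model fit in a workstation's memory; the rounding error printed above is the "some output accuracy" being traded away.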
According to a benchmark shared in an X post, the latest DeepSeek-V3 demonstrates improved programming capabilities over the original release. The benchmark assesses a model's ability to generate Python and Bash code; the updated version scored around 60%, several percentage points ahead of the original DeepSeek-V3.
The model still lags behind DeepSeek-R1, the AI lab’s flagship LLM optimized for reasoning. Additionally, the latest DeepSeek-V3 release scored lower than Qwen-32B, another model designed for reasoning tasks.
Despite having 671 billion parameters, DeepSeek-V3 activates only about 37 billion of them per token thanks to its mixture-of-experts architecture. This design allows the model to run on less infrastructure than traditional LLMs that engage all of their parameters for every prompt. DeepSeek also claims the model is more efficient than DeepSeek-R1, reducing inference costs.
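The mechanism behind this selective activation is mixture-of-experts routing: a lightweight router scores a set of expert networks for each token, and only the top-scoring few are actually computed. The sketch below is a hypothetical toy illustration of that idea; the expert count, dimensions and top-k value are made up and do not reflect DeepSeek-V3's real configuration.

```python
# Toy mixture-of-experts routing: only TOP_K of NUM_EXPERTS run per token.
import numpy as np

rng = np.random.default_rng(0)
NUM_EXPERTS, TOP_K, DIM = 16, 2, 512  # illustrative values, not DeepSeek-V3's

# Each "expert" is a small feed-forward weight matrix; the router picks
# which few of them to apply for a given token.
experts = [rng.standard_normal((DIM, DIM)) * 0.02 for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS)) * 0.02

def moe_forward(token: np.ndarray) -> np.ndarray:
    logits = token @ router
    top = np.argsort(logits)[-TOP_K:]      # indices of the best-scoring experts
    exp_scores = np.exp(logits[top])
    gates = exp_scores / exp_scores.sum()  # softmax over the chosen experts only
    # Compute cost scales with TOP_K, not NUM_EXPERTS: most parameters stay idle.
    return sum(g * (token @ experts[i]) for g, i in zip(gates, top))

out = moe_forward(rng.standard_normal(DIM))
print(out.shape, f"active experts: {TOP_K}/{NUM_EXPERTS}")
```

Because compute grows with the number of experts selected rather than the total parameter count, a model of this kind can be far larger on disk than the slice of it that runs for any single token.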
The initial DeepSeek-V3 model was trained on a dataset of 14.8 trillion tokens using approximately 2.8 million GPU hours, considerably less than what cutting-edge LLMs typically require. To enhance its output quality, DeepSeek engineers fine-tuned the model on prompt responses generated by DeepSeek-R1.
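That fine-tuning step is a form of distillation: responses from a stronger teacher model become supervised targets for the base model. The toy sketch below illustrates the shape of such a pipeline; every name in it is a hypothetical placeholder rather than DeepSeek's actual training code.

```python
# Hypothetical sketch of distillation-style fine-tuning: a teacher model
# (DeepSeek-R1 in DeepSeek's case) answers prompts, and those answers
# become supervised targets for the student. All names are placeholders.
from dataclasses import dataclass

@dataclass
class Example:
    prompt: str
    response: str  # target text produced by the teacher

def teacher_generate(prompt: str) -> str:
    # Placeholder: a real pipeline would query the teacher model here.
    return f"[reasoned answer to: {prompt}]"

class ToyStudent:
    def train_step(self, prompt: str, target: str) -> None:
        # Placeholder: a real step minimizes cross-entropy over target tokens.
        print(f"fitting a {len(target)}-char response for: {prompt!r}")

prompts = ["Sum the first 100 integers.", "Write a Bash one-liner to count files."]
dataset = [Example(p, teacher_generate(p)) for p in prompts]

student = ToyStudent()
for ex in dataset:
    student.train_step(ex.prompt, ex.response)
```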