DeepSeek, a Chinese AI research lab backed by High-Flyer Capital Management, has unveiled DeepSeek-V3, a cutting-edge open-source AI model poised to redefine the landscape of AI innovation. With 671 billion parameters and impressive performance benchmarks, this model is a major leap forward for the global open-source AI ecosystem.
A Powerful Mixture-of-Experts Model
DeepSeek-V3 is a Mixture-of-Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated for each token it processes. The model was trained on a staggering 14.8 trillion tokens and, according to the company, performs strongly across AI tasks such as coding, language translation, and creative writing.
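To make the sparse-activation idea concrete, here is a minimal, hypothetical sketch of top-k expert routing in Python; the expert count, hidden size, and top-k value are illustrative placeholders, not DeepSeek-V3's actual configuration:

```python
# Minimal sketch of top-k Mixture-of-Experts routing (illustrative only;
# the sizes below are placeholders, not DeepSeek-V3's real configuration).
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # hypothetical number of experts
TOP_K = 2         # experts activated per token
D_MODEL = 16      # hypothetical hidden size

# Each "expert" is reduced to a single weight matrix for clarity.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(NUM_EXPERTS)]
router_w = rng.standard_normal((D_MODEL, NUM_EXPERTS)) * 0.02

def moe_layer(token: np.ndarray) -> np.ndarray:
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = token @ router_w              # score every expert for this token
    top_idx = np.argsort(logits)[-TOP_K:]  # keep only the k best-scoring experts
    gate = np.exp(logits[top_idx])
    gate /= gate.sum()                     # normalize gate weights over the chosen experts
    # Only the selected experts' parameters participate in this token's computation,
    # which is why far fewer parameters are "active" than the model's total count.
    return sum(g * (token @ experts[i]) for g, i in zip(gate, top_idx))

print(moe_layer(rng.standard_normal(D_MODEL)).shape)  # (16,)
```

The routing step captures the property highlighted above: only a fraction of the total parameters (37 billion of 671 billion in DeepSeek-V3's case) does work for any given token.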
Released on GitHub, alongside a comprehensive technical paper, DeepSeek-V3 demonstrates performance comparable to some of the most advanced closed-source models, including OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet.
Performance Benchmarks: A Leader in Open-Source AI
DeepSeek-V3 outperformed Meta’s 405-billion-parameter Llama 3.1 model on most benchmarks. The company also reports that the model is three times faster than its predecessor, DeepSeek-V2, reaching a throughput of 60 tokens per second.
Key achievements of DeepSeek-V3 include:
- Outperforming Claude 3.5 Sonnet on multiple benchmarks.
- Dominating the Aider Polyglot test, which measures how well a model can write new code and integrate it into existing code.
- Matching leading closed-source models in programming challenges hosted on platforms like Codeforces.
Despite these successes, DeepSeek-V3 trails OpenAI’s o1 reasoning model in some domains, such as the GPQA Diamond benchmark, where it scored 59.1% compared to o1’s 76%.

Cost Efficiency and Accessibility
DeepSeek-V3 is not only powerful but also cost-efficient. The company has announced that API pricing will remain aligned with DeepSeek-V2 until February 8, 2025. After that date, usage will cost:
- $0.27/million tokens for input.
- $1.10/million tokens for output.
This competitive pricing positions DeepSeek-V3 as one of the most affordable large models in the market.
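As an illustration of what those rates imply in practice, the short sketch below estimates a bill from the per-million-token prices listed above; the workload volumes are hypothetical example values, not figures from DeepSeek:

```python
# Back-of-envelope API cost estimate using the post-February 8, 2025 rates
# quoted above. The workload volumes are hypothetical example values.
INPUT_PRICE_PER_TOKEN = 0.27 / 1_000_000    # USD per input token
OUTPUT_PRICE_PER_TOKEN = 1.10 / 1_000_000   # USD per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a given token volume."""
    return input_tokens * INPUT_PRICE_PER_TOKEN + output_tokens * OUTPUT_PRICE_PER_TOKEN

# Example: 50 million input tokens and 10 million output tokens in a month.
print(f"${estimate_cost(50_000_000, 10_000_000):,.2f}")  # -> $24.50
```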
Technological and Ethical Considerations
Training on a Budget
DeepSeek’s ability to train a model of this scale on Nvidia H800 GPUs in roughly two months, on a budget of about $5.5 million, is a testament to its innovative approach. By comparison, models like OpenAI’s GPT-4 reportedly required clusters of some 16,000 GPUs and significantly higher expenditures.
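Taking the reported figures at face value, a quick back-of-envelope calculation, using only the $5.5 million budget and the 14.8 trillion training tokens mentioned earlier, shows what that spend works out to per token of training data:

```python
# Rough training-cost-per-token estimate from the figures reported in this article:
# a $5.5 million budget spread over 14.8 trillion training tokens.
BUDGET_USD = 5.5e6
TRAINING_TOKENS = 14.8e12

cost_per_million_tokens = BUDGET_USD / TRAINING_TOKENS * 1e6
print(f"~${cost_per_million_tokens:.2f} per million training tokens")  # ~$0.37
```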
Limitations and Ethical Constraints
DeepSeek-V3’s political responses are notably constrained due to Chinese internet regulations mandating adherence to “core socialist values.” For instance, the model avoids answering politically sensitive queries, such as those related to the Tiananmen Square incident.
Open-Source Rivalry Intensifies
DeepSeek-V3’s release intensifies the competition between Eastern and Western AI models. For instance:
- Alibaba’s Qwen 2.5 series matches GPT-4o in code generation benchmarks like EvalPlus and BigCodeBench.
- DeepSeek’s previous model, V2.5-1210, showcased strong results and paved the way for V3’s enhanced capabilities.
These advancements indicate a growing trend of open-source models from China challenging Western dominance in AI innovation.
The Road Ahead
DeepSeek, spearheaded by High-Flyer Capital Management, remains focused on pushing the boundaries of AI. With its robust infrastructure, including server clusters equipped with 10,000 Nvidia A100 GPUs, the organization aims to democratize access to superintelligent AI.
Founder Liang Wenfeng’s vision of overcoming the “temporary moat” of closed-source models is already materializing as DeepSeek-V3 positions itself as a credible challenger in the global AI landscape.