DeepSeek-R1 is a reasoning-focused AI model trained with large-scale reinforcement learning (RL). Its RL-only precursor, DeepSeek-R1-Zero, skips supervised fine-tuning as a preliminary step, and reasoning behaviors emerged naturally during that training. DeepSeek-R1 builds on this approach to tackle complex problems across a range of domains.
DeepSeek-R1 has 671B total parameters, of which 37B are activated per token, so the compute spent on each token is well below that of a dense model of the same size. It performs strongly on mathematical problem-solving, code generation, and logical reasoning tasks.
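As a rough illustration of what the two parameter counts imply, the short sketch below computes the share of the model that is active for a single token. The constant and function names are hypothetical and chosen only for this example; the two figures are the ones quoted above.

```python
# Parameter counts quoted in the text (671B total, 37B activated per token).
TOTAL_PARAMS = 671e9
ACTIVE_PARAMS_PER_TOKEN = 37e9


def active_fraction(active: float, total: float) -> float:
    """Return the share of parameters used to process one token."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS_PER_TOKEN, TOTAL_PARAMS)
print(f"{fraction:.1%} of parameters are active per token")  # prints 5.5% ...
```

In other words, roughly one parameter in eighteen takes part in any single token, which is why the per-token cost tracks the 37B figure rather than the 671B one.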
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero is trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, and it demonstrates remarkable reasoning performance. Through RL, numerous powerful and interesting reasoning behaviors emerged naturally in DeepSeek-R1-Zero. However, the model also encounters challenges such as endless repetition, poor readability, and language mixing.