Grok 3 combines strong reasoning capabilities with extensive pretraining knowledge. Trained on xAI’s Colossus supercluster with ten times the compute used for previous state-of-the-art models, Grok 3 shows marked improvements in reasoning, mathematics, coding, world knowledge, and instruction following. Its reasoning has been refined through large-scale reinforcement learning, which lets the model think through complex problems for extended periods, correct its errors, explore alternative solutions, and deliver accurate answers. Grok 3 has also achieved an Elo score of 1402 in the Chatbot Arena, a leaderboard built from human preference votes, which complements its results on academic benchmarks with evidence of strong performance in real-world use.
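To give a feel for what an Elo figure such as 1402 means, here is a minimal sketch of the standard Elo expected-score formula. The rival rating of 1352 is a made-up number used only for illustration, and the Chatbot Arena's actual rating procedure may differ in detail from this textbook formula.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the textbook Elo formula.

    A value of 0.5 means the two are evenly matched; values above 0.5
    mean A is expected to be preferred more often than B.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


GROK_3_RATING = 1402        # figure quoted in the article
HYPOTHETICAL_RIVAL = 1352   # assumption: invented rating, 50 points lower

p = expected_score(GROK_3_RATING, HYPOTHETICAL_RIVAL)
print(f"Expected score against the hypothetical rival: {p:.3f}")
```

Under this formula a 50-point lead corresponds to an expected score of roughly 0.57, so even a clear rating gap still implies that the lower-rated model is preferred in a sizeable share of head-to-head comparisons.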
Introducing Grok 3 Mini: Cost-Efficient AI Reasoning
Alongside Grok 3, xAI has introduced Grok 3 Mini, a model aimed at cost-effective AI reasoning. Grok 3 Mini is designed to be more resource-efficient while retaining substantial reasoning capabilities, making advanced AI accessible to a broader audience at lower cost.
Advanced Reasoning with Test-Time Compute
A standout feature of Grok 3 is its ability to spend additional compute on reasoning at test time. Through reinforcement learning at an unprecedented scale, Grok 3 has refined its chain-of-thought processes, enabling advanced reasoning in a data-efficient manner. This approach allows the model to tackle complex problems by considering multiple strategies, backtracking to correct errors, and simplifying steps to arrive at accurate solutions. Depending on the complexity of the task, Grok 3 may reason for anywhere from a few seconds to several minutes before it responds.
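The idea of trading more test-time compute for better answers can be illustrated with a classical analogy. The sketch below is not how Grok 3 works internally; it is a toy depth-first search (the function name `solve_with_budget` and the puzzle are hypothetical) that tries a strategy, backtracks at dead ends, and gives up once a step budget is exhausted. It assumes all input numbers are positive.

```python
from typing import List, Optional, Tuple


def solve_with_budget(
    numbers: List[int], target: int, budget: int
) -> Tuple[Optional[List[int]], int]:
    """Search for a subset of `numbers` summing to `target`.

    The search stops once `budget` search steps have been used, which
    stands in for a limit on test-time compute.
    Returns (solution or None, steps actually used).
    """
    steps = 0

    def search(index: int, remaining: int, chosen: List[int]) -> Optional[List[int]]:
        nonlocal steps
        if steps >= budget:
            return None  # out of "thinking time"
        steps += 1
        if remaining == 0:
            return list(chosen)  # candidate verified: it sums to the target
        if index == len(numbers) or remaining < 0:
            return None  # dead end, so the caller backtracks
        # Strategy 1: include the current number.
        chosen.append(numbers[index])
        found = search(index + 1, remaining - numbers[index], chosen)
        if found is not None:
            return found
        # Backtrack, then try strategy 2: skip the current number.
        chosen.pop()
        return search(index + 1, remaining, chosen)

    return search(0, target, []), steps


puzzle = [8, 6, 7, 5, 3, 10, 9]
for budget in (5, 50):
    solution, used = solve_with_budget(puzzle, target=15, budget=budget)
    print(f"budget={budget}: solution={solution}, steps used={used}")
```

With the small budget the search runs out of steps before it finds an answer, while the larger budget leaves room to abandon the first unsuccessful branch and reach a valid subset. The analogy is loose, but it shows why a harder problem can warrant a longer deliberation.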
Benchmark Performance: Setting New Standards
Grok 3’s capabilities have been validated through rigorous testing across various benchmarks. On the 2025 American Invitational Mathematics Examination (AIME), Grok 3 achieved a remarkable 93.3% accuracy. In graduate-level expert reasoning assessments (GPQA), it scored 84.6%, and on LiveCodeBench for code generation and problem-solving, it attained a 79.4% success rate. These results not only highlight Grok 3’s proficiency in academic and practical domains but also position it ahead of competitors like OpenAI’s GPT-4o and DeepSeek’s V3 in specific areas.
DeepSearch: Elevating Information Retrieval
In tandem with Grok 3, xAI has launched DeepSearch, an AI agent engineered to revolutionize information retrieval. DeepSearch is designed to navigate the vast expanse of human knowledge, synthesizing key information, resolving conflicting data, and distilling clarity from complexity. Whether users seek real-time news updates, personal advice, or in-depth scientific research, DeepSearch aims to provide concise and comprehensive reports, surpassing traditional browser searches in both speed and depth.
Access and Future Developments
Grok 3 is currently being rolled out to users, with an early preview of its reasoning capabilities available. Access is provided to X Premium and Premium+ subscribers on platforms like X and Grok.com. Premium+ users gain additional benefits, including higher usage limits and advanced features such as the “Think” mode and DeepSearch. Furthermore, xAI plans to release Grok 3 and Grok 3 Mini through its API platform in the coming weeks, extending access to both standard and reasoning models. Enterprise partners will also have the opportunity to integrate DeepSearch into their systems via the API.
A Call to Innovators
Since the launch of Grok 1 in November 2023, xAI’s dedicated team has propelled the company to the forefront of AI innovation. With Grok 3, xAI continues to advance core reasoning capabilities, leveraging the expanded Colossus supercluster to push the boundaries of what’s possible in artificial intelligence. As xAI looks to the future, it invites passionate individuals to join its mission of building AI that serves humanity’s best interests.





