2025.10.27

MiniMax M2 & Agent: Ingenious in Simplicity

https://filecdn.minimax.chat/public/92084bb1-5fb4-44d9-bab1-ce7f2de81a62.png

From Day 1 of our founding, we have been committed to the vision of "Intelligence with Everyone."


Today, we are officially open-sourcing and launching MiniMax M2, a model born for Agents and code. At only 8% of the price of Claude Sonnet and twice the speed, it's available for free for a limited time!


Our team has been building a variety of Agents to help tackle the challenges of our company's rapid growth. These Agents are completing increasingly complex tasks, from analyzing online data and researching technical issues to daily programming, processing user feedback, and even screening HR resumes. Working alongside our team, they are driving the company's development and building an AI-native organization, one evolving from developing AGI to advancing together with AGI. This has strengthened our conviction that AGI is a force of production and that Agents are an excellent vehicle for it, marking the shift from the simple Q&A of conversational assistants to the independent completion of complex tasks.


However, we found that no single model could fully meet our needs for these Agents. The challenge lies in finding a model that strikes the right balance between performance, price, and inference speed, an almost "impossible triangle." The best overseas models perform well but are expensive and relatively slow; domestic models are cheaper, but lag in both performance and speed. As a result, existing Agent products tend to be either very expensive or too slow to achieve good results: many Agent subscriptions cost tens or even hundreds of dollars per month, and a single task can take hours to complete.


We have been exploring whether it's possible to create a model that achieves a better balance of performance, price, and speed, thereby allowing more people to benefit from the intelligence boost of the Agent era and continuing our vision of "Intelligence with Everyone." This model needs a diverse range of capabilities, including programming, tool use, logical reasoning, and knowledge, all while having extremely fast inference speeds and very low deployment costs. To this end, we developed and have now open-sourced MiniMax M2.


Let's first look at the three most critical capabilities for an Agent: programming, tool use, and deep search. We compared M2 with several mainstream models:

As you can see, M2's abilities in tool use and deep search are very close to those of the best overseas models. While it still trails the top overseas models in programming, it is already among the best in the domestic market.


There are several algorithmic and cognitive advancements behind this, which we will share in due course. But the core principle is simple: to create a model that meets our requirements, we must first be able to use it ourselves. To achieve this, our developers, including those in business and backend teams, worked alongside algorithm engineers, investing significant effort in building environments and evaluations, and are increasingly integrating it into their daily work.


Once we mastered these complex scenarios, we found that by transferring our accumulated methods to traditional large model tasks like knowledge and math, we could naturally achieve excellent results. For example, on the popular Artificial Analysis benchmark, which integrates 10 test tasks, our model ranked in the top five globally:

We have set the API price for the model at $0.30 (¥2.1) per million input tokens and $1.20 (¥8.4) per million output tokens, while providing an online inference service running at around 100 TPS (tokens per second) and improving rapidly. This price is 8% of Claude 3.5 Sonnet's, with nearly double the inference speed.
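To make these prices concrete, here is a minimal sketch that estimates the cost of a single request from its token counts, using the per-million-token rates quoted above. It assumes simple linear per-token billing with no other fees; the function name and the 20k/2k example figures are illustrative, not from the post.

```python
# Per-million-token prices quoted in this post (USD).
INPUT_PRICE_PER_M = 0.30   # $ per 1M input tokens
OUTPUT_PRICE_PER_M = 1.20  # $ per 1M output tokens

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming linear per-token billing."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: an agent turn with a 20k-token context and a 2k-token reply.
cost = request_cost_usd(20_000, 2_000)
print(f"${cost:.4f}")  # 20k * $0.30/M + 2k * $1.20/M = $0.0084
```

At these rates, even a long agent session with hundreds of thousands of context tokens stays well under a dollar.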


Over the past weekend, many enthusiastic developers at home and abroad have run extensive tests with us. To make it easier for everyone to explore the model's capabilities, we are extending the free trial period to November 7th, 00:00 UTC. We have also open-sourced the complete model weights on Hugging Face; interested developers can deploy the model themselves, with support already available in SGLang and vLLM.


We believe this combination of price and inference speed is an excellent choice among the current mainstream models. Let's analyze this from two perspectives.


First, price versus performance. A suitable model should have good performance and be affordable, falling into the green area in the chart below. Here, we use the average score from the 10 test sets on Artificial Analysis to represent performance:

Next is price versus inference speed. When deploying a model, there is often a trade-off where a slower inference speed can lead to a lower price. An ideal model should be affordable and fast. We have compared several representative models:

In addition to this analysis of standard data, we also conducted 1-on-1 comparisons of practical performance against Claude 3.5 Sonnet and several open-source models:

These test sets will be released on GitHub this week.


To make these Agent capabilities easier for everyone to use, we have launched our M2-powered Agent product in China and upgraded the overseas version. MiniMax Agent offers two modes:



Thanks to M2's fast inference, the M2-powered Agent is not only cost-effective but also noticeably more fluid when completing complex tasks.


We are currently offering MiniMax Agent for free, until our servers can't keep up.

