Twelve days. Four models. Each from a different Chinese company. If you thought the AI race was only between OpenAI, Anthropic, and Google, it’s time to think again.
In late April and early May 2026, four large language models were released by Chinese companies, all competing with Western frontier models — but with significantly lower inference costs.
Four Models, Twelve Days
GLM-5.1 (Z.ai / Zhipu AI)
Zhipu AI arrived with GLM-5.1. This model performs remarkably well on coding and reasoning benchmarks. The interesting thing is that Z.ai is focusing on the developer ecosystem — solid SDKs, comprehensive documentation, and an API compatible with the OpenAI format. This means if you’re currently using the OpenAI API, migrating to GLM-5.1 is straightforward.
MiniMax M2.7
MiniMax, previously more active in the Chinese domestic market, has entered the global arena with M2.7. Its strength lies in multimodal tasks — text, image, and audio. Its inference cost is roughly one-third that of GPT-5.5 for comparable tasks.
Kimi K2.6 (Moonshot AI)
Kimi K2.6 comes from Moonshot AI, the company previously known for its models’ large context windows. K2.6 excels at agentic engineering — it can manage complex tool chains without getting lost along the way. This is a crucial capability for building sophisticated agents.
DeepSeek V4
And of course DeepSeek, which, after the success of V3 and R1, has now arrived with V4. DeepSeek V4 performs roughly on par with GPT-5.5 in mathematical reasoning and coding. But the key point is this: DeepSeek publishes its training methods, which means the open-source community can learn from them and build better models.
A Common Ceiling
What stands out is that all four models have reached roughly the same capability ceiling — especially in agentic engineering. When you compare benchmarks, the differences have become very small.
What does this mean? It means we’re reaching a point where “model quality” is no longer the sole differentiator. When all models perform a given task at roughly the same quality level, other factors become important:
- Cost: If two models do the same job, you pick the cheaper one
- Speed: Lower latency = better user experience
- Ecosystem: SDK, documentation, support, community
- Customizability: Fine-tuning, RAG, integration with existing systems
Apple and Freedom of Model Choice
A related piece of news: Apple has announced that iOS 27 will allow users to choose third-party AI models. Instead of Siri or Apple’s default model, you can select Claude, GPT, or even a Chinese model as your phone’s AI assistant.
This Apple decision is significant because:
- The market becomes more competitive — every model must work to attract users
- Chinese models with lower costs can capture market share
- Users have more choice
- Pressure on Apple to improve its own model increases
Subquadratic — A New Architecture with 12 Million Tokens
Another important technical development: the Subquadratic architecture that has pushed context windows to 12 million tokens. To grasp the scale of this number:
- An average book is about 70,000 tokens
- 12 million tokens means putting roughly 170 books in context at once
- Or feeding an entire large project’s codebase to the model
The problem with standard Transformers is that the computational cost of Attention grows quadratically with context length. Double the context, and cost quadruples. The Subquadratic architecture solves this — cost grows much less than quadratically.
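To make the scaling concrete, here is a toy comparison in Python. Since the article doesn’t specify the architecture’s exact complexity, n·log(n) stands in as just one illustrative example of subquadratic growth:

```python
import math

def attention_cost(n):
    """Standard self-attention: every token attends to every other token, so cost ~ n^2."""
    return n * n

def subquadratic_cost(n):
    """Illustrative subquadratic growth (n log n); the actual
    Subquadratic architecture's scaling is not specified here."""
    return n * math.log2(n)

# Doubling the context quadruples the quadratic cost...
assert attention_cost(2_000_000) / attention_cost(1_000_000) == 4
# ...but only slightly more than doubles the n log n cost.
ratio = subquadratic_cost(2_000_000) / subquadratic_cost(1_000_000)
print(round(ratio, 2))  # 2.1
```

At 12 million tokens the gap becomes decisive: a quadratic model pays 144× the cost of a 1-million-token run, while an n·log(n) model pays only about 14×.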
This means much larger context windows without dramatic cost increases become feasible. For use cases like legal document analysis, codebase review, or scientific research, this is a genuine breakthrough.
Who Benefits from This Competition?
The simple answer: developers and users like us.
When many models of similar quality exist:
- Prices come down
- Innovation accelerates
- Open-source models improve
- Dependency on any single company decreases
But there’s also a challenge: choosing becomes harder. When you have 10 good models, which one do you pick? The answer depends on your use case:
- Cost matters most? -> Chinese models (DeepSeek V4, Kimi K2.6)
- Ecosystem and support matter? -> OpenAI or Anthropic
- Independence and control matter? -> Open-source models (DeepSeek, Llama)
- Need a huge context window? -> Models with Subquadratic architecture
Conclusion
Four Chinese models in twelve days. This isn’t just a number — it’s a sign of a fundamental shift in the AI market. The era of monopoly is over. The era of diversity and competition has begun.
For us developers, this is the best time. More tools, lower costs, and more freedom of choice. We just need to learn how to choose wisely — and that itself is a new skill.