AI Tools
4 min read

Mystery Solved: Pony Alpha Is Zhipu's GLM-5 — And It's a Beast

The mysterious free AI model that rivalled Claude Opus has been unmasked. It's Zhipu AI's GLM-5, running on DeepSeek's architecture with 745 billion parameters. Here's what we know.

Remember that mysterious "Pony Alpha" model that appeared on OpenRouter last week? The one offering Claude Opus-level performance completely free, processing 40 billion tokens on day one?

We now know what it is: Zhipu AI's GLM-5.

And the specs are wild.

The Unmasking

The identity was confirmed through multiple channels:

  1. System prompt leak: The model's system prompt explicitly revealed its GLM identity
  2. "Fingerprint" tests: The AI community identified it through GLM-specific logical quirks — apparently, if you ask it about "heating vegetable oil in a pan," it gives a characteristic weird answer that's unique to the GLM family
  3. Timing: Release coincided perfectly with Zhipu's announced GLM-5 launch window around Chinese New Year

The "Pony" name? A nod to the Chinese zodiac, as many suspected.

The Technical Specs Are Insane

GLM-5 isn't just an incremental update. It's a generational leap:

SpecGLM-5
Total Parameters745 billion
ArchitectureDeepSeek Sparsity Attention (DSA)
Experts256 total, 8 active per inference
Active Parameters~44 billion per query
Sparsity Rate5.9%
Context Window202K tokens
MultimodalYes (including video understanding)

That's twice the parameters of GLM-4.7. And they're using DeepSeek's architecture.

Why DeepSeek's Architecture Matters

The decision to adopt DeepSeek's Sparsity Attention architecture is strategic genius.

DeepSeek-V3 proved you can run massive models efficiently by only activating a small fraction of parameters per query. GLM-5 takes this further: 745B total parameters, but only ~44B active at any time (5.9% sparsity).

This means:

  • Lower inference costs — you're not paying to run 745B parameters on every query
  • Faster responses — smaller active compute footprint
  • Existing tooling works — vLLM, SGLang, and other inference frameworks already support this architecture

For enterprise users, this dramatically reduces deployment barriers. You don't need to build custom infrastructure.

Multimodal: Filling DeepSeek's Gap

One area where GLM-5 goes beyond DeepSeek: multimodal capabilities.

DeepSeek-V3 is text-only. GLM-5 adds video understanding and other multimodal features, positioning it for the 2026 market where pure text models are increasingly seen as limited.

The Market Reaction

Zhipu's stock has gone vertical:

  • +200% in recent weeks
  • Market cap: 150 billion HKD (roughly $19B USD)
  • 3x their IPO valuation

This puts Zhipu firmly in the top tier of Chinese AI companies. They're no longer an underdog — they're a frontrunner.

What This Means for the AI Landscape

A few implications:

1. China's AI Gap Is Closing Fast

Six months ago, the narrative was that Chinese AI lagged behind OpenAI and Anthropic by 12-18 months. That gap is now measured in weeks, maybe days.

GLM-5 matching Claude Opus performance isn't a fluke. It's a signal.

2. DeepSeek's Architecture Is Becoming Standard

When your competitor adopts your architecture, you know you built something good. DeepSeek's sparse attention approach is now the baseline for efficient large models.

3. Free Frontier Models Are a Strategy

Zhipu didn't release GLM-5 for free by accident. They wanted:

  • Massive adoption data (40B+ tokens)
  • Global buzz and benchmark validation
  • Developer mindshare before official launch

Expect more "mystery model" marketing from AI labs.

4. Enterprise AI Just Got Cheaper

If GLM-5 can match Claude Opus at a fraction of the inference cost (thanks to sparsity), enterprise AI deployments become much more economical. The $20/user/month AI assistant might become $5.

Should You Use It?

Now that we know it's GLM-5, the calculus changes slightly:

For development and experimentation: Absolutely. It's legitimately capable.

For production: Wait for official release with proper SLAs and support.

For sensitive data: Still no. The OpenRouter deployment logs everything, and we don't know the full data handling pipeline.

For Chinese market applications: This might become your default choice if you need a model that plays well with Chinese regulations.

The Bottom Line

The AI race just got more interesting. China has a legitimate Claude Opus competitor, it runs on efficient architecture that's already supported by mainstream tools, and they gave it away for free to prove the point.

The days of American AI supremacy aren't over, but they're certainly not guaranteed.


GLM-5's official release should come soon, likely with API pricing and enterprise tiers. I'll update when that drops.

▸ TAGS
#AI#LLM#GLM-5#Zhipu#DeepSeek#open-source#China
▸ STAY IN THE LOOP

Weekly. No spam. No fluff.