# MiniMax M3

> **TL;DR.** MiniMax M3 is a 428B-parameter Mixture-of-Experts large language model from Shanghai-based [MiniMax](https://www.minimax.io/) (稀宇科技), activating 22B parameters per token.

- **Category:** AI / Models
- **Stage:** emergent
- **Age:** 13 days
- **Origin date:** 2026-06-01
- **First detected:** 2026-06-01
- **Canonical URL:** https://earlyterms.com/term/minimax-m3
- **Sources:** 7 primary URLs

## Definition

MiniMax M3 is a 428B-parameter Mixture-of-Experts large language model from Shanghai-based [MiniMax](https://www.minimax.io/) (稀宇科技), activating 22B parameters per token. It is the first open-weight model to combine frontier-level coding, a 1M-token context window, and native multimodal input in a single architecture.
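To make the sparsity of the Mixture-of-Experts design concrete, the short sketch below works out what share of the model is used for any single token. The two figures come from the definition above; the function and constant names are illustrative only.

```python
# Figures from the definition above; names are illustrative, not from MiniMax.
TOTAL_PARAMS_B = 428   # total parameters, in billions
ACTIVE_PARAMS_B = 22   # parameters activated per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters used to process a single token."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share per token: {fraction:.1%}")  # prints "Active share per token: 5.1%"
```

Roughly one parameter in twenty is exercised per token, which is why per-token compute tracks the 22B figure while memory requirements track the 428B figure.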

Released June 1, 2026, M3 introduces MSA (MiniMax Sparse Attention), a sparse attention mechanism that cuts per-token compute at a one-million-token context to one-twentieth of what its predecessor, M2.7, required. On [SWE-Bench Pro](https://github.com/MiniMax-AI/MiniMax-M3) it scores 59.0%, edging past GPT-5.5 (58.6%) and Gemini 3.1 Pro. The weights were open-sourced on Hugging Face shortly after the API launch.

## Analogy

Think of MSA as a librarian who scans the index before pulling shelves, skipping 95% of the stacks to fetch only the relevant pages.
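The librarian analogy can be sketched as a generic block-selection sparse attention step. This entry does not describe how MSA works internally, so the code below is an assumption-laden illustration of the general idea (score cheap block summaries first, then attend only within the best blocks), not MiniMax's implementation. The function name, the mean-key summary, and the toy data are all hypothetical.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def block_sparse_attention(query, keys, values, block_size, keep_blocks):
    """Attend only to the blocks whose summary best matches the query.

    Generic illustration, NOT the actual MSA algorithm.
    """
    # Split token positions into contiguous blocks (the "shelves").
    starts = range(0, len(keys), block_size)
    blocks = [list(range(s, min(s + block_size, len(keys)))) for s in starts]

    # "Index" step: score each block by the query's match with its mean key.
    def summary_score(block):
        mean_key = [
            sum(keys[i][d] for i in block) / len(block)
            for d in range(len(query))
        ]
        return dot(query, mean_key)

    ranked = sorted(blocks, key=summary_score, reverse=True)
    selected = sorted(i for block in ranked[:keep_blocks] for i in block)

    # "Fetch" step: scaled softmax attention over the selected tokens only.
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, keys[i]) * scale for i in selected]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    output = [
        sum(w * values[i][d] for w, i in zip(weights, selected)) / total
        for d in range(len(values[0]))
    ]
    return output, selected


# Toy data: 40 tokens in blocks of 2; only positions 14 and 15 match the query.
keys = [[1.0, 0.0] if i in (14, 15) else [0.0, 1.0] for i in range(40)]
values = [[float(i)] for i in range(40)]
query = [1.0, 0.0]

output, selected = block_sparse_attention(query, keys, values, block_size=2, keep_blocks=1)
print(selected, len(selected) / len(keys))
```

With 20 blocks and one block kept, the full attention step touches 2 of 40 tokens, which mirrors the 95% skip rate in the analogy. The summary pass still visits every block, so real designs depend on that pass being much cheaper than full attention.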

## Why it's emerging now

MiniMax M3 landed on June 1 as the first open-weight model pairing frontier coding (59.0% on SWE-Bench Pro) with a 1M-token context window and native multimodal input, priced at $0.30 per million input tokens, roughly 15x cheaper than Claude Opus 4.7. The weights followed on Hugging Face by June 13, removing the last barrier to self-hosted deployment.
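As a rough worked example of the pricing claim, the sketch below computes the input cost of filling the full context once and the per-million price implied by the "roughly 15x" ratio. The implied figure is derived from that ratio alone and is not a quoted price for any other model. Output-token pricing is not stated in this entry and is left out.

```python
# Input price and context size from the text above; names are illustrative.
PRICE_PER_M_INPUT_USD = 0.30
CONTEXT_TOKENS = 1_000_000
PRICE_RATIO = 15  # the "roughly 15x cheaper" comparison, taken at face value


def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at a per-million price."""
    return tokens / 1_000_000 * price_per_million


full_context_cost = input_cost_usd(CONTEXT_TOKENS, PRICE_PER_M_INPUT_USD)
implied_comparison_price = PRICE_PER_M_INPUT_USD * PRICE_RATIO

print(f"One full 1M-token prompt: ${full_context_cost:.2f}")
print(f"Implied comparison price: ${implied_comparison_price:.2f} per 1M input tokens")
```

At the stated rate, a single prompt that fills the entire context window costs about thirty cents in input tokens.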

## Related terms

- *competitor:* deepseek-v4
- *competitor:* kimi-k2-6
- *competitor:* glm-5-1
- *competitor:* mimo-code
- *related:* agentic-coding
- *related:* context-window
- *related:* managed-agents
- *parent:* MiniMax
- *related:* MoE (Mixture of Experts)
- *related:* SWE-Bench
- *alias:* 稀宇科技

## Sources

1. [MiniMax M3 — Hugging Face model card (428B MoE, 22B active, MSA, 1M context)](https://huggingface.co/MiniMaxAI/MiniMax-M3)
2. [MarkTechPost — MiniMax M3 launch coverage (Jun 1, 2026)](https://www.marktechpost.com/2026/06/01/minimax-releases-minimax-m3-with-msa-architecture-supporting-1m-token-context-native-multimodality-and-agentic-coding/)
3. [NVIDIA Developer Blog — MiniMax M3 deployment guide](https://developer.nvidia.com/blog/deploy-long-context-reasoning-and-agentic-workflows-with-minimax-m3-on-nvidia-accelerated-infrastructure)
4. [The Decoder — Open-weight 1M-context model analysis](https://the-decoder.com/minimax-m3-open-weight-model-with-a-million-token-context-challenges-proprietary-leaders/)
5. [Artificial Analysis — M3 benchmark review and caveats](https://artificialanalysis.ai/articles/minimax-m3)
6. [Wikipedia — MiniMax Group (company background, HK IPO Jan 2026)](https://en.wikipedia.org/wiki/MiniMax_Group)
7. [GitHub — MiniMax-AI/MiniMax-M3 (official repo, created Jun 1 2026)](https://github.com/MiniMax-AI/MiniMax-M3)

---
_Generated by EarlyTerms · https://earlyterms.com/term/minimax-m3_