# Zaya1-8B

> **TL;DR.** Zaya1-8B is an open-weight mixture-of-experts reasoning model from [Zyphra](https://www.

- **Category:** AI / Open-Weight Models / Efficient Inference
- **Stage:** validating
- **Age:** 41 days
- **Origin date:** 2026-05-06
- **First detected:** 2026-05-07
- **Canonical URL:** https://earlyterms.com/term/zaya1-8b
- **Sources:** 7 primary URLs

## Definition

Zaya1-8B (styled ZAYA1-8B by Zyphra) is an open-weight mixture-of-experts (MoE) reasoning model from [Zyphra](https://www.zyphra.com/post/zaya1-8b). It activates only 760 million of its 8.4 billion parameters per forward pass, roughly 9% of the total, and Zyphra reports frontier-level math and coding results at a fraction of the compute cost. The company describes this design goal as maximum intelligence density per active parameter.
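The sparsity figures in the definition can be checked with a few lines of arithmetic. The sketch below uses only the two parameter counts quoted above; the function names are illustrative, and the "2 FLOPs per active parameter per token" estimate is a common rule of thumb for forward passes, not a figure published by Zyphra.

```python
# Parameter counts quoted in the definition above.
TOTAL_PARAMS = 8.4e9    # all experts plus shared weights
ACTIVE_PARAMS = 760e6   # weights actually used for one token


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used in a single forward pass."""
    return active / total


def approx_forward_flops_per_token(active: float) -> float:
    """Rule-of-thumb estimate (an assumption, not a Zyphra figure):
    about 2 floating-point operations per active parameter per token."""
    return 2.0 * active


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
ratio = TOTAL_PARAMS / ACTIVE_PARAMS

print(f"active share of parameters: {fraction:.1%}")
print(f"total-to-active ratio: {ratio:.1f}x")
print(f"approx. forward FLOPs per token: {approx_forward_flops_per_token(ACTIVE_PARAMS):.2e}")
```

Under that rule of thumb, per-token compute scales with the 760 million active parameters rather than the full 8.4 billion, which is the sense in which a sparse model is cheaper to run than a dense model of the same total size.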

Released on May 6, 2026, under the Apache 2.0 license, ZAYA1-8B was trained on 1,024 AMD Instinct MI300X GPUs in collaboration with IBM, making it the first competitive reasoning model to demonstrate full-stack AMD viability. Its three core innovations (Compressed Convolutional Attention, MLP-based expert routing, and Learned Residual Scaling) let it match or exceed models 10 to 30 times larger on the AIME and HMMT math benchmarks.
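To make "MLP-based expert routing" concrete, here is a minimal, generic sketch of a mixture-of-experts layer whose router is a small two-layer MLP instead of a single linear projection. It is not Zyphra's implementation: the layer sizes, expert count, `top_k` value, and every name in the code (`MLPRouter`, `moe_forward`, and so on) are hypothetical, and each expert is reduced to a single linear map for brevity.

```python
import math
import random


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random weights; stands in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class MLPRouter:
    """Hypothetical router: hidden layer with ReLU, then one score per expert."""

    def __init__(self, d_model, d_hidden, n_experts, rng):
        self.w1 = rand_matrix(d_hidden, d_model, rng)
        self.w2 = rand_matrix(n_experts, d_hidden, rng)

    def __call__(self, x):
        hidden = [max(0.0, h) for h in matvec(self.w1, x)]
        return softmax(matvec(self.w2, hidden))


def moe_forward(x, router, experts, top_k):
    """Run only the top_k experts chosen by the router and mix their outputs."""
    probs = router(x)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)  # renormalise over the chosen experts
    out = [0.0] * len(x)
    for i in chosen:
        weight = probs[i] / norm
        expert_out = matvec(experts[i], x)  # each expert is one linear map here
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out, chosen


rng = random.Random(0)
D_MODEL, D_HIDDEN, N_EXPERTS, TOP_K = 8, 16, 4, 1  # illustrative sizes only
router = MLPRouter(D_MODEL, D_HIDDEN, N_EXPERTS, rng)
experts = [rand_matrix(D_MODEL, D_MODEL, rng) for _ in range(N_EXPERTS)]
token = [rng.gauss(0.0, 1.0) for _ in range(D_MODEL)]

output, used = moe_forward(token, router, experts, TOP_K)
print(f"experts used: {used} of {N_EXPERTS}")
```

Only the selected experts' weights are touched for a given token, which is how a model can hold far more parameters than it activates on any one forward pass.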

## Analogy

Think of it as a Formula 1 engine tuned for lap records rather than highway cruising: fewer cylinders firing, maximum output per combustion.

## Why it's emerging now

Zyphra released ZAYA1-8B on May 6, 2026. It pairs Zyphra's own architecture and inference techniques (Compressed Convolutional Attention, Markovian RSA inference) with training done entirely on AMD MI300X GPUs. The resulting model matches or exceeds DeepSeek-R1 on AIME math benchmarks while using under 1 billion active parameters, which sets a new efficiency frontier for open reasoning models.

## Related terms

- *competitor:* DeepSeek-V4
- *competitor:* Qwen3
- *related:* MLX
- *related:* GRPO
- *parent:* Mixture of Experts
- *child:* Markovian RSA
- *child:* Compressed Convolutional Attention
- *related:* Zyphra
- *related:* intelligence density
- *related:* AMD Instinct MI300X
- *parent:* open reasoning model
- *competitor:* DeepSeek-R1

## Sources

1. [Zyphra — ZAYA1-8B official announcement](https://www.zyphra.com/post/zaya1-8b)
2. [Hugging Face — ZAYA1-8B model card](https://huggingface.co/Zyphra/ZAYA1-8B)
3. [PR Newswire — Zyphra releases ZAYA1-8B](https://www.prnewswire.com/news-releases/zyphra-releases-zaya1-8b-a-reasoning-model-trained-on-amd-and-optimized-for-maximum-intelligence-density-per-parameter-302764700.html)
4. [VentureBeat — ZAYA1-8B: super efficient open reasoning model](https://venturebeat.com/technology/meet-zaya1-8b-a-super-efficient-open-reasoning-model-trained-on-amd-instinct-mi300-gpus)
5. [MarkTechPost — Zyphra ZAYA1-8B MoE analysis](https://www.marktechpost.com/2026/05/06/zyphra-releases-zaya1-8b-a-reasoning-moe-trained-on-amd-hardware-that-punches-far-above-its-weight-class/)
6. [Hacker News — ZAYA1-8B community discussion](https://news.ycombinator.com/item?id=48047082)
7. [IBM Newsroom — IBM and AMD collaborate with Zyphra on AI infrastructure](https://newsroom.ibm.com/2025-10-01-ibm-and-amd-collaborate-with-zyphra-on-next-generation-ai-infrastructure)

---
_Generated by EarlyTerms · https://earlyterms.com/term/zaya1-8b_
