Inception’s breakthrough diffusion-based approach to language generation enables the world’s fastest, most efficient AI models with best-in-class quality.
The diffusion difference: from sequential to parallel
Traditional autoregressive LLMs generate text one token at a time. Mercury diffusion LLMs (dLLMs) generate tokens in parallel, increasing speed and maximizing GPU efficiency.
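The contrast is easiest to see in a toy sketch. The code below is a conceptual illustration only, not Mercury's actual algorithm: the stand-in `fake_model` just produces random guesses with confidences, but it shows how an autoregressive decoder needs one forward pass per token while a diffusion-style decoder predicts every masked position on each pass and commits the most confident guesses, so the number of passes does not grow with sequence length.

```python
# Toy contrast between sequential and parallel decoding.
# Not Mercury's implementation; fake_model is a random stand-in for a real network.
import random

VOCAB = ["the", "quick", "brown", "fox", "jumps", "over", "a", "lazy", "dog"]
LENGTH = 8

def fake_model(tokens):
    """Stand-in for one forward pass: a (guess, confidence) pair per position."""
    return [(random.choice(VOCAB), random.random()) for _ in tokens]

def autoregressive_decode():
    """One forward pass per token: LENGTH sequential steps."""
    tokens = []
    for _ in range(LENGTH):
        guess, _ = fake_model(tokens + ["<mask>"])[-1]
        tokens.append(guess)
    return tokens, LENGTH  # step count equals the number of tokens

def diffusion_style_decode(rounds=3):
    """Each round predicts *all* masked positions at once, then commits the
    most confident guesses and leaves the rest masked for the next round."""
    tokens = ["<mask>"] * LENGTH
    for step in range(1, rounds + 1):
        preds = fake_model(tokens)
        masked = [i for i in range(LENGTH) if tokens[i] == "<mask>"]
        quota = max(1, len(masked) // (rounds - step + 1))
        for i in sorted(masked, key=lambda j: -preds[j][1])[:quota]:
            tokens[i] = preds[i][0]
    return tokens, rounds  # step count is independent of sequence length

print(autoregressive_decode())
print(diffusion_style_decode())
```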
Blazing-fast performance you can notice
Build the future of AI apps with Mercury

Lightning fast agents
Automate complex coding and other business workflows with ultra-responsive AI.

Real-time voice
Engage naturally with AI in voice-powered workflows like customer support, translation, and immersive gaming.

Instant code editing
Stay in the flow with responsive autocomplete, intelligent tab suggestions, and fast chat responses.

Fast, creative co-pilots
Supercharge editorial and creative work—less waiting, more creating.

Rapid search
Instantly surface the right data from across your organization’s knowledge base.
Foundation models
Meet our family of diffusion models
Research
Led by visionary AI researchers
Our founders pioneered diffusion modeling and invented cornerstone AI technologies.

Loved by leaders and innovators
Integrate in seconds
Our models are OpenAI API compatible and a drop-in replacement for traditional LLMs.
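For example, if you already use the OpenAI Python SDK, switching can be as simple as pointing the client at a Mercury endpoint. This is a minimal sketch; the base URL, model name, and API key below are placeholders, so use the values from your Inception or cloud-provider account.

```python
# Drop-in swap using the OpenAI Python SDK (pip install openai).
# base_url and model are placeholders; substitute your actual endpoint and model name.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inception.ai/v1",  # placeholder endpoint
    api_key="YOUR_INCEPTION_API_KEY",        # placeholder key
)

response = client.chat.completions.create(
    model="mercury",  # placeholder model name
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
)
print(response.choices[0].message.content)
```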
Enterprise AI partner
We’re available through major cloud platforms, including Amazon Bedrock and Azure AI Foundry. Talk with us about fine-tuning and private deployments.
Reliability at scale
Get 99.5%+ uptime and priority support with custom SLAs.



