I compared two fast AI inference providers: here are the results
4/17/2025

I mean, sure, benchmarks are cool and all. But I care more about what it feels like to actually use them.
What are these?
Groq runs Meta’s Llama models on custom chips called LPUs — Language Processing Units. Think of them as ultra-fast processors built specifically for running large language models efficiently.
Cerebras takes a different route. It uses its own massive chip, the Wafer-Scale Engine (WSE), which is optimized for AI workloads. Instead of processing tasks in a traditional CPU/GPU setup, Cerebras processes the entire model in one place, massively reducing memory bottlenecks.
Both are built for speed, but they get there differently:

- Groq focuses on ultra-low latency and high throughput, measured in tokens per second (T/s).
- Cerebras aims for end-to-end efficiency by running models like Llama 3 70B on a single chip, resulting in huge speed advantages.
In a recent test I ran:

- Groq: ~275 tokens/sec
- Cerebras: ~2185 tokens/sec

Same model. Same parameters. Same prompt. That's a big leap: Cerebras is seriously fast.
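For transparency, here's a minimal sketch of how the throughput figure is computed. The token count and elapsed times below are illustrative stand-ins chosen to reproduce the rates I measured, not my raw logs:

```python
def tokens_per_second(token_count: int, elapsed_s: float) -> float:
    """Throughput metric: tokens generated divided by wall-clock seconds."""
    return token_count / elapsed_s

# Illustrative numbers: a 550-token completion, with elapsed times
# picked to match the measured rates (not actual timings).
groq_tps = tokens_per_second(550, 2.0)        # ~275 T/s
cerebras_tps = tokens_per_second(550, 0.2517)  # ~2185 T/s

print(f"Cerebras speedup: {cerebras_tps / groq_tps:.1f}x")
# prints "Cerebras speedup: 7.9x"
```

In practice you'd take the timestamps from a streamed response, so time-to-first-token and generation time can be separated.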
About the two companies
Groq
- Founded: 2016 by Jonathan Ross and Douglas Wightman, both former Google engineers.
- Headquarters: Mountain View, California.
- Core Technology: Develops custom AI inference chips known as LPUs (Language Processing Units), optimized for high-speed execution of large language models.
- Funding:
  - Total Raised: Over $1 billion across five funding rounds.
  - Latest Round: August 2024, secured $640 million in a Series D round led by BlackRock, Cisco Investments, and Samsung Catalyst Fund, achieving a valuation of $2.8 billion.
  - Additional Commitment: In February 2025, received a $1.5 billion commitment from Saudi Arabia to expand AI chip deployment in the region.
- Recent Developments:
  - Launched GroqCloud, a developer platform providing API access to their LPUs.
  - Acquired Definitive Intelligence in March 2024 to enhance cloud-based AI solutions.
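Since GroqCloud exposes the LPUs through an API, here's a minimal sketch of what a chat-completions request body looks like. The endpoint URL and model name are assumptions for illustration, based on GroqCloud advertising an OpenAI-compatible interface:

```python
import json

# Assumed OpenAI-compatible GroqCloud endpoint and model name;
# check the GroqCloud docs for the current values.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_request(prompt: str, model: str = "llama-3.3-70b-versatile") -> dict:
    """Build the JSON body for a streaming chat-completion request.

    Streaming lets you timestamp each token and compute T/s yourself.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }

body = build_request("Explain LPUs in one sentence.")
print(json.dumps(body, indent=2))
```

The same body shape works against any OpenAI-compatible provider, which is what makes side-by-side speed tests like the one above easy to run.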
 
Cerebras Systems
- Founded: 2015 by Andrew Feldman, Gary Lauterbach, Michael James, Sean Lie, and Jean-Philippe Fricker.
- Headquarters: Sunnyvale, California.
- Core Technology: Creator of the Wafer-Scale Engine (WSE), the largest computer chip, designed to accelerate AI workloads by processing entire models on a single chip.
- Funding:
  - Total Raised: Approximately $720 million over six funding rounds.
  - Latest Round: November 2021, raised $250 million in a Series F round led by Alpha Wave Global, valuing the company at over $4 billion.
- IPO Status:
  - Filed for an IPO in September 2024, aiming to raise up to $1 billion.
  - The IPO has been delayed due to a national security review by the Committee on Foreign Investment in the United States (CFIUS) concerning a $335 million investment from Abu Dhabi-based G42.
- Notable Deployments:
  - Collaborated with Mayo Clinic to develop genomic foundation models for personalized healthcare.
  - Partnered with G42 to build AI supercomputers and train large language models, including the Arabic language model Jais.