Growth Story · No. 05

ElevenLabs / ElevenLabs Inc.

From a stealth-mode TTS bet to an $11B voice-AI platform in under four years

ElevenLabs spent nine months in stealth before shipping a beta in January 2023. The product went viral in days — first as a creator phenomenon, then as a misuse scandal, then as the default voice infrastructure for media, publishers, and voice agents. Every funding round bundled a model release. Every model release reset the ceiling on what voice AI was assumed to do.

12 min read · Founded 2022-04 · 22 events tracked · 9 deep dives
01 · Timeline

ARR, valuation, and every GTM move, on one timeline.

Events split into four horizontal bands by type. Markers with a halo correspond to deep-dive sections below.

[Timeline chart — Product, Funding, and Media bands across four eras: Polish stealth → creator-first virality → platform build-out → voice-agent hypergrowth. ARR: $25M → $90M → $120M → $200M → $330M. Valuation: $12M → $100M → $1.1B → $3.3B → $6.6B → $11B (2023-2026). Marked events: $2M pre-seed + public beta, 4chan misuse scandal, Eleven Multilingual v2, Series B $80M / $1.1B, Biden deepfake robocall, Iconic Voices, Conversational AI v1 ships, Series C $180M / $3.3B, Eleven v3 (alpha) launches, $100M employee tender, Series D $500M / $11B.]
02 · Platform Mix

Which channels mattered when.

ElevenLabs used six platforms differently. Some carried the entire arc; some were episodic catalysts; one ran almost entirely on creators rather than the company.

X (Twitter)
All stages — load-bearing

Founder + product launch channel

Mati Staniszewski (@matiii) and the @elevenlabsio handle drive every major launch. Audio demos travel exceptionally well on X — voice-clip tweets autoplay and get re-shared with the original audio attached, which is rare. Each model release lands as a clip thread first.

⚡ Catalyst moment

Eleven v3 alpha launch tweet (June 5, 2025) — audio-tag demo clips shared by founders, Karpathy, and the AI-builder crowd. Clip-native delivery is what made v3 feel like a 'new category' on day one.

View tweet
✓ Works when

When the product output is itself a shareable artifact (audio clip, voice demo). The platform autoplays audio inline — make every launch a clip, not a screenshot

✗ Don't expect

If the team posts press-release prose. Voice AI Twitter expects new audio in the post, not links to a blog

YouTube
Pre-inflection + Hypergrowth

Demo amplification + investor narrative

Two layers: founder long-form podcasts (Sequoia Training Data, a16z Show, Nothing Left Unsaid) and creator tutorials. Mati Staniszewski's investor-podcast circuit through 2025 reset perception from 'TTS startup' to 'voice infrastructure company.' Creator tutorials drive self-serve sign-ups continuously in the background.

⚡ Catalyst moment

Mati Staniszewski on Sequoia's Training Data (mid-2025) — the long-form artifact a16z, ICONIQ, and later Sequoia could circulate inside their own LP and exec networks before each subsequent funding round.

Watch episode
✓ Works when

When the founder can carry a 60-90 minute investor-grade conversation, AND the product has new demos in every episode. The double layer — founder + creator — compounds

✗ Don't expect

One-off keynote talks with no follow-up. The pattern only works as a continuous interview cadence, not single appearances

Hacker News
Latent + Platform build-out

Technical credibility validator

Multilingual v2, Conversational AI v1, Eleven v3, and Reader all landed on HN front page with hundreds of comments. The HN signal mattered most for the conversational-AI launch in November 2024 — it proved the platform pivot was technically credible to the developers ElevenLabs needed to build agents on top.

⚡ Catalyst moment

Conversational AI v1 launch threads (Nov 2024) — front-page placement with serious technical scrutiny. Two months later: Series C at $3.3B.

Read on HN
✓ Works when

When the launch has measurable technical novelty — a new model, a new architecture, real benchmarks. HN voters reward demos that reveal capability

✗ Don't expect

Don't post 'we hit $X ARR' or pricing changes. HN punishes business-update framing

Reddit
Creator-first virality + ongoing

Use-case discovery + retention layer

r/ElevenLabs and adjacent communities (r/AIVoiceCloning, r/audiobooks, r/IndieDev) became the place where creators trade voice IDs, prompt tricks, and use-case templates. Less acquisition, more activation — it's where new users learn how to get good output in their first hour.

⚡ Catalyst moment

No single moment. The community formed organically around the free tier in 2023 and matured as a self-serve support hub through 2024-2025.

Open r/ElevenLabs
✓ Works when

When you have a free tier generous enough that users want to share what they made. The community emerges from the product, not from outreach

✗ Don't expect

If you try to seed posts. The community detects vendor accounts within days

LinkedIn
Voice-agent hypergrowth (enterprise)

Enterprise + investor signal

Mati Staniszewski's LinkedIn has carried real weight since mid-2025. ARR milestones, funding rounds, and customer wins announced as personal posts get re-shared inside Deutsche Telekom-style enterprise procurement networks. Higher score than typical for a dev-tools company because the buyer for voice agents is an enterprise contact-center exec, not a developer.

⚡ Catalyst moment

Mati Staniszewski's $200M ARR LinkedIn post (Aug 2025) — disclosed numbers, customer logos, and explicit 'building toward IPO' framing all in one post. Set the tone for the September tender and February Series D.

View source
✓ Works when

When your buyer is enterprise and your category is 'voice infrastructure / contact center.' Every CIO and head of CX is on LinkedIn — not on X

✗ Don't expect

For purely creator-targeted launches. LinkedIn audiences won't engage with a v3 audio-tag demo the way X will

Instagram
Creator-first virality

Consumer-creator amplification

Unusually high score for a B2B AI company — and it's earned. Voice clones (Darth Vader, Iconic Voices, celebrity parodies) routinely cross from TikTok to Instagram Reels without ElevenLabs lifting a finger. Reader-app demos with Judy Garland and James Dean read native to the format. The company doesn't post heavily; the creators do.

⚡ Catalyst moment

Iconic Voices launch coverage (July 2024) — Garland reading 'Wizard of Oz' clips circulated as Instagram Reels through CBS, CNN, and Variety social handles. Mainstream-press distribution, not paid.

View source
✓ Works when

When the product output has visual + audio appeal (voice clips paired with celebrity faces, video demos). Instagram is downstream of TikTok creator content, not a primary channel

✗ Don't expect

For developer features, API launches, conversational-AI configuration. Skip Instagram for those entirely

03 · Synthesis

The full thesis.

The big-picture read on what actually drove the curve — before zooming in on each key moment.

ElevenLabs did not have a slow burn.

The product went from public launch to one million users in five months, from one million users to a $1.1B unicorn in seven more, and from there to $330M ARR and an $11B Series D in 24 more. The whole arc fits inside 46 months. What looks like luck is actually a six-move pattern that the founders ran four times in a row.

The dubbing thesis was the moat

Mati Staniszewski (ex-Palantir) and Piotr Dabkowski (ex-Google) grew up in Poland watching badly dubbed American films. The original product idea, sketched in 2020, was fixing that — voice that crosses languages with the speaker's emotion intact.

That insight pre-committed ElevenLabs to two architectural choices most competitors did not make:

  • Multilingual from day one. The first beta in January 2023 shipped with English and Polish. Eleven Multilingual v2 (August 2023) covered nearly 30 languages with the original speaker's accent preserved. By v3 (June 2025), the count was 70+.
  • Emotion as a first-class output. Not "TTS that sounds OK" but "voice that conveys feeling across language barriers." The audio-tag syntax in v3 — [excited], [whispers], [laughing] — is the natural endpoint of that thesis, five years from the original sketch.
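
As a concrete illustration, the tag syntax embeds delivery cues inline in the input text. A toy extractor — the tag names come from the v3 launch materials, but the parsing code below is purely illustrative and not part of any ElevenLabs SDK:

```python
import re

# v3-style input with inline audio tags; the tag names are from the launch
# materials, the parser itself is an illustrative sketch only
prompt = "[whispers] It started as a dubbing fix. [excited] Now it speaks 70-plus languages!"

def extract_tags(text: str) -> list[str]:
    """Return the delivery tags embedded in a prompt, in order of appearance."""
    return re.findall(r"\[(\w+)\]", text)

print(extract_tags(prompt))  # ['whispers', 'excited']
```

The point of the syntax is that emotion rides inside the same text channel as the words, so a multilingual prompt carries its delivery instructions with it.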

Competitors with a generic TTS framing (Resemble, Murf, WellSaid Labs) optimized for narration quality. ElevenLabs optimized for emotional cross-lingual transfer. The framing constrained the product roadmap in a way that paid off every release.

Nine months in stealth, then the scandals

ElevenLabs incorporated in April 2022 and did not ship until January 23, 2023. That's nine months of model training and infrastructure work before any user touched the product.

The launch arc that followed compressed into days:

Date | Event
Jan 23, 2023 | Public beta + $2M pre-seed announced
Jan 30, 2023 | 4chan abuses voice cloning (Emma Watson, Joe Rogan, Ben Shapiro)
Jan 31, 2023 | ElevenLabs ships paid-only voice cloning + AI detection tool
Jun 2023 | 1M registered users — five months from launch

The 4chan incident is the first thing that should have killed the company. Instead, ElevenLabs absorbed it as a forced-trust posture: voice cloning behind paid ID verification, classifier for AI-generated audio, account-level traceability — all shipped within days.

It did the same thing 12 months later. On January 26, 2024, Pindrop traced a fake Biden New Hampshire-primary robocall to ElevenLabs. The account was suspended within 72 hours, the company gave a clear public statement, and the Biden episode became a case-study citation in every "responsible AI" panel for the rest of the year.

Most companies hide misuse. ElevenLabs treated each incident as a chance to publicly demonstrate that the platform had auditable controls. The trust narrative ended up doing real GTM work — Deutsche Telekom and large enterprise contracts in 2025 cited operational discipline as a reason to commit.

Every funding round was a product bundle

Look at the cadence:

Round | Date | Bundled launch
Pre-seed $2M | Jan 23, 2023 | Public beta
Series A $19M @ $100M | Jun 21, 2023 | New voice products
Series B $80M @ $1.1B | Jan 22, 2024 | Voice Marketplace + Dubbing Studio + Mobile SDK
Series C $180M @ $3.3B | Jan 30, 2025 | Conversational AI v1 had shipped 10 weeks earlier
Tender $100M @ $6.6B | Sep 8, 2025 | Bundled with $200M ARR disclosure
Series D $500M @ $11B | Feb 4, 2026 | Bundled with $330M ARR + IPO talk

Six rounds. Six bundled milestones. Every announcement window doubled as a product window.

The underlying logic is straightforward: a solo "$X funding" announcement gets you 3–5 days of capital-press coverage. A "$X funding + $Y ARR + new product" bundle gets you the same window across capital press, dev press, telco trade press, and SaaS press — for the same announcement budget.

The platform pivot that mattered

Most TTS companies stopped at API-as-a-product. ElevenLabs made a deliberate move to platform tier in November 2024.

Conversational AI v1 (November 18, 2024) integrated TTS + STT + LLM orchestration into a single agent stack. Conversational AI 2.0 (June 3, 2025) added native turn-taking, language detection, multi-character mode, and batch outbound calling.
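
That integration can be sketched as a single hear-reason-respond loop. A minimal sketch with stubbed stages — every function name here is a hypothetical stand-in, not the ElevenLabs agent API:

```python
def transcribe(audio_chunk: bytes) -> str:
    # STT stage (stubbed: a real agent runs speech-to-text here)
    return "What's my order status?"

def think(transcript: str, history: list[str]) -> str:
    # LLM stage (stubbed: a real agent calls a language model with context)
    history.append(transcript)
    return f"You asked: {transcript} Let me check."

def speak(reply: str) -> bytes:
    # TTS stage (stubbed: a real agent synthesizes audio)
    return reply.encode("utf-8")

def agent_turn(audio_in: bytes, history: list[str]) -> bytes:
    """One conversational turn: hear -> reason -> respond."""
    return speak(think(transcribe(audio_in), history))

print(agent_turn(b"<caller audio>", []).decode("utf-8"))
# You asked: What's my order status? Let me check.
```

The 2.0 features described above — turn-taking, language detection, multi-character mode — all sit around this loop, deciding when a turn starts and which voice and language the stages run in.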

The competitive geometry changed. In November 2024, ElevenLabs was selling against other TTS APIs (Cartesia, PlayHT, Resemble). By mid-2025 it was selling against Vapi, Retell, and the contact-center incumbents (NICE, Genesys, Five9) — a much larger market with much larger contract sizes.

The strategic-investor list on the Series C tells the same story: new strategic checks from Deutsche Telekom, NTT DOCOMO Ventures, RingCentral Ventures, HubSpot Ventures, and LG Technology Ventures (Salesforce Ventures, in the round, was a returning investor from earlier rounds). Telco, CRM, and consumer electronics — not creator tools. The pivot up the stack was the precondition for those checks.

Creator distribution did the brand work for free

ElevenLabs' visible marketing budget through 2024 was small. The acquisition machine was creator-first:

  • Voice clones travel native on social. A Darth Vader clip on TikTok. A Judy Garland reading on Instagram. The product output is itself the share unit. Most B2B tools envy this.
  • The free tier is the marketing. A generous free quota means creators experiment, and the experiments turn into Reels, TikToks, YouTube shorts. ElevenLabs gets the brand impression for free.
  • Voice Marketplace as a flywheel. Creators upload custom voices, other users discover and use them, the original creator earns. Three-way alignment that gives ElevenLabs viral content as a byproduct.
  • Iconic Voices as PR primer. Garland / Dean / Reynolds / Olivier (July 2024) put ElevenLabs in CNN, CBS, and Variety — outlets dev-tool companies almost never reach. The estate-licensing angle was the news hook.

When Mati Staniszewski went on Sequoia's Training Data, a16z Show, and Lenny's adjacent podcasts through 2024-2025, the founder-as-IP pattern translated directly to investor-narrative work. Different audience from the TikTok creators, same compounding mechanism.

The pattern, distilled

Six moves ElevenLabs ran. Each one is reusable in any AI infrastructure category.

  1. Lock the thesis to a multilingual, emotion-preserving frame from day one. The framing constrained the roadmap (v2 → v3 → audio tags) in a way that made each release feel like progress on the same promise.
  2. Bundle every funding round with at least one product launch. Same press budget, 3-4× the coverage surface. Six rounds in a row, never broken.
  3. Treat misuse as a forced-trust audit, not a PR crisis. Two scandals in 12 months. Both were absorbed as proof of operational discipline. Telco and enterprise procurement reads the response, not the incident.
  4. Move up the stack before competitors do. TTS API → conversational platform was a 10-week jump (Conv-AI v1 in Nov 2024, Series C in Jan 2025 with telco strategics on board). Competitors who stayed at API tier are now selling into a smaller market.
  5. Make the product output the share unit. Voice clips are autoplay-native on X, Instagram, TikTok. Free tier turns creators into a brand-extension layer the company doesn't pay for.
  6. Run the founder-as-IP loop for investors specifically. Long-form podcasts (Training Data, a16z Show, Nothing Left Unsaid) timed to between funding rounds. Each new round happens after a podcast circuit, not during it.

What's not in the public record

Things outside reporters can't see — the ones that probably matter most:

  • The actual cost of model training in 2022-2023. Stealth mode is expensive. The pre-seed amount ($2M) is too small to fund a year of GPU work — the founders likely bootstrapped and burned personal capital. Specific numbers are private.
  • Real free-to-paid conversion rates. ElevenLabs has been generous with topline ARR but never disclosed conversion economics. The 1M-users-by-June-2023 figure could mean 5% paid or 0.5% paid — the gap matters.
  • The exact mechanics of enterprise sales motion. Deutsche Telekom and Revolut are named publicly. The contract sizes, sales cycle length, and PoC-to-deal conversion are not.
  • The competitive cost structure vs Cartesia, PlayHT, Vapi, Retell. Voice AI is one of the most crowded AI infrastructure categories. ElevenLabs' margin per million characters generated, vs competitors, is the question that determines whether the IPO narrative survives 2026.

These are the questions Sacra deep-dives, The Information enterprise reporting, and S-1 disclosures will eventually answer. Public traces alone get this story to about 70%. The last 30% is locked behind paywalls and S-1 due diligence.

04 / 01 · 2023-01-23
Funding · Bundled milestone

$2M Pre-Seed and a Public Beta — The Bundled Launch That Pulled in 1M Users in Five Months (Jan 2023)

ElevenLabs spent nine months in stealth, then announced funding and shipped a public beta in the same week. The free tier and clip-ready output did the rest.

Original source ↗

January 23, 2023. ElevenLabs announces a $2 million pre-seed led by Credo Ventures with Concept Ventures, and on the same day opens its text-to-speech beta to the public. English and Polish voices, free tier, no waitlist.

By June 2023 — five months later — the platform crosses one million registered users.

The stealth bet that preceded the launch

Mati Staniszewski (ex-Palantir) and Piotr Dabkowski (ex-Google) incorporated ElevenLabs in April 2022. From April 2022 to January 2023, the company was effectively dark: no public site beyond a landing page, no demos, no press, no product.

Nine months is a long stealth window for a $2M pre-seed company. The founders used it to train a TTS model that was meaningfully better than what was publicly available — Microsoft Azure, Google Cloud TTS, and Amazon Polly were the alternatives at the time, and they sounded robotic by comparison.

The discipline of staying dark is the unsung part. Most pre-seed founders ship a leaky beta to a small group on day 60 because they want feedback. ElevenLabs waited until the model was better than the incumbents.

The bundle: funding + beta + free tier

Three things hit at once on January 23:

  • Funding announcement. Pre-seed $2M, lead and participants disclosed.
  • Public beta open. Anyone could sign up that day.
  • Generous free tier. 10,000 characters per month free, paid plans starting at $5/month.

The free tier was the load-bearing piece. A creator could generate 30-60 seconds of audio without paying — enough to make a TikTok, a YouTube intro, or a Twitter clip. The first audio they made was almost certainly shareable.
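
A rough back-of-envelope on how far the monthly quota stretched — assuming generic speech rates of ~150 spoken words per minute and ~6 characters per word, which are general estimates rather than ElevenLabs-published figures:

```python
# Back-of-envelope: monthly free-tier characters -> minutes of speech.
# 150 wpm and 6 chars/word are generic speech-rate estimates, not
# ElevenLabs specs.
FREE_CHARS_PER_MONTH = 10_000
CHARS_PER_WORD = 6
WORDS_PER_MINUTE = 150

def quota_minutes(chars: int) -> float:
    """Convert a character quota into approximate minutes of generated speech."""
    return chars / CHARS_PER_WORD / WORDS_PER_MINUTE

print(round(quota_minutes(FREE_CHARS_PER_MONTH), 1))  # ~11.1 minutes/month
```

At those rates a single 30-60 second clip burns only a few hundred characters, so one free month funded many shareable clips — exactly the behavior the growth numbers suggest.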

What competitors offered | What ElevenLabs offered
API-only, paid tier minimum ~$50/mo | Free 10,000 characters + $5 minimum
Robotic narrator voices | Emotional, human-like output
English-only or thin multilingual | English + Polish + path to 28 more

The pricing wasn't about undercutting competitors on cost. It was a deliberate choice to put the product in front of the long tail of creators — TikTokers, podcasters, indie game devs, YouTubers — who would amplify it.

What "1M users in 5 months" actually looked like

The growth wasn't paid. It wasn't a Product Hunt launch (ElevenLabs ranked but didn't dominate). The mechanics were:

  • Twitter / X audio clips. Users posted "listen to this" tweets with ElevenLabs-generated voices. The clips autoplayed inline. Each share carried the brand.
  • Hacker News submissions. Beta launch hit HN front page; technically curious devs signed up to try.
  • TikTok creator usage. Voice-over for narration content, especially Reddit-story TikToks, took off through February-April.
  • Reddit threads. r/MachineLearning, r/AIVoiceCloning, r/sidehustles all surfaced ElevenLabs as the new tool that worked.

The growth curve is the signature of a product that fits a frame perfectly — every casual user becomes a distribution node because the output is itself the share unit.

The 4chan incident, one week later

The same launch that pulled in users pulled in abusers. By January 30, 2023, 4chan users had cloned Emma Watson, Joe Rogan, and Ben Shapiro to generate offensive content.

ElevenLabs responded the next business day with paid-only voice cloning, an AI detection tool, and traceability per generation. The story is in the next deep-dive — but it's worth noting here that the misuse was the byproduct of the same generosity that drove the user growth.

The free tier got ElevenLabs to 1M users. Voice cloning behind ID verification kept the company from being shut down for it.

The compounding effect on Series A

By June 2023, ElevenLabs had ~1M users on the platform. That metric — verifiable, auditable — closed the Series A on terms ($19M at ~$100M post) that would have been impossible from a cold start.

Date | Round | Valuation | Trigger
Jan 23, 2023 | Pre-seed $2M | ~$12M | Beta launch
Jun 21, 2023 | Series A $19M | ~$100M | 1M users + voice products
Jan 22, 2024 | Series B $80M | $1.1B | Multilingual v2 + Dubbing Studio

a16z, Nat Friedman, and Daniel Gross co-led the Series A. Mike Krieger (Instagram), Brendan Iribe (Oculus), Mustafa Suleyman (DeepMind), and Tim O'Reilly came in as angels — the kind of investor list a five-month-old company doesn't normally attract.

The user-count milestone made the round possible. The bundled launch in January made the user count possible.

Sources

04 / 02 · 2023-01-30
Media · Forced trust posture

The 4chan Voice-Cloning Scandal That Nearly Killed the Launch — And the 24-Hour Response That Saved It (Jan 2023)

Seven days after public beta, 4chan users cloned celebrities to generate abuse. ElevenLabs shipped paid-only cloning, an AI detector, and traceability the next business day. The crisis became the trust posture.

Original source ↗

January 30, 2023 — seven days after the public beta opened. Vice reports that 4chan users have used ElevenLabs to clone Emma Watson reading from "Mein Kampf," Joe Rogan and Ben Shapiro making racist comments, and David Attenborough delivering threats.

The story hits Slashdot, Futurism, OECD AI's incident database, and within 48 hours has been picked up by every major tech outlet covering AI risk.

By the next business day, ElevenLabs has shipped concrete changes. The crisis becomes the operating template.

What 4chan actually did

The 4chan abuse used the free tier's voice-cloning feature. With a 60-second audio sample of a target's voice, the platform could generate new audio in that voice saying anything.

Within one week of the public beta:

Target | Content
Emma Watson | Reading "Mein Kampf"
Joe Rogan | Racist remarks about AOC
Ben Shapiro | Hateful content about minorities
David Attenborough | Violent threats
Hillary Clinton | Transphobic content

The 4chan thread turned into a manual on how to use the product for harassment. By the time Vice published, the screenshots were everywhere on Twitter.

For most pre-seed startups, this is a company-ending event. Investors pull. Press goes negative. The product gets associated with the abuse forever.

The 24-hour response

ElevenLabs responded the next business day — January 31. The company published a statement acknowledging "an increasing number of voice cloning misuse cases" and shipped a set of immediate changes plus a roadmap of follow-ups:

Shipped within ~24 hours:

  1. Voice cloning behind paid tier. No free voice cloning. Required payment information that creates an audit trail.
  2. Per-generation traceability. Each piece of generated audio could be traced back to the specific account that produced it.
  3. Manual verification path. Voice cloning of public figures required additional verification.

Shipped over the following months:

  4. AI Speech Classifier. A free public tool that takes any audio clip and tells you whether it was generated by ElevenLabs. Released publicly in June 2023 alongside the Series A — five months after the initial 4chan response, but core to the long-term trust posture.

The immediate response was concrete. Not "we are taking this seriously." Specific safeguards plus a transparent roadmap.

The speed mattered more than the substance. Within a week of the abuse going viral, the company had a public technical answer. Most AI vendors needed months to respond to similar incidents in 2024-2025. ElevenLabs set the bar in week two of its existence.

Why the response worked

The 4chan incident could have been catastrophic. It became a case study for three reasons.

1. The response was technical, not legal. The classifier was a real working tool, not a terms-of-service update. Reporters could test it; it worked. That's a different kind of credibility than a press release.

2. The traceability claim was verifiable. ElevenLabs could (and did) trace specific abusive content back to specific accounts and ban them. The audit trail wasn't theoretical.

3. The company didn't deny the upside. The CEO did not claim voice cloning would be safe. He acknowledged that misuse was inherent to the technology and that the platform needed continuous safeguards. That framing — "yes this is dangerous, here's how we manage it" — held up across the next three years of incidents.

The pattern that recurred 12 months later

The 4chan response became the template ElevenLabs ran every time misuse hit. The most consequential rerun was January 26, 2024 — Pindrop traced a fake Biden New Hampshire-primary robocall to ElevenLabs. The company suspended the account within 72 hours, gave a clear public statement, and the Biden episode became the most-cited "responsible AI vendor" example for the rest of the year.

Incident | Days to public response | Concrete action
4chan celebrities (Jan 2023) | 1 business day | Paid-only cloning + traceability (classifier shipped 5 months later, June 2023)
Biden robocall (Jan 2024) | 3 days | Account ban + public statement + classifier reference

Same pattern, same speed, twice — and the response template was already built when the second incident hit. By late 2024, when Deutsche Telekom and other enterprise procurement teams ran due diligence on voice AI vendors, ElevenLabs' incident-response track record was a positive rather than a negative.

The hidden GTM payoff

There's a counter-intuitive truth in this incident. ElevenLabs did not lose customers from the 4chan story. The user count grew from sub-100K in late January to 1M by June 2023.

What happened instead: the misuse coverage advertised the product's capability. "ElevenLabs can clone any voice from a 60-second sample" was simultaneously the abuse vector and the most compelling demo of what the technology could do. Users who needed legitimate voice cloning — audiobook narrators, accessibility tools, voice-over artists — saw the same headlines.

The forced-trust posture meant ElevenLabs could absorb that attention without becoming "the deepfake company." Cartesia, Resemble, and PlayHT got similar capability headlines through 2023-2024 but without the same operational track record. The incident-response gap turned into a trust gap.

Sources

04 / 03 · 2024-01-22
Funding · Bundled milestone

Series B $80M at $1.1B — The 21-Month Unicorn That Bundled Three Product Launches Into One Press Window (Jan 2024)

ElevenLabs' Series B announcement carried Voice Marketplace, Dubbing Studio, and Mobile SDK in the same press release. Same announcement budget, four times the coverage.

Original source ↗

January 22, 2024. ElevenLabs announces an $80 million Series B led by Andreessen Horowitz, with Sequoia Capital, Nat Friedman, and Daniel Gross. Valuation: $1.1 billion. Twenty-one months from incorporation.

It's the fastest European AI company to reach unicorn status at that point — and the announcement isn't about the round.

The product bundle that ran with the round

The Series B press release named four launches in the same window:

Product | What it was
Voice Marketplace | Creator-uploaded voices, royalty-share model
Dubbing Studio | Pro video translation with editor controls
Mobile SDK | iOS / Android voice integration for app developers
AI Speech Classifier (re-emphasized) | Public tool for detecting AI-generated audio (originally shipped June 2023 with Series A)

Each one would have been a standalone press story. Bundled together with $80M and a $1.1B valuation, they generated a coverage cascade across four distinct press categories:

  • TechCrunch / Bloomberg / Forbes / Fortune — the funding round
  • Slator / VentureBeat / The Verge — Dubbing Studio
  • 9to5Mac / Android Central — Mobile SDK
  • AI / ML trade press — AI Speech Classifier and creator marketplace

A single Series B announcement covered four news beats. Same announcement spend, ~4× the surface area.

Why the bundle worked specifically here

The bundle wasn't arbitrary. Each launch was strategically tied to the funding narrative.

Voice Marketplace = "ElevenLabs is becoming a platform, not a TTS API." That re-framing supported the unicorn valuation. A TTS API doesn't justify $1.1B; a marketplace with creator network effects does.

Dubbing Studio = "ElevenLabs is going after media-industry budgets." Slator and VentureBeat are read by people who buy localization at scale — Netflix, Audible, Warner Bros. Discovery. Localization dollars are an order of magnitude bigger than indie creator subscriptions.

Mobile SDK = "ElevenLabs is becoming infrastructure." App developers integrating voice features means recurring API revenue, not one-off creator subs.

AI Speech Classifier (re-emphasized) = "ElevenLabs is the responsible AI vendor." This is the trust-posture work — January 22, 2024 was four days before the Biden robocall story broke. The classifier becoming part of the Series B framing helped the company absorb the Biden incident without losing the narrative.

The investor list told the strategic story

Series A was Nat Friedman, Daniel Gross, and a16z — the AI-native angel pattern.

Series B added Sequoia (capital-press signal) and kept the same team (continuity signal). The cap table going into 2024 looked like this:

Round | Lead | Notable participants
Pre-seed (Jan 2023) | Credo Ventures | Concept Ventures
Series A (Jun 2023) | Nat Friedman / Daniel Gross / a16z | Mike Krieger, Brendan Iribe, Mustafa Suleyman, Tim O'Reilly
Series B (Jan 2024) | a16z | Sequoia, Nat Friedman, Daniel Gross

Note: a16z double-led. That's a meaningful signal — the same fund leading consecutive rounds means internal conviction is high enough to defend the markup at the partnership.

The 21-month milestone

Milestone | Months since incorporation
Public beta | 9
1M users | 14
Series A | 14
Out of beta + Multilingual v2 | 16
Series B + Unicorn | 21

For comparison, the median path to unicorn status for AI infrastructure companies in 2022-2024 was roughly 36-48 months. ElevenLabs hit it in 21.

The compression is the result of the bundle pattern. Each round cleared the next product launch's runway; each product launch supported the next round's valuation. The two-step ratchet repeated four times — pre-seed, A, B, C — and never broke.

The Biden incident, four days later

The Series B press cycle was still running when the Biden robocall story broke on January 26, 2024. The proximity was coincidence — but the response template was already in place from the 4chan incident a year earlier.

ElevenLabs banned the account within 72 hours, issued a public statement, and the AI Speech Classifier (already part of the Series B narrative) was the technical answer to "how do you prevent this?"

The Series B unicorn announcement and the Biden robocall response landed in the same week. The trust posture, embedded in the funding announcement, did the GTM work for both.

Sources

04 / 04 · 2024-01-26
Media · Forced trust posture

The Biden Robocall Deepfake — How a 72-Hour Account Ban Became Enterprise Sales Collateral (Jan 2024)

Pindrop traced a fake Biden New Hampshire-primary robocall to ElevenLabs. The company suspended the account in 72 hours. By year-end the response was being cited in enterprise procurement decisions.

Original source ↗

January 26, 2024. Pindrop Security publishes its analysis of an AI-generated robocall sent to thousands of New Hampshire Democratic primary voters days earlier. The robocall used a synthetic Joe Biden voice telling people not to vote.

Pindrop's forensic analysis traces the audio to ElevenLabs.

The company suspends the account by the end of the same week. Bloomberg, the Financial Times, the Wall Street Journal, NBC, CNN, Reuters, and the Associated Press all cover the response.

The chronology

The incident moved fast across regulators, press, and ElevenLabs:

Date | Event
Jan 21-22 | Robocall reaches NH voters before primary
Jan 23 | NH Attorney General opens criminal investigation
Jan 25 | Pindrop completes forensic analysis, identifies ElevenLabs
Jan 26 | Bloomberg breaks ElevenLabs link; account suspended within 72 hours
Jan 27 | FCC announces process to ban AI-generated robocalls
Feb 8 | FCC formally outlaws AI voice in robocalls (citing this incident)
Feb 23 | Account creator publicly identified (linked to Steve Kramer / Lingo Telecom)

The FCC ban on AI-generated robocalls — passed February 8 — explicitly cited the Biden incident as the trigger. ElevenLabs' technology was the named example in regulatory rule-making.

What the 72-hour response actually contained

Three concrete actions in the first week:

1. Account suspension. The user who generated the audio was banned. ElevenLabs' per-generation traceability (already in place since the 4chan response in January 2023) made identification straightforward.

2. Public statement. "We are dedicated to preventing the misuse of audio AI tools and take any incidents of misuse extremely seriously." Plain language, no hedging on the technical link.

3. AI Speech Classifier reference. The free public tool — first launched in June 2023 alongside the Series A — was re-surfaced as the technical answer to "how do we know if audio is from ElevenLabs?" Pindrop itself had used a similar method.

What the response did not include: denial, deflection, or claims that the platform was being unfairly targeted. The framing was direct acknowledgment plus operational evidence.

Why this set the bar for the industry

Voice AI misuse incidents in 2024 affected most major vendors. The response patterns diverged.

Vendor | Major 2024 incident | Public response
ElevenLabs | Biden robocall (Jan) | 72-hour ban, public statement, classifier reference
Cartesia | Limited public incidents | N/A in 2024
PlayHT | Limited public incidents | N/A in 2024
Microsoft (VALL-E) | Restricted release | Kept models private to avoid this risk

ElevenLabs was the only vendor that a) had its product publicly linked to a high-profile election interference incident and b) absorbed the link without losing operational credibility.

The contrast with Microsoft's VALL-E response is the most instructive. Microsoft kept VALL-E private specifically because it didn't want to be in this position. ElevenLabs took the public position and built the operating muscle. The market rewarded the muscle by 2025.

The enterprise sales effect

The response template paid off in enterprise procurement through 2024-2025.

By the time Deutsche Telekom, NTT DOCOMO Ventures, RingCentral Ventures, HubSpot Ventures, and LG Technology Ventures came in on the Series C in January 2025 (with Salesforce Ventures returning), voice-AI vendor due diligence routinely included questions about misuse handling. ElevenLabs could point to a 12-month operational track record:

  • 4chan incident (Jan 2023) → paid-only cloning + traceability
  • Biden robocall (Jan 2024) → 72-hour ban + FCC engagement
  • 2024 election cycle → no further high-profile incidents involving ElevenLabs

That track record was the differentiator vs Cartesia and PlayHT in enterprise sales motions. Multiple Sacra and Contrary Research notes cite "operational discipline on misuse" as part of why ElevenLabs won contact-center and telco RFPs.

The "trust posture as GTM" pattern

The pattern is rare and worth naming explicitly:

  1. High-stakes misuse incident → forced public attention
  2. Speed-of-response is a verifiable signal → 72 hours becomes a benchmark
  3. Concrete technical safeguards already shipped → the response is operational, not promotional
  4. Pattern repeats credibly across multiple incidents → enterprise procurement starts to count this as risk mitigation
  5. Telco / enterprise / regulated-industry contracts close → trust becomes revenue

ElevenLabs ran this loop three times in 24 months. Each iteration compounded the next. By Series C, the trust posture was generating revenue, not just defusing crises.

Most vendors treat misuse as a PR problem. ElevenLabs treated misuse as a continuous operational test of whether the company can hold enterprise trust. The press did the marketing for free.


04 / 05 · 2024-07-03
Media · Audience boundary push

Iconic Voices — How Licensing Garland, Dean, Reynolds, and Olivier Pulled ElevenLabs Into CNN, CBS, and Variety (Jul 2024)

Estate-licensed AI voice clones of four Hollywood legends turned a Reader-app feature into a mainstream-press story. The deal mechanics quietly redefined the public conversation about voice AI ethics.


July 3, 2024. ElevenLabs announces "Iconic Voices" — AI voice clones of Judy Garland, James Dean, Burt Reynolds, and Sir Laurence Olivier — licensed through CMG Worldwide and integrated into the ElevenReader app launched a week earlier.

Within 48 hours, the story is in CNN Business, CBS News, Variety, Tubefilter, Designboom, Tom's Guide, and Entrepreneur Magazine.

The deal structure that made it possible

The licensing wasn't ad hoc. ElevenLabs and CMG Worldwide — the Beverly Hills IP firm that represents the Garland, Dean, Reynolds, and Olivier estates — built a constrained-use framework:

Term | Detail
Use cases | Reader app only — books, articles, PDFs
Voice scope | Voices not added to broader ElevenLabs audio database
Estate consent | Per-voice estate sign-off on permitted use
New work generation | Not permitted — voices restricted to reading existing text
Family endorsement | Liza Minnelli (Garland's daughter) issued public statement

The constraints were the news angle. "AI clones dead celebrities" is a horror-show headline. "Estate-licensed AI voice clones with family endorsement, restricted to audiobooks" is a respectful-tribute headline. The press took the second framing because the framework forced it.

Why the press picked it up

Mainstream press almost never covers voice AI infrastructure launches. Conv-AI v1 in November 2024 — arguably a more important product moment — got AI trade press only.

Iconic Voices got mainstream press for three structural reasons:

1. Recognizable cultural artifacts. Judy Garland reading "The Wonderful Wizard of Oz" is a story that doesn't require explanation. Voice AI infrastructure requires an explanatory headline. The cultural artifact carried the news beat.

2. Pre-resolved ethics question. Estate licensing + family endorsement closed off the ethical objection before reporters could write it. CBS, CNN, and Variety could publish without needing a "but is this OK?" rebuttal section.

3. Visual + audio share unit. Print outlets ran video clips of "Garland reading" content. The clips played native on Instagram Reels, TikTok, and YouTube Shorts when the same outlets posted social cuts. The story propagated for free across platforms.

The Iconic Voices launch is the rare voice AI story that landed in CBS Sunday Morning territory, not just TechCrunch. The deal structure was the reason.

What it did for the brand

Through mid-2024, ElevenLabs' brand was:

  • Within AI / dev community: best-in-class TTS, generous free tier, ongoing misuse stories
  • Outside the AI community: the company that made the Biden deepfake possible

Iconic Voices flipped the second framing. The same week that ElevenLabs was getting Variety coverage for Garland and Dean, the company was visibly moving past the deepfake association in mainstream press.

Press cycle | Dominant frame
Jan-Feb 2024 | Biden robocall, AI election interference
Mar-May 2024 | Mayor Adams clone, ongoing AI ethics stories
Jun-Aug 2024 | ElevenReader, Iconic Voices, audiobook future
Sep-Nov 2024 | Conversational AI v1, platform pivot

The brand work mattered for what came next. The Series C in January 2025 brought Deutsche Telekom, NTT DOCOMO Ventures, HubSpot Ventures, and Salesforce Ventures — strategic investors whose internal champions would have struggled to push for an investment in "the deepfake company." After Iconic Voices, ElevenLabs was a story those champions could tell internally.

The Reader app's role

ElevenReader (launched June 25, 2024) is the consumer surface that made Iconic Voices coherent. Without a place for users to actually listen to Garland reading, the story would have been "ElevenLabs licenses celebrity voices" — a vendor announcement, not a product.

The bundle:

  • Jun 25, 2024: ElevenReader iOS launches (Android shortly after). Free app, read any text aloud with natural AI narration.
  • Jul 3, 2024: Iconic Voices Collection launches inside Reader. Garland, Dean, Reynolds, Olivier as premium tier.
  • Subsequent months: More licensed voices added; Reader becomes the consumer entry point to ElevenLabs.

The eight-day gap between Reader launch and Iconic Voices launch is deliberate. Reader establishes the frame ("an audio app for books and articles"). Iconic Voices makes the frame newsworthy. Together they do what neither would have done alone.

What competitors couldn't replicate

By July 2024, every voice AI vendor could clone a celebrity voice given a sample. The capability was commoditized.

What ElevenLabs had that competitors didn't: the estate relationship infrastructure. CMG Worldwide doesn't sign with vendors who haven't built operational trust on misuse. The 4chan response (Jan 2023), the Biden response (Jan 2024), and the public AI Speech Classifier were the reasons the deal was possible.

Cartesia and PlayHT could match the technical clone quality. Neither could close a CMG Worldwide licensing deal in 2024. The trust posture became the moat.


04 / 06 · 2024-11-18
Product · Tech narrative upgrade

Conversational AI v1 — The 11-Week Platform Pivot That Reset the Entire Sales Motion (Nov 2024)

ElevenLabs went from TTS API to integrated voice-agent platform on November 18, 2024. Eleven weeks later, telco and CRM strategics led the Series C. The pivot up-the-stack happened faster than competitors could react.


November 18, 2024. ElevenLabs ships Conversational AI v1 — a platform layer that combines TTS, speech-to-text, and LLM orchestration into a single agent stack. Developers can now build full conversational agents inside the ElevenLabs developer console.

Eleven weeks later, the Series C closes at $3.3B with strategic investors from telco, CRM, and contact-center categories.

What the launch actually shipped

Conversational AI v1 was a platform-tier product, not a feature. Four components in one console:

  • Voice (TTS): ElevenLabs' existing Eleven Multilingual v2 model
  • Speech-to-text: Native ASR for handling user input
  • LLM orchestration: Connects to OpenAI, Anthropic, or self-hosted LLMs
  • Knowledge base: Files / URLs / text blocks as agent context

The configuration surface was extensive — voice, latency, stability, conversation length, authentication. SDK support for Python, JavaScript, React, and Swift, plus a WebSocket API.

In other words: everything a developer needed to build a working voice agent in one place. No need to wire together 5 vendors.
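The four-component stack can be sketched as a single agent definition. The field names below are illustrative assumptions, not the actual ElevenLabs console or API schema:

```python
# Hypothetical sketch of the four Conv-AI v1 layers as one agent config.
# All field names here are illustrative, NOT the real ElevenLabs schema.

def build_agent_config(voice_id, llm_provider, knowledge_sources):
    """Assemble a voice-agent definition from the four platform layers."""
    return {
        "tts": {                              # voice output layer
            "model": "eleven_multilingual_v2",
            "voice_id": voice_id,
        },
        "asr": {"enabled": True},             # native speech-to-text for user input
        "llm": {"provider": llm_provider},    # OpenAI, Anthropic, or self-hosted
        "knowledge_base": knowledge_sources,  # files / URLs / text blocks as context
    }

agent = build_agent_config("narrator-01", "anthropic", ["faq.pdf"])
print(agent["tts"]["model"])  # eleven_multilingual_v2
```

One definition like this replaces the TTS + ASR + orchestration wiring that previously spanned several vendors.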

Why the timing mattered

The voice-agent category was forming in late 2024. The competitive landscape:

Vendor | Position in Nov 2024 | Stack
ElevenLabs | Best TTS, now full platform | Vertically integrated
Vapi | Voice agent platform, no own TTS | Stack of best-in-class APIs
Retell | Voice agent platform, no own TTS | Stack of best-in-class APIs
Cartesia | Best TTS competitor, no agent layer | TTS only
PlayHT | TTS, building agent features | TTS + thin agent layer
Deepgram | STT leader, building TTS | STT + TTS, no agent

ElevenLabs was the only vendor with both a top-tier proprietary TTS model and a full agent stack. Vapi and Retell were stitching ElevenLabs' TTS into their stacks — making the platform pivot a direct competitive threat to them.

The Conv-AI v1 launch effectively folded the cost of Vapi and Retell into ElevenLabs' own platform. A developer who had been paying for ElevenLabs TTS + Vapi orchestration could now collapse the bill.

The eleven-week sequence

The launch was step one of a tightly-timed run:

Date | Event
Nov 18, 2024 | Conv-AI v1 launches
Late Nov 2024 | $90M ARR disclosed (Sacra / The Information)
Dec 2024 | $120M ARR (year-end)
Jan 2025 | Series C term-sheet activity (signaled by funding press)
Jan 30, 2025 | Series C $180M @ $3.3B
Feb 22-23, 2025 | a16z + ElevenLabs worldwide hackathon (voice agents theme)

Eleven weeks from product launch to closed funding round. That speed isn't possible without:

  1. Pre-existing investor relationships (a16z + ICONIQ already engaged)
  2. Verifiable revenue inflection ($25M → $90M → $120M ARR through 2024)
  3. A product launch that re-classified the category

The Series C new strategic investor list — Deutsche Telekom, NTT DOCOMO Ventures, RingCentral Ventures, HubSpot Ventures, and LG Technology Ventures (with Salesforce Ventures returning) — all came in because Conv-AI v1 had repositioned ElevenLabs from TTS to telco-and-CRM infrastructure.

What the strategic investors actually bought

Each of the five telco / CRM / hardware strategics had a specific reason to invest:

Investor | Strategic angle
Deutsche Telekom | European telco voice agents for B2B services
NTT DOCOMO Ventures | Japan-market voice agents for contact centers
RingCentral Ventures | UCaaS / contact-center voice integration
HubSpot Ventures | SMB CRM voice agent layer
LG Technology Ventures | Voice for consumer-electronics surfaces

Five strategic investors, five distinct enterprise integration paths. None of them would have invested in a TTS API. All of them had a clear thesis on a voice-agent platform.

The platform pivot was the precondition for the strategic check pile. ElevenLabs went from "interesting AI startup" to "potential infrastructure partner" in 11 weeks.

The Conv-AI 2.0 follow-up

The Conv-AI v1 launch was step one. Conversational AI 2.0 (June 3, 2025) was the credibility extension:

  • Native turn-taking model (handles hesitations, interruptions, filler words)
  • Integrated language detection (no manual config)
  • Multi-character mode (single agent, multiple personas)
  • Batch outbound calling (parallel call initiation)

The 2.0 launch was deliberately seven months after v1 — a release cadence that signaled "this is a category we own." Vapi and Retell were still on their first or second platform iteration. ElevenLabs had two generations shipped.

The cadence is the GTM. Not the individual features.

What the pivot did to the ARR curve

The platform-tier shift compounded the revenue curve in a way that pure TTS scale wouldn't have:

Date | ARR | Driver
Q1 2024 | $25M | TTS API + Dubbing Studio
Q4 2024 | $120M | TTS scale + early Conv-AI adoption
Aug 2025 | $200M | Conv-AI 2.0 + enterprise contracts
Dec 2025 | $330M+ | Voice agents at scale, enterprise approaching 50% of revenue

The Q4 2024 → Aug 2025 jump (from $120M to $200M in eight months) is where Conv-AI revenue starts becoming visible in the topline. The Aug 2025 → Dec 2025 jump (from $200M to $330M in five months) is where enterprise deals begin closing at scale.

Without Conv-AI v1 in November 2024, the curve plateaus around $200M ARR — a respectable TTS company. With Conv-AI v1, the curve continues compressing into the Series D and IPO trajectory.


04 / 07 · 2025-01-30
Funding · Bundled milestone

Series C $180M at $3.3B — The Telco / CRM Strategic Stack That Reframed ElevenLabs as Voice Infrastructure (Jan 2025)

a16z and ICONIQ co-led, but the headline was the new strategic-investor list: Deutsche Telekom, NTT DOCOMO, RingCentral, HubSpot, LG Technology Ventures (Salesforce was a returning investor). The Series C wasn't capital — it was distribution embedded in a cap table.


January 30, 2025. ElevenLabs announces a $180 million Series C co-led by Andreessen Horowitz and ICONIQ Growth, valuing the company at $3.3 billion. Three times the Series B valuation from twelve months earlier.

NEA, Sequoia, World Innovation Lab, Valor, Endeavor Catalyst, and Lunate are also in. ICONIQ partner Seth Pierrepont joins the board.

The financial-press story is the valuation jump. The strategic story is the strategic investor list.

The strategic investor pattern

The Series C brought five new strategic investors with overlapping but distinct angles (Salesforce Ventures, also in the round, was a returning investor from earlier rounds):

Investor | What they buy from ElevenLabs | Strategic vector
Deutsche Telekom | Voice agents for European B2B | Telco / SMB
NTT DOCOMO Ventures | Voice agents for Japan contact centers | Asia-Pacific telco
RingCentral Ventures | UCaaS voice integration | UCaaS / contact center
HubSpot Ventures | SMB CRM voice agent layer | SMB CRM
LG Technology Ventures | Voice for consumer-electronics surfaces | Consumer hardware

Five new strategic investors, five distinct enterprise channels. Each one had an internal deployment thesis before writing the check.

This is a different kind of round than Series A or B. Pre-seed and Series A bring capital. Series B brings capital + brand (a16z + Sequoia). Series C with this strategic stack brings capital + brand + distribution access through five enterprise vectors that ElevenLabs would otherwise have spent years building bottom-up.

The 11-week timeline that made it possible

The Series C closed eleven weeks after Conversational AI v1 shipped in November 2024. That sequence was the load-bearing maneuver:

Date | Event
Nov 18, 2024 | Conversational AI v1 launches
Late Nov | $90M ARR disclosed
Dec 31, 2024 | $120M ARR (year-end)
Jan 30, 2025 | Series C closes
Feb 22-23 | a16z + ElevenLabs worldwide hackathon (voice agents)

The platform pivot in November positioned ElevenLabs as voice infrastructure, not a TTS API. The strategics could only invest in the platform-tier story. They could not have invested in a TTS-API story — telco and CRM procurement teams don't deploy TTS APIs as standalone products.

The Conv-AI v1 launch did the strategic-investor recruiting. The Series C announcement was the close.

What the round bundled

True to the ElevenLabs cadence, the Series C wasn't a standalone announcement. It bundled:

  • Round close ($180M @ $3.3B)
  • Board addition (ICONIQ partner Seth Pierrepont)
  • Strategic distribution launch (Deutsche Telekom partnership signaling)
  • Enterprise customer reveals (disclosed in subsequent press) — the ARR disclosure pulling forward customer naming
  • a16z hackathon series (announced two weeks later, held February 22-23)

Five news beats inside a 14-day window. Same announcement budget, multiplicative coverage.

The valuation math

The trajectory across 24 months:

Round | Date | Round size | Valuation | Multiple on prior
Pre-seed | Jan 2023 | $2M | ~$12M | -
Series A | Jun 2023 | $19M | ~$100M | 8.3×
Series B | Jan 2024 | $80M | $1.1B | 11×
Series C | Jan 2025 | $180M | $3.3B | 3×
Tender | Sep 2025 | $100M (secondary) | $6.6B | 2×
Series D | Feb 2026 | $500M | $11B | 1.7×

The Series C 3× markup is smaller than Series A's 8× or Series B's 11×, but the round size is much larger. The valuation expansion shifts from "narrative repricing" to "revenue-supported."
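The markup arithmetic reduces to simple division over the disclosed valuations; a quick check of the round-over-round figures (valuations in $M):

```python
# Recompute the "multiple on prior" column from the round valuations ($M).
rounds = [
    ("Pre-seed", 12), ("Series A", 100), ("Series B", 1100),
    ("Series C", 3300), ("Tender", 6600), ("Series D", 11000),
]
markups = {
    name: round(val / prev_val, 1)
    for (_, prev_val), (name, val) in zip(rounds, rounds[1:])
}
print(markups)
# {'Series A': 8.3, 'Series B': 11.0, 'Series C': 3.0, 'Tender': 2.0, 'Series D': 1.7}
```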

At $3.3B / $120M ARR, the multiple is ~27×. By Series D at $11B / $330M ARR, it sits at ~33×. The multiple stayed roughly stable from Series C through Series D — meaning the valuation growth was earned by ARR growth, not by re-rating.

Why the Series C was the inflection round

Every prior round had been a step up. The Series C was a category change.

Before the Series C, ElevenLabs was an AI voice company. After the Series C — with Deutsche Telekom, NTT DOCOMO, HubSpot, RingCentral, and LG Technology Ventures all newly on the cap table (alongside returning Salesforce Ventures) — ElevenLabs was a voice infrastructure company. The label change opened up enterprise contracts that were structurally unavailable to AI-vendor-positioned competitors.

Cartesia and PlayHT could match ElevenLabs on TTS quality through 2025. Neither could match the strategic investor stack. Vapi and Retell could match the agent platform; neither had Deutsche Telekom on speed-dial.

The Series C wasn't the moment ElevenLabs was best. It was the moment ElevenLabs became unmatchable on a specific competitive vector — strategic distribution access.

The downstream effect on Series D

The Series D in February 2026 (Sequoia-led, $500M @ $11B) was the validation round for the Series C strategy. The strategic investors from January 2025 had become the largest enterprise customers by late 2025 — Deutsche Telekom and Revolut named publicly in the Series D coverage.

The flywheel:

  1. Conv-AI v1 launches (Nov 2024)
  2. Strategic investors join Series C (Jan 2025)
  3. Strategic investors deploy ElevenLabs internally (2025)
  4. Deployments become enterprise contracts (mid-late 2025)
  5. Enterprise contracts generate $330M+ ARR by year-end 2025
  6. ARR closes Series D at $11B (Feb 2026)

The Series C strategics weren't a marketing flourish. They were the GTM motion for the next 12 months.


04 / 08 · 2025-06-05
Product · Tech narrative upgrade

Eleven v3 — How an Audio-Tag Syntax Made Voice AI Feel Like a New Category (Jun 2025)

70+ languages, multi-speaker dialogue, and inline tags like [excited] and [whispers]. The v3 alpha turned voice synthesis into a stage-direction language — and the demo clips traveled native on every social platform.


June 5, 2025. ElevenLabs releases Eleven v3 in public alpha. 70+ languages, multi-speaker dialogue, and a new audio-tag syntax that lets developers control emotion, tone, and delivery inline with the text.

The launch tweet from @elevenlabsio gets re-shared by Mati Staniszewski, Andrej Karpathy, and the AI-builder Twitter circle. Within 48 hours, audio-tag demos are circulating across X, TikTok, Instagram Reels, and YouTube Shorts.

The syntax that changed the demo grammar

Audio tags are words wrapped in brackets that the v3 model interprets as performance cues, not text:

"That was incredible! [excited] I never thought we'd actually pull it off. 
[whispers] But we have to be careful — they might still be watching. 
[laughing nervously] What do we do now?"

The output is a single audio clip with three distinct emotional registers — excitement, whispered tension, nervous laughter — controlled by markup, not by separate generation calls.

This is a shift in how voice AI is used. Previously, getting emotion variation required either:

  1. Multiple generation calls with different prompts, then stitching
  2. Voice direction in the source text ("she said excitedly"), with limited model interpretation
  3. A separate fine-tuned model per emotion

v3 collapses all three into inline syntax. The cognitive model is now closer to writing a screenplay than calling an API.
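The bracket grammar itself is simple enough to sketch. The toy pre-parser below is for illustration only, under the assumption of a flat `[tag]` syntax; the real v3 model interprets the tags itself rather than exposing any parsing step:

```python
import re

def split_audio_tags(script):
    """Split bracket-tagged text into (cue, segment) pairs.
    Toy illustration of the audio-tag grammar, not the v3 internals."""
    parts = re.split(r"\[([^\]]+)\]", script)
    segments = []
    if parts[0].strip():
        segments.append((None, parts[0].strip()))  # text before the first cue
    for tag, text in zip(parts[1::2], parts[2::2]):
        segments.append((tag, text.strip()))
    return segments

demo = ("That was incredible! [excited] I never thought we'd pull it off. "
        "[whispers] But we have to be careful.")
for cue, text in split_audio_tags(demo):
    print(cue, "->", text)
```

Each pair is one emotional register inside a single generation, which is what makes the screenplay analogy apt.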

Why the demo grammar mattered for distribution

Most TTS upgrades launch with side-by-side audio comparisons. They demo poorly because:

  • The improvement is incremental and hard to hear on phone speakers
  • Side-by-side requires the user to listen to both clips
  • The shareability is low — one clip is enough; two clips is friction

v3 launched with single-clip demos that contained the variation inside one audio file. A 15-second clip would shift between excitement, whispering, and laughter — within the same generation. The "wow" moment was self-contained.

Demo format | Share-friction | "New category" feel
Side-by-side audio | High | Low
Single clip with multiple voices | Medium | Medium
Single clip with audio-tag-driven emotion shifts | Low | High

The format mattered as much as the model quality. v3 demos went viral because they fit social-platform attention spans.

The cadence: v1 to v3 in 28 months

The model release cadence shows the deliberate pacing:

Model | Release | Months between
Beta TTS (English / Polish) | Jan 2023 | -
Eleven Multilingual v1 | May 2023 | 4
Eleven Multilingual v2 | Aug 2023 | 3
Eleven Turbo v2 | Apr 2024 | 8
Eleven Turbo v2.5 | Aug 2024 | 4
Eleven v3 (alpha) | Jun 2025 | 10

The 10-month gap between Turbo v2.5 and v3 is the longest pause in the history of the company. v3 was a generational shift, not an iteration — and the launch positioning matched: "the most expressive Text to Speech model ever."

The pause was strategic. Conversational AI v1 (Nov 2024) and Conv-AI 2.0 (Jun 2025) needed to be the priority through that period, because the platform pivot was the load-bearing GTM move. v3 launching alongside Conv-AI 2.0 (two days apart, June 3 and June 5) bundled the model release with the platform release in the same announcement window.

How the launch propagated

The v3 launch followed a clear distribution pattern:

Day 1 (Jun 5)

  • Launch tweet from @elevenlabsio with audio-tag demo clips
  • Mati Staniszewski's personal X account amplifies
  • AI-builder Twitter (Andrej Karpathy and adjacent accounts) re-shares

Days 2-3

  • Hacker News front page (Eleven v3 alpha thread, hundreds of comments)
  • Product Hunt launch (top product of the day)
  • VentureBeat / TechCrunch coverage of audio-tag syntax

Days 4-7

  • Creator demos start appearing on TikTok and Instagram Reels
  • Audio-tag syntax tutorials on YouTube
  • Reddit r/ElevenLabs, r/MachineLearning threads

Weeks 2-4

  • Integration into creator workflows (audiobook narrators, indie game devs)
  • Third-party tools and SDKs adopting the syntax
  • Use-case content (best audio tags for X, Y, Z)

The propagation worked because each platform got a different format of the same demo. X got 30-second audio threads. TikTok got 15-second creator clips. YouTube got 5-minute "how to use audio tags" tutorials. Same launch, four formats, four audiences.

The pricing trick that drove adoption

ElevenLabs offered v3 alpha at 80% off credit pricing through June 30, 2025. That's not a discount — that's a deliberate adoption forcing function.

A creator on the free or starter tier who wanted to try v3 could generate 4-5× more audio than usual. Heavy users who would otherwise have hit credit caps on v3's higher-quality output were given headroom. By the end of June, v3 was the default model in most user workflows because it had been the cheapest model.

When the discount ended on July 1, the switching cost back to older models was the cognitive cost of un-learning the audio-tag syntax. Most users stayed on v3 even at full price.

The 80%-off alpha was the cheapest cohort-acquisition campaign ElevenLabs ever ran. By the time pricing normalized, the audio-tag syntax was the user expectation.
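In plain numbers: only the 80%-off figure comes from the launch; the credit prices and budget below are made-up units for illustration:

```python
# Illustrative credit math for the v3 alpha discount. Only the 80%-off
# figure is from the launch; the prices below are hypothetical units.
full_cost = 100    # credits per hour of audio at normal pricing (assumed)
alpha_cost = 20    # the same hour at 80% off through June 30, 2025
budget = 1000      # a hypothetical starter-tier credit allowance

print(budget // full_cost, "hours at full price")   # 10 hours
print(budget // alpha_cost, "hours on the alpha")   # 50 hours: 5x the volume
```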

What v3 did to the competitive landscape

Cartesia, PlayHT, Resemble, and others released TTS upgrades through 2025. None matched v3's audio-tag syntax. The closest equivalent was OpenAI's voice mode for ChatGPT, which had emotional range but no developer-facing markup.

By Q4 2025, "audio tags" had become a category requirement. Vendors evaluating their roadmaps had to choose: ship a v3-equivalent syntax, or accept that ElevenLabs would be the default for emotionally rich voice work.

The Series D narrative in February 2026 leaned heavily on v3 as proof that ElevenLabs was the model leader, not just the platform leader. Sequoia's lead position in the round (its first lead in any ElevenLabs round) signaled that the model story had become investable in its own right.


04 / 09 · 2026-02-04
Funding · Bundled milestone

Series D $500M at $11B — Sequoia Leads the IPO-Track Round (Feb 2026)

Sequoia takes the lead from a16z and ICONIQ. Mati Staniszewski tells the press the company is 'building toward an IPO.' Prices at 1.7× the September secondary valuation, five months later.


February 4, 2026. ElevenLabs announces a $500 million Series D led by Sequoia Capital, with participation from Andreessen Horowitz, ICONIQ, Lightspeed Venture Partners, Bond, and Evantic Capital. Valuation: $11 billion.

The round prices at 1.7× the $6.6B secondary valuation from the September 2025 employee tender five months earlier, and 3.3× the Series C valuation twelve months before that. ARR closed 2025 at $330M+, disclosed three weeks before the round.

CEO Mati Staniszewski tells TechCrunch and CNBC the company is "building toward an IPO."

What changed in the lead investor

Across six rounds, the lead-investor pattern shows the company's narrative arc:

Round | Lead | Implied frame
Pre-seed (Jan 2023) | Credo Ventures | European seed-stage AI
Series A (Jun 2023) | a16z + Nat Friedman + Daniel Gross | AI-native angel + a16z
Series B (Jan 2024) | a16z | Unicorn growth
Series C (Jan 2025) | a16z + ICONIQ | Platform + strategic distribution
Tender (Sep 2025) | Sequoia + ICONIQ (co-led) | Bridge to growth-stage
Series D (Feb 2026) | Sequoia | IPO trajectory

Sequoia taking the lead — for the first time in the company's history — is the signal. Sequoia's late-stage practice (the Sequoia Capital growth fund) leads rounds for companies on credible paths to public markets. The fund's portfolio includes Stripe, Klarna, Snowflake (pre-IPO), and Datadog (pre-IPO).

The lead change is the most direct public signal that ElevenLabs is in the IPO-prep cohort.

The bundled disclosures

True to the cadence, the Series D didn't fire alone. The press window included:

Disclosure | Detail
$500M round | Largest in ElevenLabs history
$11B valuation | 3.3× the Series C valuation 12 months earlier
$330M+ ARR | Year-end 2025 (disclosed Jan 13, 2026)
Nvidia investment | Re-emphasized — first announced Sept 2025; the Series D framing positioned it as an infrastructure-tier endorsement
Enterprise customers | Deutsche Telekom, Revolut named publicly
IPO commentary | "Building toward an IPO" — first explicit IPO frame

Six news beats. One press window. Same announcement budget, multiplicative coverage — the same playbook ElevenLabs has run on every funding round since pre-seed.

Why the Nvidia investment carried so much weight

Nvidia's strategic investment in ElevenLabs was first announced in September 2025 — Tech.eu, Music Business Worldwide, and others covered it at the time, with Jensen Huang publicly endorsing the company. The Series D press cycle in February 2026 re-emphasized it as the IPO-track narrative crystallized. Three things this signals:

1. Strategic-customer / strategic-investor convergence. Nvidia uses ElevenLabs internally for audio generation. The strategic check confirmed an existing customer relationship — the September 2025 announcement was both the deal disclosure AND a customer reveal in one news cycle.

2. AI infrastructure validation. Nvidia's portfolio of strategic investments includes CoreWeave, Lambda Labs, Hugging Face, Inflection (pre-Microsoft), and Cohere. Joining that list places ElevenLabs in the AI-infrastructure category, not just voice AI.

3. Market depth check. Nvidia's diligence process is unusually rigorous. A company that passed Nvidia's strategic-investment review in late 2025 is not a year away from breaking — it's at infrastructure-grade operating maturity.

The valuation framework

The valuation expansion math from Series C → Series D:

Date | ARR | Valuation | Multiple
Jan 2025 (Series C) | $120M | $3.3B | 27×
Aug 2025 | $200M | $5.3B (interpolated) | 27×
Sep 2025 (Tender) | $200M | $6.6B | 33×
Dec 2025 (year-end) | $330M | $9.0B (interpolated) | 27×
Feb 2026 (Series D) | $330M | $11B | 33×

The multiple stayed in a narrow band (27-33×) across 13 months. That's revenue-supported expansion, not narrative re-rating.
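The band is straightforward to recompute from the disclosed figures (valuations and ARR in $M):

```python
# Valuation / ARR multiples across the Series C -> Series D window ($M).
points = [
    ("Jan 2025 (Series C)", 3300, 120),
    ("Sep 2025 (tender)", 6600, 200),
    ("Feb 2026 (Series D)", 11000, 330),
]
multiples = {label: round(valuation / arr, 1) for label, valuation, arr in points}
print(multiples)
# {'Jan 2025 (Series C)': 27.5, 'Sep 2025 (tender)': 33.0, 'Feb 2026 (Series D)': 33.3}
```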

For comparison, public-market AI infrastructure multiples in early 2026:

  • Snowflake: ~12× ARR
  • Datadog: ~14× ARR
  • Cloudflare: ~18× ARR
  • Palantir: ~30× ARR
  • ElevenLabs (private): 33× ARR

ElevenLabs at 33× is at the high end of public-market multiples but inside the range. The implied IPO valuation at $500M-$1B ARR (likely 2027 timing) would be $15-25B — broadly consistent with the Series D pricing.
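The $15-25B figure is simple scenario arithmetic, read as the top of the $500M case through the bottom of the $1B case; the 25-30× band is an assumption drawn from the comps above:

```python
# Implied IPO valuation grid: ARR scenarios ($M) x an assumed 25-30x band.
for arr_m in (500, 1000):
    low_b = arr_m * 25 // 1000      # $B at the bottom of the band
    high_b = arr_m * 30 // 1000     # $B at the top of the band
    print(f"${arr_m}M ARR -> ${low_b}B to ${high_b}B")
```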

What "building toward an IPO" means in practice

Mati Staniszewski's IPO comment is unusually specific for a CEO. Most founders dodge IPO questions; saying "building toward an IPO" sets a public expectation.

Three things that change after this kind of statement:

  1. Hiring profile shifts. CFO and Chief Legal Officer hires become priorities. Compliance, audit, and SOX-readiness work begin in earnest.
  2. Reporting discipline tightens. ARR disclosures, customer reveals, and metric transparency become quarterly rather than ad hoc.
  3. Strategic-investor relationships deepen. Telco / CRM strategics from the Series C become anchor customers for the IPO narrative.

The arc from incorporation (April 2022) to a targeted late-2026 IPO filing window compresses a timeline that historically averaged 8-12 years for enterprise-software companies.

What's not in the disclosure

The Series D press leaves three things deliberately unsaid:

  • The IPO timing. "Building toward" is open-ended. Filing window could be late 2026, mid-2027, or further out.
  • The competitive cost structure. ElevenLabs has not disclosed gross margin or unit economics. Voice AI compute is expensive; the margin question matters for IPO valuation.
  • The strategic-investor contract economics. Deutsche Telekom and Revolut are named, but contract sizes and commit periods are private.

These are the questions S-1 disclosure will eventually answer. The Series D narrative says the company is on the path. The S-1 will say whether the path is durable.
