Growth Story · No. 05

ElevenLabs / ElevenLabs Inc.

From a stealth-mode TTS bet to an $11B voice-AI platform in under four years

ElevenLabs spent nine months in stealth before shipping a beta in January 2023. The product went viral in days — first as a creator phenomenon, then as a misuse scandal, then as the default voice infrastructure for media, publishers, and voice agents. Every funding round bundled a model release. Every model release reset the ceiling on what voice AI was assumed to do.

12 min read · Founded 2022-04 · 22 events tracked · 9 deep dives
01 · Timeline

ARR, valuation, and every GTM move, on one timeline.

Events split into four horizontal bands by type. Markers with a halo correspond to deep-dive sections below.

[Timeline chart — Product, Funding, and Media bands across four eras: Polish stealth → creator-first virality → platform build-out → voice-agent hypergrowth. ARR: $25M → $90M → $120M → $200M → $330M. Valuation: $12M → $100M → $1.1B → $3.3B → $6.6B → $11B (2023-2026). Marked events: $2M pre-seed + public beta, 4chan misuse scandal, Eleven Multilingual v2, Series B $80M / $1.1B, Biden deepfake robocall, Iconic Voices, Conversational AI v1 ships, Series C $180M / $3.3B, Eleven v3 (alpha) launches, $100M employee tender, Series D $500M / $11B.]
02 · Platform Mix

Which channels mattered when.

ElevenLabs used six platforms differently. Some carried the entire arc; some were episodic catalysts; one ran almost entirely on creators rather than the company.

X (Twitter)
All stages — load-bearing

Founder + product launch channel

Mati Staniszewski (@matiii) and the @elevenlabsio handle drive every major launch. Audio demos travel exceptionally well on X — voice-clip tweets autoplay and get re-shared with the original audio attached, which is rare. Each model release lands as a clip thread first.

⚡ Catalyst moment

Eleven v3 alpha launch tweet (June 5, 2025) — audio-tag demo clips shared by founders, Karpathy, and the AI-builder crowd. Clip-native delivery is what made v3 feel like a 'new category' on day one.

View tweet
✓ Works when

When the product output is itself a shareable artifact (audio clip, voice demo). The platform autoplays audio inline — make every launch a clip, not a screenshot

✗ Don't expect

If the team posts press-release prose. Voice AI Twitter expects new audio in the post, not links to a blog

YouTube
Pre-inflection + Hypergrowth

Demo amplification + investor narrative

Two layers: founder long-form podcasts (Sequoia Training Data, a16z Show, Nothing Left Unsaid) and creator tutorials. Mati Staniszewski's investor-podcast circuit through 2025 reset perception from 'TTS startup' to 'voice infrastructure company.' Creator tutorials drive self-serve sign-ups continuously in the background.

⚡ Catalyst moment

Mati Staniszewski on Sequoia's Training Data (mid-2025) — the long-form artifact a16z, ICONIQ, and later Sequoia could circulate inside their own LP and exec networks before each subsequent funding round.

Watch episode
✓ Works when

When the founder can carry a 60-90 minute investor-grade conversation, AND the product has new demos in every episode. The double layer — founder + creator — compounds

✗ Don't expect

One-off keynote talks with no follow-up. The pattern only works as a continuous interview cadence, not single appearances

Hacker News
Latent + Platform build-out

Technical credibility validator

Multilingual v2, Conversational AI v1, Eleven v3, and Reader all landed on HN front page with hundreds of comments. The HN signal mattered most for the conversational-AI launch in November 2024 — it proved the platform pivot was technically credible to the developers ElevenLabs needed to build agents on top.

⚡ Catalyst moment

Conversational AI v1 launch threads (Nov 2024) — front-page placement with serious technical scrutiny. Two months later: Series C at $3.3B.

Read on HN
✓ Works when

When the launch has measurable technical novelty — a new model, a new architecture, real benchmarks. HN voters reward demos that reveal capability

✗ Don't expect

Don't post 'we hit $X ARR' or pricing changes. HN punishes business-update framing

Reddit
Creator-first virality + ongoing

Use-case discovery + retention layer

r/ElevenLabs and adjacent communities (r/AIVoiceCloning, r/audiobooks, r/IndieDev) became the place where creators trade voice IDs, prompt tricks, and use-case templates. Less acquisition, more activation — it's where new users learn how to get good output in their first hour.

⚡ Catalyst moment

No single moment. The community formed organically around the free tier in 2023 and matured as a self-serve support hub through 2024-2025.

Open r/ElevenLabs
✓ Works when

When you have a free tier generous enough that users want to share what they made. The community emerges from the product, not from outreach

✗ Don't expect

If you try to seed posts. The community detects vendor accounts within days

LinkedIn
Voice-agent hypergrowth (enterprise)

Enterprise + investor signal

Mati Staniszewski's LinkedIn has carried real weight since mid-2025. ARR milestones, funding rounds, and customer wins announced as personal posts get re-shared inside Deutsche Telekom-style enterprise procurement networks. Higher score than typical for a dev-tools company because the buyer for voice agents is an enterprise contact-center exec, not a developer.

⚡ Catalyst moment

Mati Staniszewski's $200M ARR LinkedIn post (Aug 2025) — disclosed numbers, customer logos, and explicit 'building toward IPO' framing all in one post. Set the tone for the September tender and February Series D.

View source
✓ Works when

When your buyer is enterprise and your category is 'voice infrastructure / contact center.' Every CIO and head of CX is on LinkedIn — not on X

✗ Don't expect

For purely creator-targeted launches. LinkedIn audiences won't engage with a v3 audio-tag demo the way X will

Instagram
Creator-first virality

Consumer-creator amplification

Unusually high score for a B2B AI company — and it's earned. Voice clones (Darth Vader, Iconic Voices, celebrity parodies) routinely cross from TikTok to Instagram Reels without ElevenLabs lifting a finger. Reader-app demos with Judy Garland and James Dean read native to the format. The company doesn't post heavily; the creators do.

⚡ Catalyst moment

Iconic Voices launch coverage (July 2024) — Garland reading 'Wizard of Oz' clips circulated as Instagram Reels through CBS, CNN, and Variety social handles. Mainstream-press distribution, not paid.

View source
✓ Works when

When the product output has visual + audio appeal (voice clips paired with celebrity faces, video demos). Instagram is downstream of TikTok creator content, not a primary channel

✗ Don't expect

For developer features, API launches, conversational-AI configuration. Skip Instagram for those entirely

03 · Synthesis

The full thesis.

The big-picture read on what actually drove the curve — before zooming in on each key moment.

ElevenLabs did not have a slow burn.

The product went from public launch to one million users in five months, from one million users to a $1.1B unicorn in seven more, and from there to $330M ARR and an $11B Series D in 24 more. The whole arc fits inside 46 months. What looks like luck is actually a six-move pattern that the founders ran four times in a row.

The dubbing thesis was the moat

Mati Staniszewski (ex-Palantir) and Piotr Dabkowski (ex-Google) grew up in Poland watching badly dubbed American films. The original product idea, sketched in 2020, was fixing that — voice that crosses languages with the speaker's emotion intact.

That insight pre-committed ElevenLabs to two architectural choices most competitors did not make:

  • Multilingual from day one. The first beta in January 2023 shipped with English and Polish. Eleven Multilingual v2 (August 2023) covered nearly 30 languages with the original speaker's accent preserved. By v3 (June 2025), the count was 70+.
  • Emotion as a first-class output. Not "TTS that sounds OK" but "voice that conveys feeling across language barriers." The audio-tag syntax in v3 — [excited], [whispers], [laughing] — is the natural endpoint of that thesis, five years from the original sketch.
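
As a concrete illustration, the tag syntax embeds delivery cues inline in the input text. A toy extractor — the tag names come from the v3 launch materials, but the parsing code below is purely illustrative and not part of any ElevenLabs SDK:

```python
import re

# v3-style input with inline audio tags; the tag names are from the launch
# materials, the parser itself is an illustrative sketch only
prompt = "[whispers] It started as a dubbing fix. [excited] Now it speaks 70-plus languages!"

def extract_tags(text: str) -> list[str]:
    """Return the delivery tags embedded in a prompt, in order of appearance."""
    return re.findall(r"\[(\w+)\]", text)

print(extract_tags(prompt))  # ['whispers', 'excited']
```

The point of the syntax is that emotion rides inside the same text channel as the words, so a multilingual prompt carries its delivery instructions with it.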

Competitors with a generic TTS framing (Resemble, Murf, WellSaid Labs) optimized for narration quality. ElevenLabs optimized for emotional cross-lingual transfer. The framing constrained the product roadmap in a way that paid off every release.

Nine months in stealth, then the scandals

ElevenLabs incorporated in April 2022 and did not ship until January 23, 2023. That's nine months of model training and infrastructure work before any user touched the product.

The launch arc that followed compressed into days:

Date | Event
Jan 23, 2023 | Public beta + $2M pre-seed announced
Jan 30, 2023 | 4chan abuses voice cloning (Emma Watson, Joe Rogan, Ben Shapiro)
Jan 31, 2023 | ElevenLabs ships paid-only voice cloning + AI detection tool
Jun 2023 | 1M registered users — five months from launch

The 4chan incident is the first thing that should have killed the company. Instead, ElevenLabs absorbed it as a forced-trust posture: voice cloning behind paid ID verification, classifier for AI-generated audio, account-level traceability — all shipped within days.

It did the same thing 12 months later. On January 26, 2024, Pindrop traced a fake Biden New Hampshire-primary robocall to ElevenLabs. The account was suspended within 72 hours, the company gave a clear public statement, and the Biden episode became a case-study citation in every "responsible AI" panel for the rest of the year.

Most companies hide misuse. ElevenLabs treated each incident as a chance to publicly demonstrate that the platform had auditable controls. The trust narrative ended up doing real GTM work — Deutsche Telekom and large enterprise contracts in 2025 cited operational discipline as a reason to commit.

Every funding round was a product bundle

Look at the cadence:

Round | Date | Bundled launch
Pre-seed $2M | Jan 23, 2023 | Public beta
Series A $19M @ $100M | Jun 21, 2023 | New voice products
Series B $80M @ $1.1B | Jan 22, 2024 | Voice Marketplace + Dubbing Studio + Mobile SDK
Series C $180M @ $3.3B | Jan 30, 2025 | Conversational AI v1 had shipped 10 weeks earlier
Tender $100M @ $6.6B | Sep 8, 2025 | Bundled with $200M ARR disclosure
Series D $500M @ $11B | Feb 4, 2026 | Bundled with $330M ARR + IPO talk

Six rounds. Six bundled milestones. Every announcement window doubled as a product window.

The underlying logic is straightforward: a solo "$X funding" announcement gets you 3–5 days of capital-press coverage. A "$X funding + $Y ARR + new product" bundle gets you the same window across capital press, dev press, telco trade press, and SaaS press — for the same announcement budget.

The platform pivot that mattered

Most TTS companies stopped at API-as-a-product. ElevenLabs made a deliberate move to platform tier in November 2024.

Conversational AI v1 (November 18, 2024) integrated TTS + STT + LLM orchestration into a single agent stack. Conversational AI 2.0 (June 3, 2025) added native turn-taking, language detection, multi-character mode, and batch outbound calling.
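
That integration can be sketched as a single hear-reason-respond loop. A minimal sketch with stubbed stages — every function name here is a hypothetical stand-in, not the ElevenLabs agent API:

```python
def transcribe(audio_chunk: bytes) -> str:
    # STT stage (stubbed: a real agent runs speech-to-text here)
    return "What's my order status?"

def think(transcript: str, history: list[str]) -> str:
    # LLM stage (stubbed: a real agent calls a language model with context)
    history.append(transcript)
    return f"You asked: {transcript} Let me check."

def speak(reply: str) -> bytes:
    # TTS stage (stubbed: a real agent synthesizes audio)
    return reply.encode("utf-8")

def agent_turn(audio_in: bytes, history: list[str]) -> bytes:
    """One conversational turn: hear -> reason -> respond."""
    return speak(think(transcribe(audio_in), history))

print(agent_turn(b"<caller audio>", []).decode("utf-8"))
# You asked: What's my order status? Let me check.
```

The 2.0 features described above — turn-taking, language detection, multi-character mode — all sit around this loop, deciding when a turn starts and which voice and language the stages run in.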

The competitive geometry changed. In November 2024, ElevenLabs was selling against other TTS APIs (Cartesia, PlayHT, Resemble). By mid-2025 it was selling against Vapi, Retell, and the contact-center incumbents (NICE, Genesys, Five9) — a much larger market with much larger contract sizes.

The strategic-investor list on the Series C tells the same story: new strategic checks from Deutsche Telekom, NTT DOCOMO Ventures, RingCentral Ventures, HubSpot Ventures, and LG Technology Ventures (Salesforce Ventures, in the round, was a returning investor from earlier rounds). Telco, CRM, and consumer electronics — not creator tools. The pivot up the stack was the precondition for those checks.

Creator distribution did the brand work for free

ElevenLabs' visible marketing budget through 2024 was small. The acquisition machine was creator-first:

  • Voice clones travel native on social. A Darth Vader clip on TikTok. A Judy Garland reading on Instagram. The product output is itself the share unit. Most B2B tools envy this.
  • The free tier is the marketing. A generous free quota means creators experiment, and the experiments turn into Reels, TikToks, YouTube shorts. ElevenLabs gets the brand impression for free.
  • Voice Marketplace as a flywheel. Creators upload custom voices, other users discover and use them, the original creator earns. Three-way alignment that gives ElevenLabs viral content as a byproduct.
  • Iconic Voices as PR primer. Garland / Dean / Reynolds / Olivier (July 2024) put ElevenLabs in CNN, CBS, and Variety — outlets dev-tool companies almost never reach. The estate-licensing angle was the news hook.

When Mati Staniszewski went on Sequoia's Training Data, a16z Show, and Lenny's adjacent podcasts through 2024-2025, the founder-as-IP pattern translated directly to investor-narrative work. Different audience from the TikTok creators, same compounding mechanism.

The pattern, distilled

Six moves ElevenLabs ran. Each one is reusable in any AI infrastructure category.

  1. Lock the thesis to a multilingual, emotion-preserving frame from day one. The framing constrained the roadmap (v2 → v3 → audio tags) in a way that made each release feel like progress on the same promise.
  2. Bundle every funding round with at least one product launch. Same press budget, 3-4× the coverage surface. Six rounds in a row, never broken.
  3. Treat misuse as a forced-trust audit, not a PR crisis. Two scandals in 12 months. Both were absorbed as proof of operational discipline. Telco and enterprise procurement reads the response, not the incident.
  4. Move up the stack before competitors do. TTS API → conversational platform was a 10-week jump (Conv-AI v1 in Nov 2024, Series C in Jan 2025 with telco strategics on board). Competitors who stayed at API tier are now selling into a smaller market.
  5. Make the product output the share unit. Voice clips are autoplay-native on X, Instagram, TikTok. Free tier turns creators into a brand-extension layer the company doesn't pay for.
  6. Run the founder-as-IP loop for investors specifically. Long-form podcasts (Training Data, a16z Show, Nothing Left Unsaid) timed to between funding rounds. Each new round happens after a podcast circuit, not during it.

What's not in the public record

Things outside reporters can't see — the ones that probably matter most:

  • The actual cost of model training in 2022-2023. Stealth mode is expensive. The pre-seed amount ($2M) is too small to fund a year of GPU work — the founders likely bootstrapped and burned personal capital. Specific numbers are private.
  • Real free-to-paid conversion rates. ElevenLabs has been generous with topline ARR but never disclosed conversion economics. The 1M-users-by-June-2023 figure could mean 5% paid or 0.5% paid — the gap matters.
  • The exact mechanics of enterprise sales motion. Deutsche Telekom and Revolut are named publicly. The contract sizes, sales cycle length, and PoC-to-deal conversion are not.
  • The competitive cost structure vs Cartesia, PlayHT, Vapi, Retell. Voice AI is one of the most crowded AI infrastructure categories. ElevenLabs' margin per million characters generated, vs competitors, is the question that determines whether the IPO narrative survives 2026.

These are the questions Sacra deep-dives, The Information enterprise reporting, and S-1 disclosures will eventually answer. Public traces alone get this story to about 70%. The last 30% is locked behind paywalls and S-1 due diligence.

04 / 01 · 2023-01-23
Funding · Bundled milestone

$2M Pre-Seed and a Public Beta — The Bundled Launch That Pulled in 1M Users in Five Months (Jan 2023)

ElevenLabs spent nine months in stealth, then announced funding and shipped a public beta in the same week. The free tier and clip-ready output did the rest.

Original source ↗

January 23, 2023. ElevenLabs announces a $2 million pre-seed led by Credo Ventures with Concept Ventures, and on the same day opens its text-to-speech beta to the public. English and Polish voices, free tier, no waitlist.

By June 2023 — five months later — the platform crosses one million registered users.

The stealth bet that preceded the launch

Mati Staniszewski (ex-Palantir) and Piotr Dabkowski (ex-Google) incorporated ElevenLabs in April 2022. From April 2022 to January 2023, the company was effectively dark: no public site beyond a landing page, no demos, no press, no product.

Nine months is a long stealth window for a $2M pre-seed company. The founders used it to train a TTS model that was meaningfully better than what was publicly available — Microsoft Azure, Google Cloud TTS, and Amazon Polly were the alternatives at the time, and they sounded robotic by comparison.

The discipline of staying dark is the unsung part. Most pre-seed founders ship a leaky beta to a small group on day 60 because they want feedback. ElevenLabs waited until the model was better than the incumbents.

The bundle: funding + beta + free tier

Three things hit at once on January 23:

  • Funding announcement. Pre-seed $2M, lead and participants disclosed.
  • Public beta open. Anyone could sign up that day.
  • Generous free tier. 10,000 characters per month free, paid plans starting at $5/month.

The free tier was the load-bearing piece. A creator could generate 30-60 seconds of audio without paying — enough to make a TikTok, a YouTube intro, or a Twitter clip. The first audio they made was almost certainly shareable.
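
A rough back-of-envelope on how far the monthly quota stretched — assuming generic speech rates of ~150 spoken words per minute and ~6 characters per word, which are general estimates rather than ElevenLabs-published figures:

```python
# Back-of-envelope: monthly free-tier characters -> minutes of speech.
# 150 wpm and 6 chars/word are generic speech-rate estimates, not
# ElevenLabs specs.
FREE_CHARS_PER_MONTH = 10_000
CHARS_PER_WORD = 6
WORDS_PER_MINUTE = 150

def quota_minutes(chars: int) -> float:
    """Convert a character quota into approximate minutes of generated speech."""
    return chars / CHARS_PER_WORD / WORDS_PER_MINUTE

print(round(quota_minutes(FREE_CHARS_PER_MONTH), 1))  # ~11.1 minutes/month
```

At those rates a single 30-60 second clip burns only a few hundred characters, so one free month funded many shareable clips — exactly the behavior the growth numbers suggest.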

What competitors offered | What ElevenLabs offered
API-only, paid tier minimum ~$50/mo | Free 10,000 characters + $5 minimum
Robotic narrator voices | Emotional, human-like output
English-only or thin multilingual | English + Polish + path to 28 more

The pricing wasn't about undercutting competitors on cost. It was a deliberate choice to put the product in front of the long tail of creators — TikTokers, podcasters, indie game devs, YouTubers — who would amplify it.

What "1M users in 5 months" actually looked like

The growth wasn't paid. It wasn't a Product Hunt launch (ElevenLabs ranked but didn't dominate). The mechanics were:

  • Twitter / X audio clips. Users posted "listen to this" tweets with ElevenLabs-generated voices. The clips autoplayed inline. Each share carried the brand.
  • Hacker News submissions. Beta launch hit HN front page; technically curious devs signed up to try.
  • TikTok creator usage. Voice-over for narration content, especially Reddit-story TikToks, took off through February-April.
  • Reddit threads. r/MachineLearning, r/AIVoiceCloning, r/sidehustles all surfaced ElevenLabs as the new tool that worked.

The growth curve is the signature of a product that fits a frame perfectly — every casual user becomes a distribution node because the output is itself the share unit.

The 4chan incident, one week later

The same launch that pulled in users pulled in abusers. By January 30, 2023, 4chan users had cloned Emma Watson, Joe Rogan, and Ben Shapiro to generate offensive content.

ElevenLabs responded the next business day with paid-only voice cloning, an AI detection tool, and traceability per generation. The story is in the next deep-dive — but it's worth noting here that the misuse was the byproduct of the same generosity that drove the user growth.

The free tier got ElevenLabs to 1M users. Voice cloning behind ID verification kept the company from being shut down for it.

The compounding effect on Series A

By June 2023, ElevenLabs had ~1M users on the platform. That metric — verifiable, auditable — closed the Series A on terms ($19M at ~$100M post) that would have been impossible from a cold start.

Date | Round | Valuation | Trigger
Jan 23, 2023 | Pre-seed $2M | ~$12M | Beta launch
Jun 21, 2023 | Series A $19M | ~$100M | 1M users + voice products
Jan 22, 2024 | Series B $80M | $1.1B | Multilingual v2 + Dubbing Studio

a16z, Nat Friedman, and Daniel Gross co-led the Series A. Mike Krieger (Instagram), Brendan Iribe (Oculus), Mustafa Suleyman (DeepMind), and Tim O'Reilly came in as angels — the kind of investor list a five-month-old company doesn't normally attract.

The user-count milestone made the round possible. The bundled launch in January made the user count possible.

Sources

04 / 02 · 2023-01-30
Media · Forced trust posture

The 4chan Voice-Cloning Scandal That Nearly Killed the Launch — And the 24-Hour Response That Saved It (Jan 2023)

Seven days after public beta, 4chan users cloned celebrities to generate abuse. ElevenLabs shipped paid-only cloning, an AI detector, and traceability the next business day. The crisis became the trust posture.

Original source ↗

January 30, 2023 — seven days after the public beta opened. Vice reports that 4chan users have used ElevenLabs to clone Emma Watson reading from "Mein Kampf," Joe Rogan and Ben Shapiro making racist comments, and David Attenborough delivering threats.

The story hits Slashdot, Futurism, OECD AI's incident database, and within 48 hours has been picked up by every major tech outlet covering AI risk.

By the next business day, ElevenLabs has shipped concrete changes. The crisis becomes the operating template.

What 4chan actually did

The 4chan abuse used the free tier's voice-cloning feature. With a 60-second audio sample of a target's voice, the platform could generate new audio in that voice saying anything.

Within one week of the public beta:

Target | Content
Emma Watson | Reading "Mein Kampf"
Joe Rogan | Racist remarks about AOC
Ben Shapiro | Hateful content about minorities
David Attenborough | Violent threats
Hillary Clinton | Transphobic content

The 4chan thread turned into a manual on how to use the product for harassment. By the time Vice published, the screenshots were everywhere on Twitter.

For most pre-seed startups, this is a company-ending event. Investors pull. Press goes negative. The product gets associated with the abuse forever.

The 24-hour response

ElevenLabs responded the next business day — January 31. The company published a statement acknowledging "an increasing number of voice cloning misuse cases" and shipped a set of immediate changes plus a roadmap of follow-ups:

Shipped within ~24 hours:

  1. Voice cloning behind paid tier. No free voice cloning. Required payment information that creates an audit trail.
  2. Per-generation traceability. Each piece of generated audio could be traced back to the specific account that produced it.
  3. Manual verification path. Voice cloning of public figures required additional verification.

Shipped over the following months:

  4. AI Speech Classifier. A free public tool that takes any audio clip and tells you whether it was generated by ElevenLabs. Released publicly in June 2023 alongside the Series A — five months after the initial 4chan response, but core to the long-term trust posture.

The immediate response was concrete. Not "we are taking this seriously." Specific safeguards plus a transparent roadmap.

The speed mattered more than the substance. Within a week of the abuse going viral, the company had a public technical answer. Most AI vendors needed months to respond to similar incidents in 2024-2025. ElevenLabs set the bar in week two of its existence.

Why the response worked

The 4chan incident could have been catastrophic. It became a case study for three reasons.

1. The response was technical, not legal. The classifier was a real working tool, not a terms-of-service update. Reporters could test it; it worked. That's a different kind of credibility than a press release.

2. The traceability claim was verifiable. ElevenLabs could (and did) trace specific abusive content back to specific accounts and ban them. The audit trail wasn't theoretical.

3. The company didn't deny the upside. The CEO did not claim voice cloning would be safe. He acknowledged that misuse was inherent to the technology and that the platform needed continuous safeguards. That framing — "yes this is dangerous, here's how we manage it" — held up across the next three years of incidents.

The pattern that recurred 12 months later

The 4chan response became the template ElevenLabs ran every time misuse hit. The most consequential rerun was January 26, 2024 — Pindrop traced a fake Biden New Hampshire-primary robocall to ElevenLabs. The company suspended the account within 72 hours, gave a clear public statement, and the Biden episode became the most-cited "responsible AI vendor" example for the rest of the year.

Incident | Days to public response | Concrete action
4chan celebrities (Jan 2023) | 1 business day | Paid-only cloning + traceability (classifier shipped 5 months later, June 2023)
Biden robocall (Jan 2024) | 3 days | Account ban + public statement + classifier reference

Same pattern, same speed, twice — and the response template was already built when the second incident hit. By late 2024, when Deutsche Telekom and other enterprise procurement teams ran due diligence on voice AI vendors, ElevenLabs' incident-response track record was a positive rather than a negative.

The hidden GTM payoff

There's a counter-intuitive truth in this incident. ElevenLabs did not lose customers from the 4chan story. The user count grew from sub-100K in late January to 1M by June 2023.

What happened instead: the misuse coverage advertised the product's capability. "ElevenLabs can clone any voice from a 60-second sample" was simultaneously the abuse vector and the most compelling demo of what the technology could do. Users who needed legitimate voice cloning — audiobook narrators, accessibility tools, voice-over artists — saw the same headlines.

The forced-trust posture meant ElevenLabs could absorb that attention without becoming "the deepfake company." Cartesia, Resemble, and PlayHT got similar capability headlines through 2023-2024 but without the same operational track record. The incident-response gap turned into a trust gap.

Sources

04 / 03 · 2024-01-22
Funding · Bundled milestone

Series B $80M at $1.1B — The 21-Month Unicorn That Bundled Three Product Launches Into One Press Window (Jan 2024)

ElevenLabs' Series B announcement carried Voice Marketplace, Dubbing Studio, and Mobile SDK in the same press release. Same announcement budget, four times the coverage.

Original source ↗

January 22, 2024. ElevenLabs announces an $80 million Series B led by Andreessen Horowitz, with Sequoia Capital, Nat Friedman, and Daniel Gross. Valuation: $1.1 billion. Twenty-one months from incorporation.

It's the fastest European AI company to reach unicorn status at that point — and the announcement isn't about the round.

The product bundle that ran with the round

The Series B press release named four launches in the same window:

Product | What it was
Voice Marketplace | Creator-uploaded voices, royalty-share model
Dubbing Studio | Pro video translation with editor controls
Mobile SDK | iOS / Android voice integration for app developers
AI Speech Classifier (re-emphasized) | Public tool for detecting AI-generated audio (originally shipped June 2023 with Series A)

Each one would have been a standalone press story. Bundled together with $80M and a $1.1B valuation, they generated a coverage cascade across four distinct press categories:

  • TechCrunch / Bloomberg / Forbes / Fortune — the funding round
  • Slator / VentureBeat / The Verge — Dubbing Studio
  • 9to5Mac / Android Central — Mobile SDK
  • AI / ML trade press — AI Speech Classifier and creator marketplace

A single Series B announcement covered four news beats. Same announcement spend, ~4× the surface area.

Why the bundle worked specifically here

The bundle wasn't arbitrary. Each launch was strategically tied to the funding narrative.

Voice Marketplace = "ElevenLabs is becoming a platform, not a TTS API." That re-framing supported the unicorn valuation. A TTS API doesn't justify $1.1B; a marketplace with creator network effects does.

Dubbing Studio = "ElevenLabs is going after media-industry budgets." Slator and VentureBeat are read by people who buy localization at scale — Netflix, Audible, Warner Bros. Discovery. Localization dollars are an order of magnitude bigger than indie creator subscriptions.

Mobile SDK = "ElevenLabs is becoming infrastructure." App developers integrating voice features means recurring API revenue, not one-off creator subs.

AI Speech Classifier (re-emphasized) = "ElevenLabs is the responsible AI vendor." This is the trust-posture work — January 22, 2024 was four days before the Biden robocall story broke. The classifier becoming part of the Series B framing helped the company absorb the Biden incident without losing the narrative.

The investor list told the strategic story

Series A was Nat Friedman, Daniel Gross, and a16z — the AI-native angel pattern.

Series B added Sequoia (capital-press signal) and kept the same team (continuity signal). The cap table going into 2024 looked like this:

Round | Lead | Notable participants
Pre-seed (Jan 2023) | Credo Ventures | Concept Ventures
Series A (Jun 2023) | Nat Friedman / Daniel Gross / a16z | Mike Krieger, Brendan Iribe, Mustafa Suleyman, Tim O'Reilly
Series B (Jan 2024) | a16z | Sequoia, Nat Friedman, Daniel Gross

Note: a16z double-led. That's a meaningful signal — the same fund leading consecutive rounds means internal conviction is high enough to defend the markup at the partnership.

The 21-month milestone

Milestone | Months since incorporation
Public beta | 9
1M users | 14
Series A | 14
Out of beta + Multilingual v2 | 16
Series B + Unicorn | 21

For comparison, the median path to unicorn status for AI infrastructure companies in 2022-2024 was roughly 36-48 months. ElevenLabs hit it in 21.

The compression is the result of the bundle pattern. Each round cleared the next product launch's runway; each product launch supported the next round's valuation. The two-step ratchet repeated four times — pre-seed, A, B, C — and never broke.

The Biden incident, four days later

The Series B press cycle was still running when the Biden robocall story broke on January 26, 2024. The proximity was coincidence — but the response template was already in place from the 4chan incident a year earlier.

ElevenLabs banned the account within 72 hours, issued a public statement, and the AI Speech Classifier (already part of the Series B narrative) was the technical answer to "how do you prevent this?"

The Series B unicorn announcement and the Biden robocall response landed in the same week. The trust posture, embedded in the funding announcement, did the GTM work for both.

Sources

04 / 04 · 2024-01-26
Media · Forced trust posture

The Biden Robocall Deepfake — How a 72-Hour Account Ban Became Enterprise Sales Collateral (Jan 2024)

Pindrop traced a fake Biden New Hampshire-primary robocall to ElevenLabs. The company suspended the account in 72 hours. By year-end the response was being cited in enterprise procurement decisions.

Original source ↗

January 26, 2024. Pindrop Security publishes its analysis of an AI-generated robocall sent to thousands of New Hampshire Democratic primary voters days earlier. The robocall used a synthetic Joe Biden voice telling people not to vote.

Pindrop's forensic analysis traces the audio to ElevenLabs.

The company suspends the account by the end of the same week. Bloomberg, the Financial Times, the Wall Street Journal, NBC, CNN, Reuters, and the Associated Press all cover the response.

The chronology

The incident moved fast across regulators, press, and ElevenLabs:

Date | Event
Jan 21-22 | Robocall reaches NH voters before primary
Jan 23 | NH Attorney General opens criminal investigation
Jan 25 | Pindrop completes forensic analysis, identifies ElevenLabs
Jan 26 | Bloomberg breaks ElevenLabs link; account suspended within 72 hours
Jan 27 | FCC announces process to ban AI-generated robocalls
Feb 8 | FCC formally outlaws AI voice in robocalls (citing this incident)
Feb 23 | Account creator publicly identified (linked to Steve Kramer / Lingo Telecom)

The FCC ban on AI-generated robocalls — passed February 8 — explicitly cited the Biden incident as the trigger. ElevenLabs' technology was the named example in regulatory rule-making.

What the 72-hour response actually contained

Three concrete actions in the first week:

1. Account suspension. The user who generated the audio was banned. ElevenLabs' per-generation traceability (already in place since the 4chan response in January 2023) made identification straightforward.

2. Public statement. "We are dedicated to preventing the misuse of audio AI tools and take any incidents of misuse extremely seriously." Plain language, no hedging on the technical link.

3. AI Speech Classifier reference. The free public tool — first launched in June 2023 alongside the Series A — was re-surfaced as the technical answer to "how do we know if audio is from ElevenLabs?" Pindrop itself had used a similar method.

What the response did not include: denial, deflection, or claims that the platform was being unfairly targeted. The framing was direct acknowledgment plus operational evidence.

Why this set the bar for the industry

Voice AI misuse incidents in 2024 affected most major vendors. The response patterns diverged.

Vendor | Major 2024 incident | Public response
ElevenLabs | Biden robocall (Jan) | 72-hour ban, public statement, classifier reference
Cartesia | Limited public incidents | N/A in 2024
PlayHT | Limited public incidents | N/A in 2024
Microsoft (VALL-E) | Restricted release | Kept models private to avoid this risk

ElevenLabs was the only vendor that a) had its product publicly linked to a high-profile election interference incident and b) absorbed the link without losing operational credibility.

The contrast with Microsoft's VALL-E response is the most instructive. Microsoft kept VALL-E private specifically because it didn't want to be in this position. ElevenLabs took the public position and built the operating muscle. The market rewarded the muscle by 2025.

The enterprise sales effect

The response template paid off in enterprise procurement through 2024-2025.

By the time Deutsche Telekom, NTT DOCOMO Ventures, RingCentral Ventures, HubSpot Ventures, and LG Technology Ventures came in on the Series C in January 2025 (with Salesforce Ventures returning), voice-AI vendor due diligence routinely included questions about misuse handling. ElevenLabs could point to a 12-month operational track record:

  • 4chan incident (Jan 2023) → paid-only cloning + traceability
  • Biden robocall (Jan 2024) → 72-hour ban + FCC engagement
  • 2024 election cycle → no further high-profile incidents involving ElevenLabs

That track record was the differentiator vs Cartesia and PlayHT in enterprise sales motions. Multiple Sacra and Contrary Research notes cite "operational discipline on misuse" as part of why ElevenLabs won contact-center and telco RFPs.

The "trust posture as GTM" pattern

The pattern is rare and worth naming explicitly:

  1. High-stakes misuse incident → forced public attention
  2. Speed-of-response is a verifiable signal → 72 hours becomes a benchmark
  3. Concrete technical safeguards already shipped → the response is operational, not promotional
  4. Pattern repeats credibly across multiple incidents → enterprise procurement starts to count this as risk mitigation
  5. Telco / enterprise / regulated-industry contracts close → trust becomes revenue

ElevenLabs ran this loop three times in 24 months. Each iteration compounded the next. By Series C, the trust posture was generating revenue, not just defusing crises.

Most vendors treat misuse as a PR problem. ElevenLabs treated misuse as a continuous operational test of whether the company can hold enterprise trust. The press did the marketing for free.


04 / 05 · 2024-07-03
Media · Audience boundary push

Iconic Voices — How Licensing Garland, Dean, Reynolds, and Olivier Pulled ElevenLabs Into CNN, CBS, and Variety (Jul 2024)

Estate-licensed AI voice clones of four Hollywood legends turned a Reader-app feature into a mainstream-press story. The deal mechanics quietly redefined the public conversation about voice AI ethics.


July 3, 2024. ElevenLabs announces "Iconic Voices" — AI voice clones of Judy Garland, James Dean, Burt Reynolds, and Sir Laurence Olivier — licensed through CMG Worldwide and integrated into the ElevenReader app launched a week earlier.

Within 48 hours, the story is in CNN Business, CBS News, Variety, Tubefilter, Designboom, Tom's Guide, and Entrepreneur Magazine.

The deal structure that made it possible

The licensing wasn't ad hoc. ElevenLabs and CMG Worldwide — the Beverly Hills IP firm that represents the Garland, Dean, Reynolds, and Olivier estates — built a constrained-use framework:

Term | Detail
Use cases | Reader app only — books, articles, PDFs
Voice scope | Voices not added to broader ElevenLabs audio database
Estate consent | Per-voice estate sign-off on permitted use
New work generation | Not permitted — voices restricted to reading existing text
Family endorsement | Liza Minnelli (Garland's daughter) issued public statement

The constraints were the news angle. "AI clones dead celebrities" is a horror-show headline. "Estate-licensed AI voice clones with family endorsement, restricted to audiobooks" is a respectful-tribute headline. The press took the second framing because the framework forced it.

Why the press picked it up

Mainstream press almost never covers voice AI infrastructure launches. Conv-AI v1 in November 2024 — arguably a more important product moment — got AI trade press only.

Iconic Voices got mainstream press for three structural reasons:

1. Recognizable cultural artifacts. Judy Garland reading "The Wonderful Wizard of Oz" is a story that doesn't require explanation. Voice AI infrastructure requires an explanatory headline. The cultural artifact carried the news beat.

2. Pre-resolved ethics question. Estate licensing + family endorsement closed off the ethical objection before reporters could write it. CBS, CNN, and Variety could publish without needing a "but is this OK?" rebuttal section.

3. Visual + audio share unit. Print outlets ran video clips of "Garland reading" content. The clips played native on Instagram Reels, TikTok, and YouTube Shorts when the same outlets posted social cuts. The story propagated for free across platforms.

The Iconic Voices launch is the rare voice AI story that landed in CBS Sunday Morning territory, not just TechCrunch. The deal structure was the reason.

What it did for the brand

Through mid-2024, ElevenLabs' brand was:

  • Within AI / dev community: best-in-class TTS, generous free tier, ongoing misuse stories
  • Outside the AI community: the company that made the Biden deepfake possible

Iconic Voices flipped the second framing. The same week that ElevenLabs was getting Variety coverage for Garland and Dean, the company was visibly moving past the deepfake association in mainstream press.

Press cycle | Dominant frame
Jan-Feb 2024 | Biden robocall, AI election interference
Mar-May 2024 | Mayor Adams clone, ongoing AI ethics stories
Jun-Aug 2024 | ElevenReader, Iconic Voices, audiobook future
Sep-Nov 2024 | Conversational AI v1, platform pivot

The brand work mattered for what came next. The Series C in January 2025 brought Deutsche Telekom, NTT DOCOMO Ventures, HubSpot Ventures, and Salesforce Ventures — strategic investors whose internal champions would have struggled to push for an investment in "the deepfake company." After Iconic Voices, ElevenLabs was a story those champions could tell internally.

The Reader app's role

ElevenReader (launched June 25, 2024) is the consumer surface that made Iconic Voices coherent. Without a place for users to actually listen to Garland reading, the story would have been "ElevenLabs licenses celebrity voices" — a vendor announcement, not a product.

The bundle:

  • Jun 25, 2024: ElevenReader iOS launches (Android shortly after). Free app, read any text aloud with natural AI narration.
  • Jul 3, 2024: Iconic Voices Collection launches inside Reader. Garland, Dean, Reynolds, Olivier as premium tier.
  • Subsequent months: More licensed voices added; Reader becomes the consumer entry point to ElevenLabs.

The eight-day gap between Reader launch and Iconic Voices launch is deliberate. Reader establishes the frame ("an audio app for books and articles"). Iconic Voices makes the frame newsworthy. Together they do what neither would have done alone.

What competitors couldn't replicate

By July 2024, every voice AI vendor could clone a celebrity voice given a sample. The capability was commoditized.

What ElevenLabs had that competitors didn't: the estate relationship infrastructure. CMG Worldwide doesn't sign with vendors who haven't built operational trust on misuse. The 4chan response (Jan 2023), the Biden response (Jan 2024), and the public AI Speech Classifier were the reasons the deal was possible.

Cartesia and PlayHT could match the technical clone quality. Neither could close a CMG Worldwide licensing deal in 2024. The trust posture became the moat.


04 / 06 · 2024-11-18
Product · Tech narrative upgrade

Conversational AI v1 — The 11-Week Platform Pivot That Reset the Entire Sales Motion (Nov 2024)

ElevenLabs went from TTS API to integrated voice-agent platform on November 18, 2024. Eleven weeks later, telco and CRM strategics led the Series C. The pivot up-the-stack happened faster than competitors could react.


November 18, 2024. ElevenLabs ships Conversational AI v1 — a platform layer that combines TTS, speech-to-text, and LLM orchestration into a single agent stack. Developers can now build full conversational agents inside the ElevenLabs developer console.

Eleven weeks later, the Series C closes at $3.3B with strategic investors from telco, CRM, and contact-center categories.

What the launch actually shipped

Conversational AI v1 was a platform-tier product, not a feature. Four components in one console:

  • Voice (TTS): ElevenLabs' existing Eleven Multilingual v2 model
  • Speech-to-text: Native ASR for handling user input
  • LLM orchestration: Connects to OpenAI, Anthropic, or self-hosted LLMs
  • Knowledge base: Files / URLs / text blocks as agent context

The configuration surface was extensive — voice, latency, stability, conversation length, authentication. SDK support for Python, JavaScript, React, and Swift, plus a WebSocket API.

In other words: everything a developer needed to build a working voice agent in one place. No need to wire together 5 vendors.
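The four-component stack can be sketched as a single agent definition. The field names below are illustrative assumptions, not the actual ElevenLabs console or API schema:

```python
# Hypothetical sketch of the four Conv-AI v1 layers as one agent config.
# All field names here are illustrative, NOT the real ElevenLabs schema.

def build_agent_config(voice_id, llm_provider, knowledge_sources):
    """Assemble a voice-agent definition from the four platform layers."""
    return {
        "tts": {                              # voice output layer
            "model": "eleven_multilingual_v2",
            "voice_id": voice_id,
        },
        "asr": {"enabled": True},             # native speech-to-text for user input
        "llm": {"provider": llm_provider},    # OpenAI, Anthropic, or self-hosted
        "knowledge_base": knowledge_sources,  # files / URLs / text blocks as context
    }

agent = build_agent_config("narrator-01", "anthropic", ["faq.pdf"])
print(agent["tts"]["model"])  # eleven_multilingual_v2
```

One definition like this replaces the TTS + ASR + orchestration wiring that previously spanned several vendors.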

Why the timing mattered

The voice-agent category was forming in late 2024. The competitive landscape:

Vendor | Position in Nov 2024 | Stack
ElevenLabs | Best TTS, now full platform | Vertically integrated
Vapi | Voice agent platform, no own TTS | Stack of best-in-class APIs
Retell | Voice agent platform, no own TTS | Stack of best-in-class APIs
Cartesia | Best TTS competitor, no agent layer | TTS only
PlayHT | TTS, building agent features | TTS + thin agent layer
Deepgram | STT leader, building TTS | STT + TTS, no agent

ElevenLabs was the only vendor with both a top-tier proprietary TTS model and a full agent stack. Vapi and Retell were stitching ElevenLabs' TTS into their stacks — making the platform pivot a direct competitive threat to them.

The Conv-AI v1 launch effectively folded the cost of Vapi and Retell into ElevenLabs' own platform. A developer who had been paying for ElevenLabs TTS + Vapi orchestration could now collapse the bill.

The eleven-week sequence

The launch was step one of a tightly-timed run:

Date | Event
Nov 18, 2024 | Conv-AI v1 launches
Late Nov 2024 | $90M ARR disclosed (Sacra / The Information)
Dec 2024 | $120M ARR (year-end)
Jan 2025 | Series C term-sheet activity (signaled by funding press)
Jan 30, 2025 | Series C $180M @ $3.3B
Feb 22-23, 2025 | a16z + ElevenLabs worldwide hackathon (voice agents theme)

Eleven weeks from product launch to closed funding round. That speed isn't possible without:

  1. Pre-existing investor relationships (a16z + ICONIQ already engaged)
  2. Verifiable revenue inflection ($25M → $90M → $120M ARR through 2024)
  3. A product launch that re-classified the category

The Series C new strategic investor list — Deutsche Telekom, NTT DOCOMO Ventures, RingCentral Ventures, HubSpot Ventures, and LG Technology Ventures (with Salesforce Ventures returning) — all came in because Conv-AI v1 had repositioned ElevenLabs from TTS to telco-and-CRM infrastructure.

What the strategic investors actually bought

Each of the five telco / CRM / hardware strategics had a specific reason to invest:

Investor | Strategic angle
Deutsche Telekom | European telco voice agents for B2B services
NTT DOCOMO Ventures | Japan-market voice agents for contact centers
RingCentral Ventures | UCaaS / contact-center voice integration
HubSpot Ventures | SMB CRM voice agent layer
LG Technology Ventures | Voice for consumer-electronics surfaces

Five strategic investors, five distinct enterprise integration paths. None of them would have invested in a TTS API. All of them had a clear thesis on a voice-agent platform.

The platform pivot was the precondition for the strategic check pile. ElevenLabs went from "interesting AI startup" to "potential infrastructure partner" in 11 weeks.

The Conv-AI 2.0 follow-up

The Conv-AI v1 launch was step one. Conversational AI 2.0 (June 3, 2025) was the credibility extension:

  • Native turn-taking model (handles hesitations, interruptions, filler words)
  • Integrated language detection (no manual config)
  • Multi-character mode (single agent, multiple personas)
  • Batch outbound calling (parallel call initiation)

The 2.0 launch was deliberately seven months after v1 — a release cadence that signaled "this is a category we own." Vapi and Retell were still on their first or second platform iteration. ElevenLabs had two generations shipped.

The cadence is the GTM. Not the individual features.

What the pivot did to the ARR curve

The platform-tier shift compounded the revenue curve in a way that pure TTS scale wouldn't have:

Date | ARR | Driver
Q1 2024 | $25M | TTS API + Dubbing Studio
Q4 2024 | $120M | TTS scale + early Conv-AI adoption
Aug 2025 | $200M | Conv-AI 2.0 + enterprise contracts
Dec 2025 | $330M+ | Voice agents at scale, enterprise approaching 50% of revenue

The Q4 2024 → Aug 2025 jump (from $120M to $200M in eight months) is where Conv-AI revenue starts becoming visible in the topline. The Aug 2025 → Dec 2025 jump (from $200M to $330M in five months) is where enterprise deals begin closing at scale.

Without Conv-AI v1 in November 2024, the curve plateaus around $200M ARR — a respectable TTS company. With Conv-AI v1, the curve continues compressing into the Series D and IPO trajectory.


04 / 07 · 2025-01-30
Funding · Bundled milestone

Series C $180M at $3.3B — The Telco / CRM Strategic Stack That Reframed ElevenLabs as Voice Infrastructure (Jan 2025)

a16z and ICONIQ co-led, but the headline was the new strategic-investor list: Deutsche Telekom, NTT DOCOMO, RingCentral, HubSpot, LG Technology Ventures (Salesforce was a returning investor). The Series C wasn't capital — it was distribution embedded in a cap table.


January 30, 2025. ElevenLabs announces a $180 million Series C co-led by Andreessen Horowitz and ICONIQ Growth, valuing the company at $3.3 billion. Three times the Series B valuation from twelve months earlier.

NEA, Sequoia, World Innovation Lab, Valor, Endeavor Catalyst, and Lunate are also in. ICONIQ partner Seth Pierrepont joins the board.

The financial-press story is the valuation jump. The strategic story is the strategic investor list.

The strategic investor pattern

The Series C brought five new strategic investors with overlapping but distinct angles (Salesforce Ventures, also in the round, was a returning investor from earlier rounds):

Investor | What they buy from ElevenLabs | Strategic vector
Deutsche Telekom | Voice agents for European B2B | Telco / SMB
NTT DOCOMO Ventures | Voice agents for Japan contact centers | Asia-Pacific telco
RingCentral Ventures | UCaaS voice integration | UCaaS / contact center
HubSpot Ventures | SMB CRM voice agent layer | SMB CRM
LG Technology Ventures | Voice for consumer-electronics surfaces | Consumer hardware

Five new strategic investors, five distinct enterprise channels. Each one had an internal deployment thesis before writing the check.

This is a different kind of round than Series A or B. Pre-seed and Series A bring capital. Series B brings capital + brand (a16z + Sequoia). Series C with this strategic stack brings capital + brand + distribution access through five enterprise vectors that ElevenLabs would otherwise have spent years building bottom-up.

The 11-week timeline that made it possible

The Series C closed eleven weeks after Conversational AI v1 shipped in November 2024. That sequence was the load-bearing maneuver:

Date | Event
Nov 18, 2024 | Conversational AI v1 launches
Late Nov | $90M ARR disclosed
Dec 31, 2024 | $120M ARR (year-end)
Jan 30, 2025 | Series C closes
Feb 22-23 | a16z + ElevenLabs worldwide hackathon (voice agents)

The platform pivot in November positioned ElevenLabs as voice infrastructure, not a TTS API. The strategics could only invest in the platform-tier story. They could not have invested in a TTS-API story — telco and CRM procurement teams don't deploy TTS APIs as standalone products.

The Conv-AI v1 launch did the strategic-investor recruiting. The Series C announcement was the close.

What the round bundled

True to the ElevenLabs cadence, the Series C wasn't a standalone announcement. It bundled:

  • Round close ($180M @ $3.3B)
  • Board addition (ICONIQ partner Seth Pierrepont)
  • Strategic distribution launch (Deutsche Telekom partnership signaling)
  • Enterprise customer reveals (disclosed in subsequent press) — the ARR disclosure pulling forward customer naming
  • a16z hackathon series (announced two weeks later, held February 22-23)

Five news beats inside a 14-day window. Same announcement budget, multiplicative coverage.

The valuation math

The trajectory across 24 months:

Round | Date | Round size | Valuation | Multiple on prior
Pre-seed | Jan 2023 | $2M | ~$12M | -
Series A | Jun 2023 | $19M | ~$100M | 8.3×
Series B | Jan 2024 | $80M | $1.1B | 11×
Series C | Jan 2025 | $180M | $3.3B | 3×
Tender | Sep 2025 | $100M (secondary) | $6.6B | 2×
Series D | Feb 2026 | $500M | $11B | 1.7×

The Series C 3× markup is smaller than Series A's 8× or Series B's 11×, but the round size is much larger. The valuation expansion shifts from "narrative repricing" to "revenue-supported."
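The markup arithmetic reduces to simple division over the disclosed valuations; a quick check of the round-over-round figures (valuations in $M):

```python
# Recompute the "multiple on prior" column from the round valuations ($M).
rounds = [
    ("Pre-seed", 12), ("Series A", 100), ("Series B", 1100),
    ("Series C", 3300), ("Tender", 6600), ("Series D", 11000),
]
markups = {
    name: round(val / prev_val, 1)
    for (_, prev_val), (name, val) in zip(rounds, rounds[1:])
}
print(markups)
# {'Series A': 8.3, 'Series B': 11.0, 'Series C': 3.0, 'Tender': 2.0, 'Series D': 1.7}
```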

At $3.3B / $120M ARR, the multiple is ~27×. By Series D at $11B / $330M ARR, it sits at ~33×. The multiple stayed roughly stable from Series C through Series D — meaning the valuation growth was earned by ARR growth, not by re-rating.

Why the Series C was the inflection round

Every prior round had been a step up. The Series C was a category change.

Before the Series C, ElevenLabs was an AI voice company. After the Series C — with Deutsche Telekom, NTT DOCOMO, HubSpot, RingCentral, and LG Technology Ventures all newly on the cap table (alongside returning Salesforce Ventures) — ElevenLabs was a voice infrastructure company. The label change opened up enterprise contracts that were structurally unavailable to AI-vendor-positioned competitors.

Cartesia and PlayHT could match ElevenLabs on TTS quality through 2025. Neither could match the strategic investor stack. Vapi and Retell could match the agent platform; neither had Deutsche Telekom on speed-dial.

The Series C wasn't the moment ElevenLabs was best. It was the moment ElevenLabs became unmatchable on a specific competitive vector — strategic distribution access.

The downstream effect on Series D

The Series D in February 2026 (Sequoia-led, $500M @ $11B) was the validation round for the Series C strategy. The strategic investors from January 2025 had become the largest enterprise customers by late 2025 — Deutsche Telekom and Revolut named publicly in the Series D coverage.

The flywheel:

  1. Conv-AI v1 launches (Nov 2024)
  2. Strategic investors join Series C (Jan 2025)
  3. Strategic investors deploy ElevenLabs internally (2025)
  4. Deployments become enterprise contracts (mid-late 2025)
  5. Enterprise contracts generate $330M+ ARR by year-end 2025
  6. ARR closes Series D at $11B (Feb 2026)

The Series C strategics weren't a marketing flourish. They were the GTM motion for the next 12 months.


04 / 08 · 2025-06-05
Product · Tech narrative upgrade

Eleven v3 — How an Audio-Tag Syntax Made Voice AI Feel Like a New Category (Jun 2025)

70+ languages, multi-speaker dialogue, and inline tags like [excited] and [whispers]. The v3 alpha turned voice synthesis into a stage-direction language — and the demo clips traveled native on every social platform.


June 5, 2025. ElevenLabs releases Eleven v3 in public alpha. 70+ languages, multi-speaker dialogue, and a new audio-tag syntax that lets developers control emotion, tone, and delivery inline with the text.

The launch tweet from @elevenlabsio gets re-shared by Mati Staniszewski, Andrej Karpathy, and the AI-builder Twitter circle. Within 48 hours, audio-tag demos are circulating across X, TikTok, Instagram Reels, and YouTube Shorts.

The syntax that changed the demo grammar

Audio tags are words wrapped in brackets that the v3 model interprets as performance cues, not text:

"That was incredible! [excited] I never thought we'd actually pull it off. 
[whispers] But we have to be careful — they might still be watching. 
[laughing nervously] What do we do now?"

The output is a single audio clip with three distinct emotional registers — excitement, whispered tension, nervous laughter — controlled by markup, not by separate generation calls.

This is a shift in how voice AI is used. Previously, getting emotion variation required either:

  1. Multiple generation calls with different prompts, then stitching
  2. Voice direction in the source text ("she said excitedly"), with limited model interpretation
  3. A separate fine-tuned model per emotion

v3 collapses all three into inline syntax. The cognitive model is now closer to writing a screenplay than calling an API.
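The bracket grammar itself is simple enough to sketch. The toy pre-parser below is for illustration only, under the assumption of a flat `[tag]` syntax; the real v3 model interprets the tags itself rather than exposing any parsing step:

```python
import re

def split_audio_tags(script):
    """Split bracket-tagged text into (cue, segment) pairs.
    Toy illustration of the audio-tag grammar, not the v3 internals."""
    parts = re.split(r"\[([^\]]+)\]", script)
    segments = []
    if parts[0].strip():
        segments.append((None, parts[0].strip()))  # text before the first cue
    for tag, text in zip(parts[1::2], parts[2::2]):
        segments.append((tag, text.strip()))
    return segments

demo = ("That was incredible! [excited] I never thought we'd pull it off. "
        "[whispers] But we have to be careful.")
for cue, text in split_audio_tags(demo):
    print(cue, "->", text)
```

Each pair is one emotional register inside a single generation, which is what makes the screenplay analogy apt.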

Why the demo grammar mattered for distribution

Most TTS upgrades launch with side-by-side audio comparisons. They demo poorly because:

  • The improvement is incremental and hard to hear on phone speakers
  • Side-by-side requires the user to listen to both clips
  • The shareability is low — one clip is enough; two clips is friction

v3 launched with single-clip demos that contained the variation inside one audio file. A 15-second clip would shift between excitement, whispering, and laughter — within the same generation. The "wow" moment was self-contained.

Demo format | Share-friction | "New category" feel
Side-by-side audio | High | Low
Single clip with multiple voices | Medium | Medium
Single clip with audio-tag-driven emotion shifts | Low | High

The format mattered as much as the model quality. v3 demos went viral because they fit social-platform attention spans.

The cadence: v1 to v3 in 28 months

The model release cadence shows the deliberate pacing:

Model | Release | Months between
Beta TTS (English / Polish) | Jan 2023 | -
Eleven Multilingual v1 | May 2023 | 4
Eleven Multilingual v2 | Aug 2023 | 3
Eleven Turbo v2 | Apr 2024 | 8
Eleven Turbo v2.5 | Aug 2024 | 4
Eleven v3 (alpha) | Jun 2025 | 10

The 10-month gap between Turbo v2.5 and v3 is the longest pause in the history of the company. v3 was a generational shift, not an iteration — and the launch positioning matched: "the most expressive Text to Speech model ever."

The pause was strategic. Conversational AI v1 (Nov 2024) and Conv-AI 2.0 (Jun 2025) needed to be the priority through that period, because the platform pivot was the load-bearing GTM move. v3 launching alongside Conv-AI 2.0 (two days apart, June 3 and June 5) bundled the model release with the platform release in the same announcement window.

How the launch propagated

The v3 launch followed a clear distribution pattern:

Day 1 (Jun 5)

  • Launch tweet from @elevenlabsio with audio-tag demo clips
  • Mati Staniszewski's personal X account amplifies
  • AI-builder Twitter (Andrej Karpathy and adjacent accounts) re-shares

Days 2-3

  • Hacker News front page (Eleven v3 alpha thread, hundreds of comments)
  • Product Hunt launch (top product of the day)
  • VentureBeat / TechCrunch coverage of audio-tag syntax

Days 4-7

  • Creator demos start appearing on TikTok and Instagram Reels
  • Audio-tag syntax tutorials on YouTube
  • Reddit r/ElevenLabs, r/MachineLearning threads

Weeks 2-4

  • Integration into creator workflows (audiobook narrators, indie game devs)
  • Third-party tools and SDKs adopting the syntax
  • Use-case content (best audio tags for X, Y, Z)

The propagation worked because each platform got a different format of the same demo. X got 30-second audio threads. TikTok got 15-second creator clips. YouTube got 5-minute "how to use audio tags" tutorials. Same launch, four formats, four audiences.

The pricing trick that drove adoption

ElevenLabs offered v3 alpha at 80% off credit pricing through June 30, 2025. That's not a discount — that's a deliberate adoption forcing function.

A creator on the free or starter tier who wanted to try v3 could generate 4-5× more audio than usual. Heavy users who would otherwise have hit credit caps on v3's higher-quality output were given headroom. By the end of June, v3 was the default model in most user workflows because it had been the cheapest model.

When the discount ended on July 1, the switching cost back to older models was the cognitive cost of un-learning the audio-tag syntax. Most users stayed on v3 even at full price.

The 80%-off alpha was the cheapest cohort-acquisition campaign ElevenLabs ever ran. By the time pricing normalized, the audio-tag syntax was the user expectation.
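In plain numbers: only the 80%-off figure comes from the launch; the credit prices and budget below are made-up units for illustration:

```python
# Illustrative credit math for the v3 alpha discount. Only the 80%-off
# figure is from the launch; the prices below are hypothetical units.
full_cost = 100    # credits per hour of audio at normal pricing (assumed)
alpha_cost = 20    # the same hour at 80% off through June 30, 2025
budget = 1000      # a hypothetical starter-tier credit allowance

print(budget // full_cost, "hours at full price")   # 10 hours
print(budget // alpha_cost, "hours on the alpha")   # 50 hours: 5x the volume
```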

What v3 did to the competitive landscape

Cartesia, PlayHT, Resemble, and others released TTS upgrades through 2025. None matched v3's audio-tag syntax. The closest equivalent was OpenAI's voice mode for ChatGPT, which had emotional range but no developer-facing markup.

By Q4 2025, "audio tags" had become a category requirement. Vendors evaluating their roadmaps had to choose: ship a v3-equivalent syntax, or accept that ElevenLabs would be the default for emotionally rich voice work.

The Series D narrative in February 2026 leaned heavily on v3 as proof that ElevenLabs was the model leader, not just the platform leader. Sequoia's lead position in the round (its first lead in any ElevenLabs round) signaled that the model story had become investable in its own right.


04 / 09 · 2026-02-04
Funding · Bundled milestone

Series D $500M at $11B — Sequoia Leads the IPO-Track Round (Feb 2026)

Sequoia takes the lead from a16z and ICONIQ. Mati Staniszewski tells the press the company is 'building toward an IPO.' Prices at 1.7× the September secondary valuation, five months later.


February 4, 2026. ElevenLabs announces a $500 million Series D led by Sequoia Capital, with participation from Andreessen Horowitz, ICONIQ, Lightspeed Venture Partners, Bond, and Evantic Capital. Valuation: $11 billion.

The round prices at 1.7× the $6.6B secondary valuation from the September 2025 employee tender five months earlier, and 3.3× the Series C valuation twelve months before that. ARR closed 2025 at $330M+, disclosed three weeks before the round.

CEO Mati Staniszewski tells TechCrunch and CNBC the company is "building toward an IPO."

What changed in the lead investor

Across six rounds, the lead-investor pattern shows the company's narrative arc:

Round | Lead | Implied frame
Pre-seed (Jan 2023) | Credo Ventures | European seed-stage AI
Series A (Jun 2023) | a16z + Nat Friedman + Daniel Gross | AI-native angel + a16z
Series B (Jan 2024) | a16z | Unicorn growth
Series C (Jan 2025) | a16z + ICONIQ | Platform + strategic distribution
Tender (Sep 2025) | Sequoia + ICONIQ (co-led) | Bridge to growth-stage
Series D (Feb 2026) | Sequoia | IPO trajectory

Sequoia taking the lead — for the first time in the company's history — is the signal. Sequoia's late-stage practice (the Sequoia Capital growth fund) leads rounds for companies on credible paths to public markets. The fund's portfolio includes Stripe, Klarna, Snowflake (pre-IPO), and Datadog (pre-IPO).

The lead change is the most direct public signal that ElevenLabs is in the IPO-prep cohort.

The bundled disclosures

True to the cadence, the Series D didn't fire alone. The press window included:

Disclosure | Detail
$500M round | Largest in ElevenLabs history
$11B valuation | 3.3× the Series C valuation 12 months earlier
$330M+ ARR | Year-end 2025 (disclosed Jan 13, 2026)
Nvidia investment | Re-emphasized — first announced Sept 2025; the Series D framing positioned it as an infrastructure-tier endorsement
Enterprise customers | Deutsche Telekom, Revolut named publicly
IPO commentary | "Building toward an IPO" — first explicit IPO frame

Six news beats. One press window. Same announcement budget, multiplicative coverage — the same playbook ElevenLabs has run on every funding round since pre-seed.

Why the Nvidia investment carried so much weight

Nvidia's strategic investment in ElevenLabs was first announced in September 2025 — Tech.eu, Music Business Worldwide, and others covered it at the time, with Jensen Huang publicly endorsing the company. The Series D press cycle in February 2026 re-emphasized it as the IPO-track narrative crystallized. Three things this signals:

1. Strategic-customer / strategic-investor convergence. Nvidia uses ElevenLabs internally for audio generation. The strategic check confirmed an existing customer relationship — the September 2025 announcement was both the deal disclosure AND a customer reveal in one news cycle.

2. AI infrastructure validation. Nvidia's portfolio of strategic investments includes CoreWeave, Lambda Labs, Hugging Face, Inflection (pre-Microsoft), and Cohere. Joining that list places ElevenLabs in the AI-infrastructure category, not just voice AI.

3. Market depth check. Nvidia's diligence process is unusually rigorous. A company that passed Nvidia's strategic-investment review in late 2025 is not a year away from breaking — it's at infrastructure-grade operating maturity.

The valuation framework

The valuation expansion math from Series C → Series D:

Date | ARR | Valuation | Multiple
Jan 2025 (Series C) | $120M | $3.3B | 27×
Aug 2025 | $200M | $5.3B (interpolated) | 27×
Sep 2025 (Tender) | $200M | $6.6B | 33×
Dec 2025 (year-end) | $330M | $9.0B (interpolated) | 27×
Feb 2026 (Series D) | $330M | $11B | 33×

The multiple stayed in a narrow band (27-33×) across 13 months. That's revenue-supported expansion, not narrative re-rating.
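The band is straightforward to recompute from the disclosed figures (valuations and ARR in $M):

```python
# Valuation / ARR multiples across the Series C -> Series D window ($M).
points = [
    ("Jan 2025 (Series C)", 3300, 120),
    ("Sep 2025 (tender)", 6600, 200),
    ("Feb 2026 (Series D)", 11000, 330),
]
multiples = {label: round(valuation / arr, 1) for label, valuation, arr in points}
print(multiples)
# {'Jan 2025 (Series C)': 27.5, 'Sep 2025 (tender)': 33.0, 'Feb 2026 (Series D)': 33.3}
```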

For comparison, public-market AI infrastructure multiples in early 2026:

  • Snowflake: ~12× ARR
  • Datadog: ~14× ARR
  • Cloudflare: ~18× ARR
  • Palantir: ~30× ARR
  • ElevenLabs (private): 33× ARR

ElevenLabs at 33× is at the high end of public-market multiples but inside the range. The implied IPO valuation at $500M-$1B ARR (likely 2027 timing) would be $15-25B — broadly consistent with the Series D pricing.
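The $15-25B figure is simple scenario arithmetic, read as the top of the $500M case through the bottom of the $1B case; the 25-30× band is an assumption drawn from the comps above:

```python
# Implied IPO valuation grid: ARR scenarios ($M) x an assumed 25-30x band.
for arr_m in (500, 1000):
    low_b = arr_m * 25 // 1000      # $B at the bottom of the band
    high_b = arr_m * 30 // 1000     # $B at the top of the band
    print(f"${arr_m}M ARR -> ${low_b}B to ${high_b}B")
```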

What "building toward an IPO" means in practice

Mati Staniszewski's IPO comment is unusually specific for a CEO. Most founders dodge IPO questions; saying "building toward an IPO" sets a public expectation.

Three things that change after this kind of statement:

  1. Hiring profile shifts. CFO and Chief Legal Officer hires become priorities. Compliance, audit, and SOX-readiness work begin in earnest.
  2. Reporting discipline tightens. ARR disclosures, customer reveals, and metric transparency become quarterly rather than ad hoc.
  3. Strategic-investor relationships deepen. Telco / CRM strategics from the Series C become anchor customers for the IPO narrative.

The arc from incorporation (April 2022) to a targeted late-2026 IPO filing window compresses a timeline that historically averaged 8-12 years for enterprise-software companies.

What's not in the disclosure

The Series D press leaves three things deliberately unsaid:

  • The IPO timing. "Building toward" is open-ended. Filing window could be late 2026, mid-2027, or further out.
  • The competitive cost structure. ElevenLabs has not disclosed gross margin or unit economics. Voice AI compute is expensive; the margin question matters for IPO valuation.
  • The strategic-investor contract economics. Deutsche Telekom and Revolut are named, but contract sizes and commit periods are private.

These are the questions S-1 disclosure will eventually answer. The Series D narrative says the company is on the path. The S-1 will say whether the path is durable.
