From a stealth-mode TTS bet to an $11B voice-AI platform in under four years
ElevenLabs spent nine months in stealth before shipping a public beta in January 2023. The product went viral in days — first as a creator phenomenon, then as a misuse scandal, then as the default voice infrastructure for media, publishers, and voice agents. Every funding round bundled a model release. Every model release re-set the ceiling on what voice AI was assumed to do.
12 min read · Founded 2022-04 · 22 events tracked · 9 deep dives
01 Timeline
ARR, valuation, and every GTM move, on one timeline.
02 Platform Mix
Which channels mattered when.
ElevenLabs used six platforms differently. Some carried the entire arc; some were episodic catalysts; on one, the company barely posted at all and let creators do the work.
X (Twitter)
All stages — load-bearing
Founder + product launch channel
Mati Staniszewski (@matiii) and the @elevenlabsio handle drive every major launch. Audio demos travel exceptionally well on X — voice-clip tweets autoplay and get re-shared with the original audio attached, which is rare. Each model release lands as a clip thread first.
⚡ Catalyst moment
Eleven v3 alpha launch tweet (June 5, 2025) — audio-tag demo clips shared by founders, Karpathy, and the AI-builder crowd. Clip-native delivery is what made v3 feel like a 'new category' on day one.
✓ Works when
When the product output is itself a shareable artifact (audio clip, voice demo). The platform autoplays audio inline — make every launch a clip, not a screenshot
✗ Don't expect
If the team posts press-release prose. Voice AI Twitter expects new audio in the post, not links to a blog
YouTube / Podcasts

Two layers: founder long-form podcasts (Sequoia Training Data, a16z Show, Nothing Left Unsaid) and creator tutorials. Mati Staniszewski's investor-podcast circuit through 2025 reset perception from 'TTS startup' to 'voice infrastructure company.' Creator tutorials drive self-serve sign-ups continuously in the background.
⚡ Catalyst moment
Mati Staniszewski on Sequoia's Training Data (mid-2025) — the long-form artifact a16z, ICONIQ, and later Sequoia could circulate inside their own LP and exec networks before each subsequent funding round.
✓ Works when
When the founder can carry a 60-90 minute investor-grade conversation, AND the product has new demos in every episode. The double layer — founder + creator — compounds
✗ Don't expect
One-off keynote talks with no follow-up. The pattern only works as a continuous interview cadence, not single appearances
Hacker News

Multilingual v2, Conversational AI v1, Eleven v3, and Reader all landed on the HN front page with hundreds of comments. The HN signal mattered most for the conversational-AI launch in November 2024 — it proved the platform pivot was technically credible to the developers ElevenLabs needed to build agents on top.
⚡ Catalyst moment
Conversational AI v1 launch threads (Nov 2024) — front-page placement with serious technical scrutiny. Two months later: Series C at $3.3B.
Reddit

r/ElevenLabs and adjacent communities (r/AIVoiceCloning, r/audiobooks, r/IndieDev) became the place where creators trade voice IDs, prompt tricks, and use-case templates. Less acquisition, more activation — it's where new users learn how to get good output in their first hour.
⚡ Catalyst moment
No single moment. The community formed organically around the free tier in 2023 and matured as a self-serve support hub through 2024-2025.
LinkedIn

Mati Staniszewski's LinkedIn has carried real weight since mid-2025. ARR milestones, funding rounds, and customer wins announced as personal posts get re-shared inside Deutsche Telekom-style enterprise procurement networks. Higher score than typical for a dev-tools company because the buyer for voice agents is an enterprise contact-center exec, not a developer.
⚡ Catalyst moment
Mati Staniszewski's $200M ARR LinkedIn post (Aug 2025) — disclosed numbers, customer logos, and explicit 'building toward IPO' framing all in one post. Set the tone for the September tender and February Series D.
Instagram

Unusually high score for a B2B AI company — and it's earned. Voice clones (Darth Vader, Iconic Voices, celebrity parodies) routinely cross from TikTok to Instagram Reels without ElevenLabs lifting a finger. Reader-app demos with Judy Garland and James Dean read native to the format. The company doesn't post heavily; the creators do.
⚡ Catalyst moment
Iconic Voices launch coverage (July 2024) — Garland reading 'Wizard of Oz' clips circulated as Instagram Reels through CBS, CNN, and Variety social handles. Mainstream-press distribution, not paid.
✓ Works when
When the product output has visual + audio appeal (voice clips paired with celebrity faces, video demos). Instagram is downstream of TikTok creator content, not a primary channel
✗ Don't expect
For developer features, API launches, conversational-AI configuration. Skip Instagram for those entirely
The big-picture read on what actually drove the curve — before zooming in on each key moment.
ElevenLabs did not have a slow burn.
The product went from public beta to one million users in five months, from one million users to a $1.1B unicorn in seven more, and from there to $330M ARR and an $11B Series D in 24 more. The whole arc fits inside 46 months. What looks like luck is actually a six-move pattern that the founders ran four times in a row.
The dubbing thesis was the moat
Mati Staniszewski (ex-Palantir) and Piotr Dabkowski (ex-Google) grew up in Poland watching badly dubbed American films. The original product idea, sketched in 2020, was fixing that — voice that crosses languages with the speaker's emotion intact.
That insight pre-committed ElevenLabs to two architectural choices most competitors did not make:
Multilingual from day one. The first beta in January 2023 shipped with English and Polish. Eleven Multilingual v2 (August 2023) covered nearly 30 languages with the original speaker's accent preserved. By v3 (June 2025), the count was 70+.
Emotion as a first-class output. Not "TTS that sounds OK" but "voice that conveys feeling across language barriers." The audio-tag syntax in v3 — [excited], [whispers], [laughing] — is the natural endpoint of that thesis, five years from the original sketch.
Competitors with a generic TTS framing (Resemble, Murf, WellSaid Labs) optimized for narration quality. ElevenLabs optimized for emotional cross-lingual transfer. The framing constrained the product roadmap in a way that paid off every release.
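The audio-tag syntax is concrete enough to sketch. Below is a minimal illustrative parser, not ElevenLabs' actual implementation (`split_script` is a hypothetical helper), showing how inline delivery tags separate from the spoken text:

```python
import re

# Tags like [excited] or [whispers] are inline delivery directives;
# everything else is the text to be spoken.
TAG_RE = re.compile(r"\[([a-z ]+)\]")

def split_script(script: str) -> tuple[str, list[str]]:
    """Return (plain_text, delivery_tags) for a v3-style tagged script."""
    tags = TAG_RE.findall(script)
    plain = TAG_RE.sub("", script)
    # Collapse the whitespace left behind by removed tags.
    plain = re.sub(r"\s+", " ", plain).strip()
    return plain, tags

text, tags = split_script(
    "[excited] We hit the milestone! [whispers] Don't tell anyone yet."
)
# tags  -> ['excited', 'whispers']
# text  -> "We hit the milestone! Don't tell anyone yet."
```

The point of the syntax is that emotion travels in-band with the script, so the same text can be re-voiced with different delivery by editing tags alone.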
Nine months in stealth, then the scandals
ElevenLabs incorporated in April 2022 and did not ship until January 23, 2023. That's nine months of model training and infrastructure work before any user touched the product.
The launch arc that followed compressed into days:
| Date | Event |
| --- | --- |
| Jan 23, 2023 | Public beta + $2M pre-seed announced |
| Jan 30, 2023 | 4chan abuses voice cloning (Emma Watson, Joe Rogan, Ben Shapiro) |
| Jan 31, 2023 | ElevenLabs ships paid-only voice cloning + AI detection tool |
| Jun 2023 | 1M registered users — five months from launch |
The 4chan incident is the first thing that should have killed the company. Instead, ElevenLabs absorbed it as a forced-trust posture: voice cloning behind paid ID verification, classifier for AI-generated audio, account-level traceability — all shipped within days.
It did the same thing 12 months later. On January 26, 2024, Pindrop traced a fake Biden New Hampshire-primary robocall to ElevenLabs. The account was suspended within 72 hours, the company gave a clear public statement, and the Biden episode became a case-study citation in every "responsible AI" panel for the rest of the year.
Most companies hide misuse. ElevenLabs treated each incident as a chance to publicly demonstrate that the platform had auditable controls. The trust narrative ended up doing real GTM work — Deutsche Telekom and large enterprise contracts in 2025 cited operational discipline as a reason to commit.
Every funding round was a product bundle
Look at the cadence:
| Round | Date | Bundled launch |
| --- | --- | --- |
| Pre-seed $2M | Jan 23, 2023 | Public beta |
| Series A $19M @ $100M | Jun 21, 2023 | New voice products |
| Series B $80M @ $1.1B | Jan 22, 2024 | Voice Marketplace + Dubbing Studio + Mobile SDK |
| Series C $180M @ $3.3B | Jan 30, 2025 | Conversational AI v1 had shipped 10 weeks earlier |
| Tender $100M @ $6.6B | Sep 8, 2025 | $200M ARR disclosure |
| Series D $500M @ $11B | Feb 4, 2026 | $330M ARR + IPO talk |
Six rounds. Six bundled milestones. Every announcement window doubled as a product window.
The underlying logic is straightforward: a solo "$X funding" announcement gets you 3–5 days of capital-press coverage. A "$X funding + $Y ARR + new product" bundle gets you the same window across capital press, dev press, telco trade press, and SaaS press — for the same announcement budget.
The platform pivot that mattered
Most TTS companies stopped at API-as-a-product. ElevenLabs made a deliberate move to platform tier in November 2024.
Conversational AI v1 (November 18, 2024) integrated TTS + STT + LLM orchestration into a single agent stack. Conversational AI 2.0 (June 3, 2025) added native turn-taking, language detection, multi-character mode, and batch outbound calling.
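The "single agent stack" idea can be sketched as a turn loop. The three stage functions below are stand-in stubs, not the ElevenLabs API; they only show the STT → LLM → TTS orchestration shape the paragraph describes:

```python
# Minimal sketch of a voice-agent turn loop: hear, reason, speak.
# All three stages are illustrative stubs, not real model calls.

def transcribe(audio: bytes) -> str:
    """STT stage (stub): pretend the audio bytes are UTF-8 text."""
    return audio.decode("utf-8")

def think(transcript: str, history: list[str]) -> str:
    """LLM stage (stub): track context, produce a reply."""
    history.append(transcript)
    return f"echo: {transcript}"

def synthesize(reply: str) -> bytes:
    """TTS stage (stub): pretend the reply text is the audio."""
    return reply.encode("utf-8")

def agent_turn(audio_in: bytes, history: list[str]) -> bytes:
    """One conversational turn through the whole stack."""
    transcript = transcribe(audio_in)
    reply = think(transcript, history)
    return synthesize(reply)

history: list[str] = []
out = agent_turn(b"hello agent", history)
```

The commercial significance is that the integration itself (turn-taking, context, latency budget across the three stages) is the product, not any single stage.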
The competitive geometry changed. In November 2024, ElevenLabs was selling against other TTS APIs (Cartesia, PlayHT, Resemble). By mid-2025 it was selling against Vapi, Retell, and the contact-center incumbents (NICE, Genesys, Five9) — a much larger market with much larger contract sizes.
The strategic-investor list on the Series C tells the same story: new strategic checks from Deutsche Telekom, NTT DOCOMO Ventures, RingCentral Ventures, HubSpot Ventures, and LG Technology Ventures (Salesforce Ventures, also in the round, was a returning investor from earlier rounds). Telco, CRM, and consumer electronics — not creator tools. The pivot up the stack was the precondition for those checks.
Creator distribution did the brand work for free
ElevenLabs' visible marketing budget through 2024 was small. The acquisition machine was creator-first:
Voice clones travel native on social. A Darth Vader clip on TikTok. A Judy Garland reading on Instagram. The product output is itself the share unit. Most B2B tools envy this.
The free tier is the marketing. A generous free quota means creators experiment, and the experiments turn into Reels, TikToks, YouTube shorts. ElevenLabs gets the brand impression for free.
Voice Marketplace as a flywheel. Creators upload custom voices, other users discover and use them, the original creator earns. Three-way alignment that gives ElevenLabs viral content as a byproduct.
Iconic Voices as PR primer. Garland / Dean / Reynolds / Olivier (July 2024) put ElevenLabs in CNN, CBS, and Variety — outlets dev-tool companies almost never reach. The estate-licensing angle was the news hook.
When Mati Staniszewski went on Sequoia's Training Data, a16z Show, and Lenny's adjacent podcasts through 2024-2025, the founder-as-IP pattern translated directly to investor-narrative work. Different audience from the TikTok creators, same compounding mechanism.
The pattern, distilled
Six moves ElevenLabs ran. Each one is reusable in any AI infrastructure category.
Lock the thesis to a multilingual, emotion-preserving frame from day one. The framing constrained the roadmap (v2 → v3 → audio tags) in a way that made each release feel like progress on the same promise.
Bundle every funding round with at least one product launch. Same press budget, 3-4× the coverage surface. Six rounds in a row, never broken.
Treat misuse as a forced-trust audit, not a PR crisis. Two scandals in 12 months. Both were absorbed as proof of operational discipline. Telco and enterprise procurement reads the response, not the incident.
Move up the stack before competitors do. TTS API → conversational platform was a ten-week jump (Conv-AI v1 in Nov 2024, Series C in Jan 2025 with telco strategics on board). Competitors who stayed at API tier are now selling into a smaller market.
Make the product output the share unit. Voice clips are autoplay-native on X, Instagram, TikTok. Free tier turns creators into a brand-extension layer the company doesn't pay for.
Run the founder-as-IP loop for investors specifically. Long-form podcasts (Training Data, a16z Show, Nothing Left Unsaid) timed between funding rounds. Each new round happens after a podcast circuit, not during it.
What's not in the public record
Things that outside reporters can't see, and that probably matter most:
The actual cost of model training in 2022-2023. Stealth mode is expensive. The pre-seed amount ($2M) is too small to fund a year of GPU work — the founders likely bootstrapped and burned personal capital. Specific numbers are private.
Real free-to-paid conversion rates. ElevenLabs has been generous with topline ARR but never disclosed conversion economics. The 1M-users-by-June-2023 figure could mean 5% paid or 0.5% paid — the gap matters.
The exact mechanics of enterprise sales motion. Deutsche Telekom and Revolut are named publicly. The contract sizes, sales cycle length, and PoC-to-deal conversion are not.
The competitive cost structure vs Cartesia, PlayHT, Vapi, Retell. Voice AI is one of the most crowded AI infrastructure categories. ElevenLabs' margin per million characters generated, vs competitors, is the question that determines whether the IPO narrative survives 2026.
These are the questions Sacra deep-dives, The Information enterprise reporting, and S-1 disclosures will eventually answer. Public traces alone get this story to about 70%. The last 30% is locked behind paywalls and S-1 due diligence.
04 Deep Dives
9 key moments, fully unpacked.
For each: the catalyst, the concrete numbers, why it landed, and the reusable pattern underneath. Read straight through, or jump to any one.
04 / 01 · 2023-01-23
Funding · Bundled milestone
$2M Pre-Seed and a Public Beta — The Bundled Launch That Pulled in 1M Users in Five Months (Jan 2023)
ElevenLabs spent nine months in stealth, then announced funding and shipped a public beta in the same week. The free tier and clip-ready output did the rest.
January 23, 2023. ElevenLabs announces a $2 million pre-seed led by Credo Ventures with Concept Ventures, and on the same day opens its text-to-speech beta to the public. English and Polish voices, free tier, no waitlist.
By June 2023 — five months later — the platform crosses one million registered users.
The stealth bet that preceded the launch
Mati Staniszewski (ex-Palantir) and Piotr Dabkowski (ex-Google) incorporated ElevenLabs in April 2022. From April 2022 to January 2023, the company was effectively dark: no public site beyond a landing page, no demos, no press, no product.
Nine months is a long stealth window for a $2M pre-seed company. The founders used it to train a TTS model that was meaningfully better than what was publicly available — Microsoft Azure, Google Cloud TTS, and Amazon Polly were the alternatives at the time, and they sounded robotic by comparison.
The discipline of staying dark is the unsung part. Most pre-seed founders ship a leaky beta to a small group on day 60 because they want feedback. ElevenLabs waited until the model was better than the incumbents.
The bundle: funding + beta + free tier
Three things hit at once on January 23:
Funding announcement. Pre-seed $2M, lead and participants disclosed.
Public beta open. Anyone could sign up that day.
Generous free tier. 10,000 characters per month free, paid plans starting at $5/month.
The free tier was the load-bearing piece. A creator could generate 30-60 seconds of audio without paying — enough to make a TikTok, a YouTube intro, or a Twitter clip. The first audio they made was almost certainly shareable.
| What competitors offered | What ElevenLabs offered |
| --- | --- |
| API-only, paid tier minimum ~$50/mo | Free 10,000 characters + $5 minimum |
| Robotic narrator voices | Emotional, human-like output |
| English-only or thin multilingual | English + Polish + path to 28 more |
The pricing wasn't about undercutting competitors. It was a deliberate choice to put the product in front of the long tail of creators — TikTokers, podcasters, indie game devs, YouTubers — who would amplify it.
What "1M users in 5 months" actually looked like
The growth wasn't paid. It wasn't a Product Hunt launch (ElevenLabs ranked but didn't dominate). The mechanics were:
Twitter / X audio clips. Users posted "listen to this" tweets with ElevenLabs-generated voices. The clips autoplayed inline. Each share carried the brand.
Hacker News submissions. Beta launch hit HN front page; technically curious devs signed up to try.
TikTok creator usage. Voice-over for narration content, especially Reddit-story TikToks, took off through February-April.
Reddit threads. r/MachineLearning, r/AIVoiceCloning, r/sidehustles all surfaced ElevenLabs as the new tool that worked.
The growth curve is the signature of a product that fits a frame perfectly — every casual user becomes a distribution node because the output is itself the share unit.
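For a sense of what that curve implies, a back-of-envelope sketch. The week-one user count here is a hypothetical assumption; ElevenLabs never disclosed it:

```python
# Launch (late Jan 2023) to 1M registered users (June 2023) is roughly
# 21 weeks of compounding. Assume a hypothetical 10,000 sign-ups in
# week one (NOT a disclosed figure) and solve for the implied
# week-over-week growth rate.

WEEKS = 21
WEEK_ONE_USERS = 10_000   # assumption for illustration only
TARGET = 1_000_000

weekly_rate = (TARGET / WEEK_ONE_USERS) ** (1 / WEEKS) - 1
# Roughly 24-25% week-over-week, sustained for five months.
```

Under that assumption the platform had to grow about a quarter every week with no paid acquisition — which is what "every user is a distribution node" has to deliver to be true.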
The 4chan incident, one week later
The same launch that pulled in users pulled in abusers. By January 30, 2023, 4chan users had cloned Emma Watson, Joe Rogan, and Ben Shapiro to generate offensive content.
ElevenLabs responded the next business day with paid-only voice cloning, an AI detection tool, and traceability per generation. The story is in the next deep-dive — but it's worth noting here that the misuse was the byproduct of the same generosity that drove the user growth.
The free tier got ElevenLabs to 1M users. Voice cloning behind ID verification kept the company from being shut down for it.
The compounding effect on Series A
By June 2023, ElevenLabs had ~1M users on the platform. That metric — verifiable, auditable — closed the Series A on terms ($19M at ~$100M post) that would have been impossible from a cold start.
| Date | Round | Valuation | Trigger |
| --- | --- | --- | --- |
| Jan 23, 2023 | Pre-seed $2M | ~$12M | Beta launch |
| Jun 21, 2023 | Series A $19M | ~$100M | 1M users + voice products |
| Jan 22, 2024 | Series B $80M | $1.1B | Multilingual v2 + Dubbing Studio |
a16z, Nat Friedman, and Daniel Gross co-led the Series A. Mike Krieger (Instagram), Brendan Iribe (Oculus), Mustafa Suleyman (DeepMind), and Tim O'Reilly came in as angels — the kind of investor list a five-month-old company doesn't normally attract.
The user-count milestone made the round possible. The bundled launch in January made the user count possible.
The 4chan Voice-Cloning Scandal That Nearly Killed the Launch — And the 24-Hour Response That Saved It (Jan 2023)
Seven days after public beta, 4chan users cloned celebrities to generate abuse. ElevenLabs shipped paid-only cloning, an AI detector, and traceability the next business day. The crisis became the trust posture.
January 30, 2023 — seven days after the public beta opened. Vice reports that 4chan users have used ElevenLabs to clone Emma Watson reading from "Mein Kampf," Joe Rogan and Ben Shapiro making racist comments, and David Attenborough delivering threats.
The story hits Slashdot, Futurism, OECD AI's incident database, and within 48 hours has been picked up by every major tech outlet covering AI risk.
By the next business day, ElevenLabs has shipped concrete changes. The crisis becomes the operating template.
What 4chan actually did
The 4chan abuse used the free tier's voice-cloning feature. With a 60-second audio sample of a target's voice, the platform could generate new audio in that voice saying anything.
Within one week of the public beta:
| Target | Content |
| --- | --- |
| Emma Watson | Reading "Mein Kampf" |
| Joe Rogan | Racist remarks about AOC |
| Ben Shapiro | Hateful content about minorities |
| David Attenborough | Violent threats |
| Hillary Clinton | Transphobic content |
The 4chan thread turned into a manual on how to use the product for harassment. By the time Vice published, the screenshots were everywhere on Twitter.
For most pre-seed startups, this is a company-ending event. Investors pull. Press goes negative. The product gets associated with the abuse forever.
The 24-hour response
ElevenLabs responded the next business day — January 31. The company published a statement acknowledging "an increasing number of voice cloning misuse cases" and shipped a set of immediate changes plus a roadmap of follow-ups:
Shipped within ~24 hours:
1. Voice cloning behind paid tier. No free voice cloning. Required payment information that creates an audit trail.
2. Per-generation traceability. Each piece of generated audio could be traced back to the specific account that produced it.
3. Manual verification path. Voice cloning of public figures required additional verification.
Shipped over the following months:
4. AI Speech Classifier. A free public tool that takes any audio clip and tells you whether it was generated by ElevenLabs. Released publicly in June 2023 alongside the Series A — five months after the initial 4chan response, but core to the long-term trust posture.
The immediate response was concrete. Not "we are taking this seriously." Specific safeguards plus a transparent roadmap.
The speed mattered more than the substance. Within a week of the abuse going viral, the company had a public technical answer. Most AI vendors needed months to respond to similar incidents in 2024-2025. ElevenLabs set the bar in week two of its existence.
Why the response worked
The 4chan incident could have been catastrophic. It became a case study for three reasons.
1. The response was technical, not legal. The classifier was a real working tool, not a terms-of-service update. Reporters could test it; it worked. That's a different kind of credibility than a press release.
2. The traceability claim was verifiable. ElevenLabs could (and did) trace specific abusive content back to specific accounts and ban them. The audit trail wasn't theoretical.
3. The company didn't deny the risk. The CEO did not claim voice cloning would be safe. He acknowledged that misuse was inherent to the technology and that the platform needed continuous safeguards. That framing — "yes this is dangerous, here's how we manage it" — held up across the next three years of incidents.
The pattern that recurred 12 months later
The 4chan response became the template ElevenLabs ran every time misuse hit. The most consequential rerun was January 26, 2024 — Pindrop traced a fake Biden New Hampshire-primary robocall to ElevenLabs. The company suspended the account within 72 hours, gave a clear public statement, and the Biden episode became the most-cited "responsible AI vendor" example for the rest of the year.
Account ban + public statement + classifier reference
Same pattern, same speed, twice — and the response template was already built when the second incident hit. By late 2024, when Deutsche Telekom and other enterprise procurement teams ran due diligence on voice AI vendors, ElevenLabs' incident-response track record was a positive rather than a negative.
The hidden GTM payoff
There's a counter-intuitive truth in this incident. ElevenLabs did not lose customers from the 4chan story. The user count grew from sub-100K in late January to 1M by June 2023.
What happened instead: the misuse coverage advertised the product's capability. "ElevenLabs can clone any voice from a 60-second sample" was simultaneously the abuse vector and the most compelling demo of what the technology could do. Users who needed legitimate voice cloning — audiobook narrators, accessibility tools, voice-over artists — saw the same headlines.
The forced-trust posture meant ElevenLabs could absorb that attention without becoming "the deepfake company." Cartesia, Resemble, and PlayHT got similar capability headlines through 2023-2024 but without the same operational track record. The incident-response gap turned into a trust gap.
Series B $80M at $1.1B — The 21-Month Unicorn That Bundled Three Product Launches Into One Press Window (Jan 2024)
ElevenLabs' Series B announcement carried Voice Marketplace, Dubbing Studio, and Mobile SDK in the same press release. Same announcement budget, four times the coverage.
January 22, 2024. ElevenLabs announces an $80 million Series B led by Andreessen Horowitz, with Sequoia Capital, Nat Friedman, and Daniel Gross. Valuation: $1.1 billion. Twenty-one months from incorporation.
It's the fastest European AI company to reach unicorn status at that point — and the announcement isn't about the round.
The product bundle that ran with the round
The Series B press release named four launches in the same window:
| Product | What it was |
| --- | --- |
| Voice Marketplace | Creator-uploaded voices, royalty-share model |
| Dubbing Studio | Pro video translation with editor controls |
| Mobile SDK | iOS / Android voice integration for app developers |
| AI Speech Classifier (re-emphasized) | Public tool for detecting AI-generated audio (originally shipped June 2023 with Series A) |
Each one would have been a standalone press story. Bundled together with $80M and a $1.1B valuation, they generated a coverage cascade across four distinct press categories:
Capital press — $80M round, $1.1B valuation, fastest European AI unicorn
AI / ML trade press — AI Speech Classifier and creator marketplace
Media / localization trade press — Dubbing Studio
Developer press — Mobile SDK
A single Series B announcement covered four news beats. Same announcement spend, ~4× the surface area.
Why the bundle worked specifically here
The bundle wasn't arbitrary. Each launch was strategically tied to the funding narrative.
Voice Marketplace = "ElevenLabs is becoming a platform, not a TTS API." That re-framing supported the unicorn valuation. A TTS API doesn't justify $1.1B; a marketplace with creator network effects does.
Dubbing Studio = "ElevenLabs is going after media-industry budgets." Slator and VentureBeat are read by people who buy localization at scale (Netflix, Audible, Warner Bros. Discovery), and those localization budgets are an order of magnitude bigger than indie creator subscriptions.
Mobile SDK = "ElevenLabs is becoming infrastructure." App developers integrating voice features means recurring API revenue, not one-off creator subs.
AI Speech Classifier (re-emphasized) = "ElevenLabs is the responsible AI vendor." This is the trust-posture work — January 22, 2024 was four days before the Biden robocall story broke. The classifier becoming part of the Series B framing helped the company absorb the Biden incident without losing the narrative.
The investor list told the strategic story
Series A was Nat Friedman, Daniel Gross, and a16z — the AI-native angel pattern.
Series B added Sequoia (capital-press signal) and kept the same team (continuity signal). The cap table going into 2024 looked like this:
| Round | Lead | Notable participants |
| --- | --- | --- |
| Pre-seed (Jan 2023) | Credo Ventures | Concept Ventures |
| Series A (Jun 2023) | Nat Friedman / Daniel Gross / a16z | Mike Krieger, Brendan Iribe, Mustafa Suleyman, Tim O'Reilly |
| Series B (Jan 2024) | a16z | Sequoia, Nat Friedman, Daniel Gross |
Note: a16z double-led. That's a meaningful signal — the same fund leading consecutive rounds means internal conviction is high enough to defend the markup at the partnership.
The 21-month milestone
| Milestone | Months since incorporation |
| --- | --- |
| Public beta | 9 |
| 1M users | 14 |
| Series A | 14 |
| Out of beta + Multilingual v2 | 16 |
| Series B + unicorn | 21 |
For comparison, the median path to unicorn status for AI infrastructure companies in 2022-2024 was roughly 36-48 months. ElevenLabs hit it in 21.
The compression is the result of the bundle pattern. Each round cleared the next product launch's runway; each product launch supported the next round's valuation. The two-step ratchet repeated four times — pre-seed, A, B, C — and never broke.
The Biden incident, four days later
The Series B press cycle was still running when the Biden robocall story broke on January 26, 2024. The proximity was coincidence — but the response template was already in place from the 4chan incident a year earlier.
ElevenLabs banned the account within 72 hours, issued a public statement, and the AI Speech Classifier (already part of the Series B narrative) was the technical answer to "how do you prevent this?"
The Series B unicorn announcement and the Biden robocall response landed in the same week. The trust posture, embedded in the funding announcement, did the GTM work for both.
The Biden Robocall Deepfake — How a 72-Hour Account Ban Became Enterprise Sales Collateral (Jan 2024)
Pindrop traced a fake Biden New Hampshire-primary robocall to ElevenLabs. The company suspended the account in 72 hours. By year-end the response was being cited in enterprise procurement decisions.
January 26, 2024. Pindrop Security publishes its analysis of an AI-generated robocall sent to thousands of New Hampshire Democratic primary voters days earlier. The robocall used a synthetic Joe Biden voice telling people not to vote.
Pindrop's forensic analysis traces the audio to ElevenLabs.
The company suspends the account by the end of the same week. Bloomberg, the Financial Times, the Wall Street Journal, NBC, CNN, Reuters, and the Associated Press all cover the response.
The chronology
The incident moved fast across regulators, press, and ElevenLabs:
| Date | Event |
| --- | --- |
| Jan 26 | Bloomberg breaks the ElevenLabs link; account suspended within 72 hours |
| Jan 27 | FCC announces process to ban AI-generated robocalls |
| Feb 8 | FCC formally outlaws AI voice in robocalls (citing this incident) |
| Feb 23 | Account creator publicly identified (linked to Steve Kramer and Lingo Telecom) |
The FCC ban on AI-generated robocalls — passed February 8 — explicitly cited the Biden incident as the trigger. ElevenLabs' technology was the named example in regulatory rule-making.
What the 72-hour response actually contained
Three concrete actions in the first week:
1. Account suspension. The user who generated the audio was banned. ElevenLabs' per-generation traceability (already in place since the 4chan response in January 2023) made identification straightforward.
2. Public statement. "We are dedicated to preventing the misuse of audio AI tools and take any incidents of misuse extremely seriously." Plain language, no hedging on the technical link.
3. AI Speech Classifier reference. The free public tool, first launched in June 2023 alongside the Series A, was re-surfaced as the technical answer to "how do we know if audio is from ElevenLabs?" (Pindrop's forensic analysis had used a similar method.)
What the response did not include: denial, deflection, or claims that the platform was being unfairly targeted. The framing was direct acknowledgment plus operational evidence.
Why this set the bar for the industry
Voice AI misuse incidents in 2024 affected most major vendors. The response patterns diverged.
| Vendor | Major 2024 incident | Public response |
| --- | --- | --- |
| ElevenLabs | Biden robocall (Jan) | 72-hour ban, public statement, classifier reference |
| Cartesia | Limited public incidents | N/A in 2024 |
| PlayHT | Limited public incidents | N/A in 2024 |
| Microsoft (VALL-E) | Restricted release | Kept models private to avoid this risk |
ElevenLabs was the only vendor that a) had its product publicly linked to a high-profile election interference incident and b) absorbed the link without losing operational credibility.
The contrast with Microsoft's VALL-E response is the most instructive. Microsoft kept VALL-E private specifically because it didn't want to be in this position. ElevenLabs took the public position and built the operating muscle. The market rewarded the muscle by 2025.
The enterprise sales effect
The response template paid off in enterprise procurement through 2024-2025.
By the time Deutsche Telekom, NTT DOCOMO Ventures, RingCentral Ventures, HubSpot Ventures, and LG Technology Ventures came in on the Series C in January 2025 (with Salesforce Ventures returning), voice-AI vendor due diligence routinely included questions about misuse handling. ElevenLabs could point to a 12-month operational track record:
Biden robocall (Jan 2024) → 72-hour ban + FCC engagement
2024 election cycle → no further high-profile incidents involving ElevenLabs
That track record was the differentiator vs Cartesia and PlayHT in enterprise sales motions. Multiple Sacra and Contrary Research notes cite "operational discipline on misuse" as part of why ElevenLabs won contact-center and telco RFPs.
The "trust posture as GTM" pattern
The pattern is rare and worth naming explicitly:
High-stakes misuse incident → forced public attention
Speed-of-response is a verifiable signal → 72 hours becomes a benchmark
Concrete technical safeguards already shipped → the response is operational, not promotional
Pattern repeats credibly across multiple incidents → enterprise procurement starts to count this as risk mitigation
ElevenLabs ran this loop three times in 24 months. Each iteration compounded the next. By Series C, the trust posture was generating revenue, not just defusing crises.
Most vendors treat misuse as a PR problem. ElevenLabs treated misuse as a continuous operational test of whether the company can hold enterprise trust. The press did the marketing for free.
Iconic Voices — How Licensing Garland, Dean, Reynolds, and Olivier Pulled ElevenLabs Into CNN, CBS, and Variety (Jul 2024)
Estate-licensed AI voice clones of four Hollywood legends turned a Reader-app feature into a mainstream-press story. The deal mechanics quietly redefined the public conversation about voice AI ethics.
July 3, 2024. ElevenLabs announces "Iconic Voices" — AI voice clones of Judy Garland, James Dean, Burt Reynolds, and Sir Laurence Olivier — licensed through CMG Worldwide and integrated into the ElevenReader app launched a week earlier.
Within 48 hours, the story is in CNN Business, CBS News, Variety, Tubefilter, Designboom, Tom's Guide, and Entrepreneur Magazine.
The deal structure that made it possible
The licensing wasn't ad hoc. ElevenLabs and CMG Worldwide — the Beverly Hills IP firm that represents the Garland, Dean, Reynolds, and Olivier estates — built a constrained-use framework:
| Term | Detail |
|---|---|
| Use cases | Reader app only — books, articles, PDFs |
| Voice scope | Voices not added to broader ElevenLabs audio database |
| Estate consent | Per-voice estate sign-off on permitted use |
| New work generation | Not permitted — voices restricted to reading existing text |
| Family endorsement | Liza Minnelli (Garland's daughter) issued public statement |
The constraints were the news angle. "AI clones dead celebrities" is a horror-show headline. "Estate-licensed AI voice clones with family endorsement, restricted to audiobooks" is a respectful-tribute headline. The press took the second framing because the framework forced it.
Why the press picked it up
Mainstream press almost never covers voice AI infrastructure launches. Conv-AI v1 in November 2024 — arguably a more important product moment — got AI trade press only.
Iconic Voices got mainstream press for three structural reasons:
1. Recognizable cultural artifacts. Judy Garland reading "The Wonderful Wizard of Oz" is a story that needs no explanation; a voice-AI infrastructure launch needs a paragraph of it. The cultural artifact carried the news beat.
2. Pre-resolved ethics question. Estate licensing + family endorsement closed off the ethical objection before reporters could write it. CBS, CNN, and Variety could publish without needing a "but is this OK?" rebuttal section.
3. Visual + audio share unit. Print outlets ran video clips of "Garland reading" content. The clips played native on Instagram Reels, TikTok, and YouTube Shorts when the same outlets posted social cuts. The story propagated for free across platforms.
The Iconic Voices launch is the rare voice AI story that landed in CBS Sunday Morning territory, not just TechCrunch. The deal structure was the reason.
What it did for the brand
Through mid-2024, ElevenLabs' brand was:
Within AI / dev community: best-in-class TTS, generous free tier, ongoing misuse stories
Outside the AI community: the company that made the Biden deepfake possible
Iconic Voices flipped the second framing. The same week that ElevenLabs was getting Variety coverage for Garland and Dean, the company was visibly moving past the deepfake association in mainstream press.
| Press cycle | Dominant frame |
|---|---|
| Jan-Feb 2024 | Biden robocall, AI election interference |
| Mar-May 2024 | Mayor Adams clone, ongoing AI ethics stories |
| Jun-Aug 2024 | ElevenReader, Iconic Voices, audiobook future |
| Sep-Nov 2024 | Conversational AI v1, platform pivot |
The brand work mattered for what came next. The Series C in January 2025 brought Deutsche Telekom, NTT DOCOMO Ventures, HubSpot Ventures, and Salesforce Ventures — strategic investors whose internal champions would have struggled to push for an investment in "the deepfake company." After Iconic Voices, ElevenLabs was a story those champions could tell internally.
The Reader app's role
ElevenReader (launched June 25, 2024) is the consumer surface that made Iconic Voices coherent. Without a place for users to actually listen to Garland reading, the story would have been "ElevenLabs licenses celebrity voices" — a vendor announcement, not a product.
The bundle:
Jun 25, 2024: ElevenReader iOS launches (Android shortly after). Free app, read any text aloud with natural AI narration.
Subsequent months: More licensed voices added; Reader becomes the consumer entry point to ElevenLabs.
The eight-day gap between Reader launch and Iconic Voices launch is deliberate. Reader establishes the frame ("an audio app for books and articles"). Iconic Voices makes the frame newsworthy. Together they do what neither would have done alone.
What competitors couldn't replicate
By July 2024, every voice AI vendor could clone a celebrity voice given a sample. The capability was commoditized.
What ElevenLabs had that competitors didn't: the estate relationship infrastructure. CMG Worldwide doesn't sign with vendors who haven't built operational trust on misuse. The 4chan response (Jan 2023), the Biden response (Jan 2024), and the public AI Speech Classifier were the reasons the deal was possible.
Cartesia and PlayHT could match the technical clone quality. Neither could close a CMG Worldwide licensing deal in 2024. The trust posture became the moat.
Conversational AI v1 — The 11-Week Platform Pivot That Reset the Entire Sales Motion (Nov 2024)
ElevenLabs went from TTS API to integrated voice-agent platform on November 18, 2024. Eleven weeks later, telco and CRM strategics led the Series C. The pivot up-the-stack happened faster than competitors could react.
November 18, 2024. ElevenLabs ships Conversational AI v1 — a platform layer that combines TTS, speech-to-text, and LLM orchestration into a single agent stack. Developers can now build full conversational agents inside the ElevenLabs developer console.
Eleven weeks later, the Series C closes at $3.3B with strategic investors from telco, CRM, and contact-center categories.
What the launch actually shipped
Conversational AI v1 was a platform-tier product, not a feature. Four components in one console:
Voice (TTS): ElevenLabs' existing Eleven Multilingual v2 model
Speech-to-text: Native ASR for handling user input
LLM orchestration: Connects to OpenAI, Anthropic, or self-hosted LLMs
Knowledge base: Files / URLs / text blocks as agent context
The configuration surface was extensive — voice, latency, stability, conversation length, authentication. SDK support for Python, JavaScript, React, and Swift, plus a WebSocket API.
In other words: everything a developer needed to build a working voice agent in one place, with no need to wire together five vendors.
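As a rough sketch, the "one console" pitch collapses into a single agent configuration. The field names and values below are hypothetical illustrations — only the four-layer shape (TTS, STT, LLM orchestration, knowledge base) comes from the launch:

```python
# Illustrative sketch of Conv-AI v1's four components as one agent config.
# This is NOT the actual ElevenLabs schema; it shows the vendor collapse.
agent_config = {
    # Voice layer: ElevenLabs' own TTS (Eleven Multilingual v2 at launch)
    "voice": {"model": "eleven_multilingual_v2", "stability": 0.5},
    # Native ASR for handling user input
    "speech_to_text": {"language": "en"},
    # LLM orchestration: OpenAI, Anthropic, or self-hosted behind one switch
    "llm": {"provider": "openai", "model": "gpt-4o", "temperature": 0.3},
    # Knowledge base: files / URLs / text blocks as agent context
    "knowledge_base": ["docs/pricing.pdf", "https://example.com/faq"],
    # Part of the configuration surface called out in the launch
    "conversation": {"max_duration_s": 600, "auth_required": True},
}

# The point of the pivot: one vendor covers what used to take ~5 APIs.
layers = {"voice", "speech_to_text", "llm", "knowledge_base", "conversation"}
assert set(agent_config) == layers
```

A developer previously stitching ElevenLabs TTS into Vapi or Retell was maintaining each of these layers as a separate vendor relationship; the single config object is the bill collapse described below.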
Why the timing mattered
The voice-agent category was forming in late 2024. The competitive landscape:
| Vendor | Position in Nov 2024 | Stack |
|---|---|---|
| ElevenLabs | Best TTS, now full platform | Vertically integrated |
| Vapi | Voice agent platform, no own TTS | Stack of best-in-class APIs |
| Retell | Voice agent platform, no own TTS | Stack of best-in-class APIs |
| Cartesia | Best TTS competitor, no agent layer | TTS only |
| PlayHT | TTS, building agent features | TTS + thin agent layer |
| Deepgram | STT leader, building TTS | STT + TTS, no agent |
ElevenLabs was the only vendor with both a top-tier proprietary TTS model and a full agent stack. Vapi and Retell were stitching ElevenLabs' TTS into their stacks — making the platform pivot a direct competitive threat to them.
The Conv-AI v1 launch effectively folded the cost of Vapi and Retell into ElevenLabs' own platform. A developer who had been paying for ElevenLabs TTS + Vapi orchestration could now collapse the bill.
The eleven-week sequence
The launch was step one of a tightly-timed run:
| Date | Event |
|---|---|
| Nov 18, 2024 | Conv-AI v1 launches |
| Late Nov 2024 | $90M ARR disclosed (Sacra / The Information) |
| Dec 2024 | $120M ARR (year-end) |
| Jan 2025 | Series C term-sheet activity (signaled by funding press) |
The Series C new strategic investor list — Deutsche Telekom, NTT DOCOMO Ventures, RingCentral Ventures, HubSpot Ventures, and LG Technology Ventures (with Salesforce Ventures returning) — all came in because Conv-AI v1 had repositioned ElevenLabs from TTS to telco-and-CRM infrastructure.
What the strategic investors actually bought
Each of the five strategics had a specific reason to invest:
| Investor | Strategic angle |
|---|---|
| Deutsche Telekom | European telco voice agents for B2B services |
| NTT DOCOMO Ventures | Japan-market voice agents for contact centers |
| RingCentral Ventures | UCaaS / contact-center voice integration |
| HubSpot Ventures | SMB CRM voice agent layer |
| LG Technology Ventures | Voice for consumer-electronics surfaces |
Five strategic investors, five distinct enterprise integration paths. None of them would have invested in a TTS API. All of them had a clear thesis on a voice-agent platform.
The platform pivot was the precondition for the strategic check pile. ElevenLabs went from "interesting AI startup" to "potential infrastructure partner" in 11 weeks.
The Conv-AI 2.0 follow-up
The Conv-AI v1 launch was step one. Conversational AI 2.0 (June 3, 2025) was the credibility extension:
Native turn-taking model (handles hesitations, interruptions, filler words)
The 2.0 launch was deliberately seven months after v1 — a release cadence that signaled "this is a category we own." Vapi and Retell were still on their first or second platform iteration. ElevenLabs had two generations shipped.
The cadence is the GTM. Not the individual features.
What the pivot did to the ARR curve
The platform-tier shift compounded the revenue curve in a way that pure TTS scale wouldn't have:
| Date | ARR | Driver |
|---|---|---|
| Q1 2024 | $25M | TTS API + Dubbing Studio |
| Q4 2024 | $120M | TTS scale + early Conv-AI adoption |
| Aug 2025 | $200M | Conv-AI 2.0 + enterprise contracts |
| Dec 2025 | $330M+ | Voice agents at scale, enterprise approaching 50% of revenue |
The Q4 2024 → Aug 2025 jump (from $120M to $200M in eight months) is where Conv-AI revenue starts becoming visible in the topline. The Aug 2025 → Dec 2025 jump (from $200M to $330M+ in four months) is where enterprise deals begin closing at scale.
Without Conv-AI v1 in November 2024, the curve plateaus around $200M ARR — a respectable TTS company. With Conv-AI v1, the curve continues compressing into the Series D and IPO trajectory.
Series C $180M at $3.3B — The Telco / CRM Strategic Stack That Reframed ElevenLabs as Voice Infrastructure (Jan 2025)
a16z and ICONIQ co-led, but the headline was the new strategic-investor list: Deutsche Telekom, NTT DOCOMO, RingCentral, HubSpot, LG Technology Ventures (Salesforce was a returning investor). The Series C wasn't capital — it was distribution embedded in a cap table.
January 30, 2025. ElevenLabs announces a $180 million Series C co-led by Andreessen Horowitz and ICONIQ Growth, valuing the company at $3.3 billion. Three times the Series B valuation from twelve months earlier.
NEA, Sequoia, World Innovation Lab, Valor, Endeavor Catalyst, and Lunate are also in. ICONIQ partner Seth Pierrepont joins the board.
The financial-press story is the valuation jump. The strategic story is the strategic investor list.
The strategic investor pattern
The Series C brought five new strategic investors with overlapping but distinct angles (Salesforce Ventures, also in the round, was a returning investor from earlier rounds):
| Investor | What they buy from ElevenLabs | Strategic vector |
|---|---|---|
| Deutsche Telekom | Voice agents for European B2B | Telco / SMB |
| NTT DOCOMO Ventures | Voice agents for Japan contact centers | Asia-Pacific telco |
| RingCentral Ventures | UCaaS voice integration | UCaaS / contact center |
| HubSpot Ventures | SMB CRM voice agent layer | SMB CRM |
| LG Technology Ventures | Voice for consumer-electronics surfaces | Consumer hardware |
Five new strategic investors, five distinct enterprise channels. Each one had an internal deployment thesis before writing the check.
This is a different kind of round than Series A or B. Pre-seed and Series A bring capital. Series B brings capital + brand (a16z + Sequoia). Series C with this strategic stack brings capital + brand + distribution access through five enterprise vectors that ElevenLabs would otherwise have spent years building bottom-up.
The 11-week timeline that made it possible
The Series C closed eleven weeks after Conversational AI v1 shipped in November 2024. That sequence was the load-bearing maneuver:
The platform pivot in November positioned ElevenLabs as voice infrastructure, not a TTS API. The strategics could only invest in the platform-tier story. They could not have invested in a TTS-API story — telco and CRM procurement teams don't deploy TTS APIs as standalone products.
The Conv-AI v1 launch did the strategic-investor recruiting. The Series C announcement was the close.
What the round bundled
True to the ElevenLabs cadence, the Series C wasn't a standalone announcement. It bundled:
Round close ($180M @ $3.3B)
Board addition (ICONIQ partner Seth Pierrepont)
Strategic distribution launch (Deutsche Telekom partnership signaling)
Enterprise customer reveals (Disclosed in subsequent press) — the ARR disclosure pulling forward customer naming
a16z hackathon series (Announced two weeks later, February 22-23)
Five news beats inside a 14-day window. Same announcement budget, multiplicative coverage.
The valuation math
The trajectory across 24 months:
| Round | Date | Round size | Valuation | Multiple on prior |
|---|---|---|---|---|
| Pre-seed | Jan 2023 | $2M | ~$12M | — |
| Series A | Jun 2023 | $19M | ~$100M | 8.3× |
| Series B | Jan 2024 | $80M | $1.1B | 11× |
| Series C | Jan 2025 | $180M | $3.3B | 3× |
| Tender | Sep 2025 | $100M (secondary) | $6.6B | 2× |
| Series D | Feb 2026 | $500M | $11B | 1.7× |
The Series C 3× markup is smaller than Series A's 8× or Series B's 11×, but the round size is much larger. The valuation expansion shifts from "narrative repricing" to "revenue-supported."
At $3.3B / $120M ARR, the multiple is ~27×. By Series D at $11B / $330M ARR, it compresses to ~33×. The multiple stayed roughly stable from Series C through Series D — meaning the valuation growth was earned by ARR growth, not by re-rating.
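The multiple arithmetic can be checked directly — a quick sanity calculation using only the figures above, not additional sourced data:

```python
def arr_multiple(valuation_b: float, arr_m: float) -> float:
    """Valuation-to-ARR multiple; valuation in $B, ARR in $M."""
    return valuation_b * 1_000 / arr_m

series_c = arr_multiple(3.3, 120)   # Jan 2025: $3.3B on $120M ARR
series_d = arr_multiple(11.0, 330)  # Feb 2026: $11B on $330M ARR

assert round(series_c, 1) == 27.5   # the "~27x" in the text
assert round(series_d, 1) == 33.3   # the "~33x" in the text
```

The spread between 27.5× and 33.3× is modest enough that the "revenue-supported, not re-rated" framing holds.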
Why the Series C was the inflection round
Every prior round had been a step up. The Series C was a category change.
Before the Series C, ElevenLabs was an AI voice company. After the Series C — with Deutsche Telekom, NTT DOCOMO, HubSpot, RingCentral, and LG Technology Ventures all newly on the cap table (alongside returning Salesforce Ventures) — ElevenLabs was a voice infrastructure company. The label change opened up enterprise contracts that were structurally unavailable to AI-vendor-positioned competitors.
Cartesia and PlayHT could match ElevenLabs on TTS quality through 2025. Neither could match the strategic investor stack. Vapi and Retell could match the agent platform; neither had Deutsche Telekom on speed-dial.
The Series C wasn't the moment ElevenLabs was best. It was the moment ElevenLabs became unmatchable on a specific competitive vector — strategic distribution access.
The downstream effect on Series D
The Series D in February 2026 (Sequoia-led, $500M @ $11B) was the validation round for the Series C strategy. The strategic investors from January 2025 had become the largest enterprise customers by late 2025 — Deutsche Telekom and Revolut named publicly in the Series D coverage.
Eleven v3 — How an Audio-Tag Syntax Made Voice AI Feel Like a New Category (Jun 2025)
70+ languages, multi-speaker dialogue, and inline tags like [excited] and [whispers]. The v3 alpha turned voice synthesis into a stage-direction language — and the demo clips traveled native on every social platform.
June 5, 2025. ElevenLabs releases Eleven v3 in public alpha. 70+ languages, multi-speaker dialogue, and a new audio-tag syntax that lets developers control emotion, tone, and delivery inline with the text.
The launch tweet from @elevenlabsio gets re-shared by Mati Staniszewski, Andrej Karpathy, and the AI-builder Twitter circle. Within 48 hours, audio-tag demos are circulating across X, TikTok, Instagram Reels, and YouTube Shorts.
The syntax that changed the demo grammar
Audio tags are words wrapped in brackets that the v3 model interprets as performance cues, not text:
"That was incredible! [excited] I never thought we'd actually pull it off.
[whispers] But we have to be careful — they might still be watching.
[laughing nervously] What do we do now?"
The output is a single audio clip with three distinct emotional registers — excitement, whispered tension, nervous laughter — controlled by markup, not by separate generation calls.
This is a shift in how voice AI is used. Previously, getting emotion variation required either:
Multiple generation calls with different prompts, then stitching
Voice direction in the source text ("she said excitedly"), with limited model interpretation
A separate fine-tuned model per emotion
v3 collapses all three into inline syntax. The cognitive model is now closer to writing a screenplay than calling an API.
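A minimal sketch of how the markup layer separates from the spoken text — the tag names come from the launch demos, but the regex and function are my own illustration, not ElevenLabs' parser:

```python
import re

# Matches bracketed cues like [excited] or [laughing nervously]
AUDIO_TAG = re.compile(r"\[(\w+(?: \w+)*)\]\s*")

def split_tags(script: str) -> tuple[str, list[str]]:
    """Return (spoken_text, performance_cues) for a v3-style script.
    The model interprets the cues as delivery; they are never spoken aloud."""
    tags = AUDIO_TAG.findall(script)
    text = AUDIO_TAG.sub("", script).strip()
    return text, tags

text, tags = split_tags(
    "[whispers] But we have to be careful. [laughing nervously] What now?"
)
assert tags == ["whispers", "laughing nervously"]
assert "[" not in text  # cues are markup, not content
```

The screenplay analogy is exactly this split: one string carries both the dialogue and the stage directions, and the model — not a second API call — performs the direction.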
Why the demo grammar mattered for distribution
Most TTS upgrades launch with side-by-side audio comparisons. They demo poorly because:
The improvement is incremental and hard to hear on phone speakers
Side-by-side requires the user to listen to both clips
The shareability is low — one clip is enough; two clips is friction
v3 launched with single-clip demos that contained the variation inside one audio file. A 15-second clip would shift between excitement, whispering, and laughter — within the same generation. The "wow" moment was self-contained.
| Demo format | Share friction | "New category" feel |
|---|---|---|
| Side-by-side audio | High | Low |
| Single clip with multiple voices | Medium | Medium |
| Single clip with audio-tag-driven emotion shifts | Low | High |
The format mattered as much as the model quality. v3 demos went viral because they fit social-platform attention spans.
The cadence: v1 to v3 in 28 months
The model release cadence shows the deliberate pacing:
| Model | Release | Months since prior |
|---|---|---|
| Beta TTS (English / Polish) | Jan 2023 | — |
| Eleven Multilingual v1 | May 2023 | 4 |
| Eleven Multilingual v2 | Aug 2023 | 3 |
| Eleven Turbo v2 | Apr 2024 | 8 |
| Eleven Turbo v2.5 | Aug 2024 | 4 |
| Eleven v3 (alpha) | Jun 2025 | 10 |
The 10-month gap between Turbo v2.5 and v3 is the longest pause in the history of the company. v3 was a generational shift, not an iteration — and the launch positioning matched: "the most expressive Text to Speech model ever."
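The month-gap column is simple calendar arithmetic; this sketch reproduces it from the release dates above:

```python
def months_between(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Whole-month gap between two (year, month) release dates."""
    return (b[0] - a[0]) * 12 + (b[1] - a[1])

# (year, month) of each model release, in order
releases = [(2023, 1), (2023, 5), (2023, 8), (2024, 4), (2024, 8), (2025, 6)]
gaps = [months_between(a, b) for a, b in zip(releases, releases[1:])]

assert gaps == [4, 3, 8, 4, 10]
assert max(gaps) == 10  # the Turbo v2.5 -> v3 pause, the longest on record
```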
The pause was strategic. Conversational AI v1 (Nov 2024) and Conv-AI 2.0 (Jun 2025) needed to be the priority through that period, because the platform pivot was the load-bearing GTM move. v3 launching alongside Conv-AI 2.0 (two days apart, June 3 and June 5) bundled the model release with the platform release in the same announcement window.
How the launch propagated
The v3 launch followed a clear distribution pattern:
| Phase | Propagation |
|---|---|
| Day 1 (Jun 5) | Launch tweet from @elevenlabsio with audio-tag demo clips; Mati Staniszewski's personal X account amplifies; AI-builder Twitter (Andrej Karpathy and adjacent accounts) re-shares |
| Days 2-3 | Hacker News front page (Eleven v3 alpha thread, hundreds of comments); Product Hunt launch (top product of the day); VentureBeat / TechCrunch coverage of the audio-tag syntax |
| Days 4-7 | Creator demos appear on TikTok and Instagram Reels; audio-tag tutorials on YouTube; Reddit r/ElevenLabs and r/MachineLearning threads |
| Weeks 2-4 | Integration into creator workflows (audiobook narrators, indie game devs); third-party tools and SDKs adopting the syntax; use-case content (best audio tags for X, Y, Z) |
The propagation worked because each platform got a different format of the same demo. X got 30-second audio threads. TikTok got 15-second creator clips. YouTube got 5-minute "how to use audio tags" tutorials. Same launch, four formats, four audiences.
The pricing trick that drove adoption
ElevenLabs offered v3 alpha at 80% off credit pricing through June 30, 2025. That's not a discount — that's a deliberate adoption forcing function.
A creator on the free or starter tier who wanted to try v3 could generate 4-5× more audio than they normally could. Heavy users who would have hit caps with v3's higher-quality output were given headroom. By the end of June, v3 was the default model in most user workflows because it had been the cheapest model.
When the discount ended on July 1, the switching cost back to older models was the cognitive cost of un-learning the audio-tag syntax. Most users stayed on v3 even at full price.
The 80%-off alpha was the cheapest cohort-acquisition campaign ElevenLabs ever ran. By the time pricing normalized, the audio-tag syntax was the user expectation.
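The forcing-function arithmetic is worth making explicit — at 80% off, each credit buys five times the audio (the "4-5×" in the text allows for tier caps and per-model cost differences):

```python
discount = 0.80                  # v3 alpha pricing through June 30, 2025
alpha_cost = 1.0 - discount      # cost per unit of audio vs full price
headroom = 1.0 / alpha_cost      # audio generated per credit during the alpha

assert round(headroom, 6) == 5.0  # 5x the audio for the same credit spend
```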
What v3 did to the competitive landscape
Cartesia, PlayHT, Resemble, and others released TTS upgrades through 2025. None matched v3's audio-tag syntax. The closest equivalent was OpenAI's voice mode for ChatGPT, which had emotional range but no developer-facing markup.
By Q4 2025, "audio tags" had become a category requirement. Vendors evaluating their roadmaps had to choose: ship a v3-equivalent syntax, or accept that ElevenLabs would be the default for emotionally rich voice work.
The Series D narrative in February 2026 leaned heavily on v3 as proof that ElevenLabs was the model leader, not just the platform leader. Sequoia's lead position in the round (vs Sequoia not leading any prior round) signaled that the model story had become investable in its own right.
Series D $500M at $11B — Sequoia Leads the IPO-Track Round (Feb 2026)
Sequoia takes the lead from a16z and ICONIQ. Mati Staniszewski tells the press the company is 'building toward an IPO.' The round prices at 1.7× the September secondary valuation, five months later.
February 4, 2026. ElevenLabs announces a $500 million Series D led by Sequoia Capital, with participation from Andreessen Horowitz, ICONIQ, Lightspeed Venture Partners, Bond, and Evantic Capital. Valuation: $11 billion.
The round marks up the $6.6B secondary valuation from the September 2025 employee tender — five months earlier — by 1.7×. ARR closed 2025 at $330M+, disclosed three weeks before the round.
CEO Mati Staniszewski tells TechCrunch and CNBC the company is "building toward an IPO."
What changed in the lead investor
Across six rounds, the lead-investor pattern shows the company's narrative arc:
| Round | Lead | Implied frame |
|---|---|---|
| Pre-seed (Jan 2023) | Credo Ventures | European seed-stage AI |
| Series A (Jun 2023) | a16z + Nat Friedman + Daniel Gross | AI-native angels + a16z |
| Series B (Jan 2024) | a16z | Unicorn growth |
| Series C (Jan 2025) | a16z + ICONIQ | Platform + strategic distribution |
| Tender (Sep 2025) | Sequoia + ICONIQ (co-led) | Bridge to growth stage |
| Series D (Feb 2026) | Sequoia | IPO trajectory |
Sequoia taking the lead — for the first time in the company's history — is the signal. Sequoia's late-stage practice (the Sequoia Capital growth fund) leads rounds for companies on credible paths to public markets. The fund's portfolio includes Stripe, Klarna, Snowflake (pre-IPO), and Datadog (pre-IPO).
The lead change is the most direct public signal that ElevenLabs is in the IPO-prep cohort.
The bundled disclosures
True to the cadence, the Series D didn't fire alone. The press window included:
| Disclosure | Detail |
|---|---|
| $500M round | Largest in ElevenLabs history |
| $11B valuation | 3.3× the Series C valuation 12 months earlier |
| $330M+ ARR | Year-end 2025 (disclosed Jan 13, 2026) |
| Nvidia investment | Re-emphasized — first announced Sept 2025; the Series D framing positioned it as an infrastructure-tier endorsement |
| Enterprise customers | Deutsche Telekom, Revolut named publicly |
| IPO commentary | "Building toward an IPO" — first explicit IPO frame |
Six news beats. One press window. Same announcement budget, multiplicative coverage — the same playbook ElevenLabs has run on every funding round since pre-seed.
Why the Nvidia investment carried so much weight
Nvidia's strategic investment in ElevenLabs was first announced in September 2025 — Tech.eu, Music Business Worldwide, and others covered it at the time, with Jensen Huang publicly endorsing the company. The Series D press cycle in February 2026 re-emphasized it as the IPO-track narrative crystallized. Three things this signals:
1. Strategic-customer / strategic-investor convergence. Nvidia uses ElevenLabs internally for audio generation. The strategic check confirmed an existing customer relationship — the September 2025 announcement was both the deal disclosure AND a customer reveal in one news cycle.
2. AI infrastructure validation. Nvidia's portfolio of strategic investments includes CoreWeave, Lambda Labs, Hugging Face, Inflection (pre-Microsoft), and Cohere. Joining that list places ElevenLabs in the AI-infrastructure category, not just voice AI.
3. Market depth check. Nvidia's diligence process is unusually rigorous. A company that passed Nvidia's strategic-investment review in late 2025 is not a year away from breaking — it's at infrastructure-grade operating maturity.
The valuation framework
The valuation expansion math from Series C → Series D:
| Date | ARR | Valuation | Multiple |
|---|---|---|---|
| Jan 2025 (Series C) | $120M | $3.3B | 27× |
| Aug 2025 | $200M | $5.3B (interpolated) | 27× |
| Sep 2025 (tender) | $200M | $6.6B | 33× |
| Dec 2025 (year-end) | $330M | $9.0B (interpolated) | 27× |
| Feb 2026 (Series D) | $330M | $11B | 33× |
The multiple stayed in a narrow band (27-33×) across 13 months. That's revenue-supported expansion, not narrative re-rating.
For comparison, public-market AI infrastructure multiples in early 2026:
Snowflake: ~12× ARR
Datadog: ~14× ARR
Cloudflare: ~18× ARR
Palantir: ~30× ARR
ElevenLabs (private): 33× ARR
ElevenLabs at 33× is at the high end of public-market multiples but inside the range. The implied IPO valuation at $500M-$1B ARR (likely 2027 timing) would be $15-25B — broadly consistent with the Series D pricing.
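Applying the private-round band (27-33×) directly to the $500M-$1B ARR scenario actually gives a wider range than $15-25B; the narrower figure implicitly assumes some compression toward the public comps above. A quick check of the bounds:

```python
def implied_valuation_b(arr_m: float, multiple: float) -> float:
    """Implied valuation in $B from ARR in $M at a given ARR multiple."""
    return arr_m * multiple / 1_000

low = implied_valuation_b(500, 27)     # conservative: low ARR, low multiple
high = implied_valuation_b(1_000, 33)  # aggressive: high ARR, full multiple

assert low == 13.5   # $13.5B floor of the uncompressed band
assert high == 33.0  # $33B ceiling of the uncompressed band
```

The article's $15-25B sits inside this band, consistent with a multiple settling between Cloudflare-like and Palantir-like levels at IPO.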
What "building toward an IPO" means in practice
Mati Staniszewski's IPO comment is unusually specific for a CEO. Most founders dodge IPO questions; saying "building toward an IPO" sets a public expectation.
Three things that change after this kind of statement:
Hiring profile shifts. CFO and Chief Legal Officer hires become priorities. Compliance, audit, and SOX-readiness work begin in earnest.
Reporting discipline tightens. ARR disclosures, customer reveals, and metric transparency become quarterly rather than ad hoc.
Strategic-investor relationships deepen. Telco / CRM strategics from the Series C become anchor customers for the IPO narrative.
The roughly five-year arc from incorporation (April 2022) to a late-2026 IPO filing window would compress a timeline that has historically averaged 8-12 years for enterprise-software companies.
What's not in the disclosure
The Series D press leaves three things deliberately unsaid:
The IPO timing. "Building toward" is open-ended. Filing window could be late 2026, mid-2027, or further out.
The competitive cost structure. ElevenLabs has not disclosed gross margin or unit economics. Voice AI compute is expensive; the margin question matters for IPO valuation.
The strategic-investor contract economics. Deutsche Telekom and Revolut are named, but contract sizes and commit periods are private.
These are the questions S-1 disclosure will eventually answer. The Series D narrative says the company is on the path. The S-1 will say whether the path is durable.