ElevenLabs Raises $2M and Announces AI Speech Platform Promising to Revolutionize Audio Storytelling
The company opens access to Beta platform designed to let creators and publishers narrate long-form content
- ElevenLabs launches Beta platform allowing creators and publishers to narrate their long-form content
- The pre-seed funding round was led by Credo Ventures, with Concept Ventures and individual investors also participating
- Capital will fuel research and development of an AI dubbing tool slated for release later this year
2023-01-23, London, UK | ElevenLabs, an AI voice technology startup, is building lifelike speech synthesis tools with the long-term goal of instantly converting spoken audio between languages. Today, the company announced the launch of its Beta platform after raising $2 million in a pre-seed funding round led by Credo Ventures, with Concept Ventures and individual investors also participating.
The company’s platform lets creators and publishers narrate long-form content and expand into the audio format. Its features are powered by a deep learning model for speech synthesis, developed in-house, that realistically renders human intonation and inflections and can adjust delivery based on context. ElevenLabs also offers a suite of tools for voice cloning and designing synthetic voices, aimed at providing users with new creative outlets. The company is currently working with selected publishers on a dedicated workstation for voiceover editing, which will be added to the platform in early February. ElevenLabs seeks to become the first AI narrator providing the quality necessary for voicing news and audiobooks at scale.
Leaps in capability require innovation, which is why ElevenLabs considers itself first and foremost a research company. Much of its research to date has focused on developing new text-to-speech models which rely on high compression and context understanding to render human speech ultra-realistically. The company also built a new model for voice cloning which achieves high output similarity without any training or fine-tuning, from samples as short as 5 seconds - a feat previously unheard of. Developers can access all these features via API.
This research powers the platform’s current features, but it also contributes to realizing the company’s ultimate goal of instantly converting spoken audio between languages. Its AI dubbing tool, slated for release later this year, will let users automatically re-voice any audio or video in a different language while preserving the original speaker's voice. ElevenLabs initially hopes to attract clients in education, while its long-term goal is to make on-demand multilingual audio support a reality across streaming, audiobooks, gaming, movies, and even real-time conversation.
The company’s speech synthesis and dubbing tools are as complementary as they are well-timed: both promise to bring audio and video to wider audiences and both come at a time when the audio space is booming. An early group of testers, among them YouTube creators, publishers and developers, already use the platform daily to voice videos, stories and characters, and the company expects that the range of potential applications for generative speech will only expand. News publishers have already found that growing their audio presence is a great way of engaging and retaining subscribers. But contracting voice actors is expensive, as is having reporters read their stories. Book and newsletter authors, and even game developers, face similar challenges: the former increasingly turn to narrating their own work and the latter need to decide whether a particular character justifies recording costs. Those who use existing text-to-speech software save money but pay a different price by compromising on quality. ElevenLabs insists there is no longer a need to compromise as it prepares to equip creators and publishers with the most advanced and versatile AI storytelling tools.
“The platform we’re launching now is all about turning text into top-quality spoken audio. We want to let people enjoy their favorite book or newsletter by giving a voice to all the authors, creators and developers who couldn’t afford one,” says Mati Staniszewski, a co-founder. “Our ultimate goal is to let people enjoy any content they find relevant and interesting, regardless of what language they speak,” adds Piotr Dabkowski, also a co-founder.
“At Credo Ventures we seek to work with smart and ambitious founders from the CEE region. We saw the hunger and spark in Mati’s and Piotr’s eyes at our very first meeting. A few months later they’re becoming an OpenAI-grade speech technology research hub overcoming the biggest challenges in artificial audio. Their synthesized voices are already indistinguishable from real ones and this breakthrough has not only massively lowered the barriers to generating content in unprecedented quality and fidelity, but soon enough it will also let creators radically expand their audiences by going multilingual,” says Maciek Gnutek, General Partner at Credo Ventures.
“Despite being commonplace across entertainment & business alike, audio has been relatively neglected by recent advancements in research. We couldn't be more excited to be backing Mati & Piotr during this golden era for generative AI, and believe ElevenLabs are the team to bring this technology to the masses, one voice at a time,” says Oliver Kicks, Principal at Concept Ventures.
ElevenLabs is a research company developing AI voice synthesis software for creators and publishers. The company’s tools render remarkably lifelike speech and can adjust intonation and inflections based either on context or user instruction. The company’s platform seeks to provide the necessary quality and versatility to become a one-stop-shop for voicing news, newsletters, books and videos. Key features include text-based speech generation, voice cloning, voice design and, soon, a project workflow for narration editing. ElevenLabs was founded in 2022 by Piotr Dabkowski, an ex-Google machine learning engineer, and Mati Staniszewski, an ex-Palantir deployment strategist. The company’s long-term goal is to make spoken content universally accessible in any language and voice.
Venture capital was provided by Credo Ventures and Concept Ventures. Individual investors include Peter Czaban, Tytus Cytowski, Talfan Evans, Dr Fatima Godall, Tomasz Karwatka, Piotr Karwatka, Akhil Paul, Bartek Pucek, Marta Pyrzyk, Carles Reina, Parin Shah, Charlie Songhurst and Harry Songhurst.