When young Luke Skywalker appeared in the final episode of “The Mandalorian,” a spin-off series of the Star Wars films, viewers were left speechless. Skywalker looked like his 28-year-old self, despite being portrayed by 68-year-old U.S. actor Mark Hamill, who has played the character since 1977.
To make the iconic Jedi look younger, director Jon Favreau used a visual effect called de-aging. “Something people didn’t realize is that Skywalker’s voice also wasn’t real,” Favreau said. It was artificially synthesized by Ukrainian startup Respeecher.
Respeecher’s team of nearly 20 people worked on the Hollywood series from their small office in Kyiv. The Ukrainian engineers obtained 40-year-old recordings of Hamill’s voice, analyzed them using artificial intelligence (AI) and generated young Skywalker’s voice. Then they replaced the voice of an actor who had read the script with the synthetic one they created.
“The fact that everyone was surprised that the voice wasn’t real means that we did everything right,” said Oleksandr Serdiuk, cofounder of Respeecher.
Apart from Lucasfilm, the production company behind Star Wars, Respeecher works with other big clients in Hollywood and smaller studios across the world. The startup reveals very few names due to non-disclosure agreements.
Although many startups, including U.S. giants such as Replica Studios, Descript and Modulate, work with voice-altering technologies, only a few have managed to produce sound that satisfies the demanding film industry.
“The Respeecher’s team is still ahead of the game,” said Bas Godska, general partner at Acrobator Ventures, the Dutch investment fund that invested in the Ukrainian firm.
The voice-tech industry has great potential, Godska said. It is expected to grow by 17% a year, attracting nearly $27 billion in investment by 2025.
“In 3–5 years it (voice conversion) will become a common tool in post-production,” said Sergii Soldatov, executive producer at Ukrainian studio Eve Production.
Respeecher’s team believes its chances of winning a big share of this market are high. “There is no company in Hollywood that makes the same quality of sound as we do,” Serdiuk said.
First steps
Serdiuk and his partner, Dmytro Bielievtsov, started to work on voice conversion, the technology that makes one voice sound like another, in 2016, when no other company offered such a service, they say. At that time, Serdiuk and Bielievtsov were working as data analysts and were passionate about the possibilities of AI. Bielievtsov, a music devotee who plays four musical instruments, decided to apply his tech knowledge to sound.
At an AI conference in Kyiv, Serdiuk and Bielievtsov met Grant Reaber, a U.S. tech enthusiast. Reaber had studied computer science, machine learning and math for nearly 15 years, and he liked the idea of voice conversion. The three founded Respeecher in 2018.
Unlike most other voice-tech startups, which turn written text into speech, Respeecher trained its algorithms to work with voice sounds only. This approach is more complicated because it requires analyzing real people’s voices, including different accents.
One of Respeecher’s first big projects was generating the voice of the 37th U.S. president, Richard Nixon. The team created it to voice an undelivered speech Nixon had prepared in case the 1969 Apollo 11 mission failed. The clip was used in the U.S. short documentary film “In Event of Moon Disaster.”
“We created a highly realistic film, in a large part due to Respeecher’s work,” said film director Halsey Burgund, according to Respeecher’s website.
Nixon delivers a speech he prepared in case the 1969 Apollo 11 mission failed. The speech was artificially created for the film “In Event of Moon Disaster” by the Ukrainian startup Respeecher.
Hollywood
Since its founding three years ago, Respeecher has worked on nearly 50 projects. The startup collaborated with Hollywood studios and produced commercials for big events.
For the 2021 Super Bowl, the annual championship game of the U.S. National Football League, Respeecher recreated the voice of legendary coach Vince Lombardi, who died in 1970. As a digital Lombardi delivered a motivational speech, his voice sounded just like it did 50 years ago.
Working with Lombardi’s audio materials was hard because they were old and damaged. But the ability to convert even poor-quality sound into realistic speech is what makes Respeecher attractive to big clients, Serdiuk said.
For now, the startup works with its clients on a project-by-project basis, but the team wants to make the technology more automated, according to Bielievtsov. That could have a great impact on the film industry, Respeecher’s investors said.
Ukrainian startup Respeecher recreated the voice of legendary U.S. National Football League coach Vince Lombardi for a Super Bowl commercial that aired in February 2021.
When a recording turns out to be flawed, it is expensive and time-consuming to bring actors back to the studio to re-record it, said Dionis Akulov, creative producer at To Be Production, a Ukrainian post-production company. With Respeecher, studios could fix the audio layer remotely.
“One day, Respeecher could become a part of a big production company,” said Roman Nikitov, co-head of venture capital at ICU, the fund that invested in Respeecher in 2020.
Respeecher has already received acquisition offers, but the startup refused them. “There are still many things we want to improve by ourselves,” Serdiuk said.
Despite the many advantages of the technology, some film industry experts are concerned it also could be dangerous. People could use voices without actors’ permission or modify them to harm someone’s reputation.
To avoid that, Respeecher gets written permission from actors whose voices are modified. It also wants to add watermarks to its audio, so that sound specialists can detect modifications that the human ear can’t recognize.
“This is a new technology, so people are scared,” Serdiuk said. “But that is true for all innovations: We first see threats and then possibilities.”
Investors’ praise
Voice conversion is an expensive technology, according to Serdiuk. But many investors still want to put money into Respeecher. Since 2018, the startup has attracted over $2 million.
“We can imagine how far this development will go, so it was a no-brainer for our fund to invest,” Godska said.
Respeecher doesn’t disclose its revenue or market value, but Serdiuk said that the startup is close to becoming profitable. It isn’t there yet because it reinvests all revenue in further development of the technology.
Investors keep supporting the firm because of high-profile clientele such as Lucasfilm and U.S. television network Telemundo. “We see that their product is in demand,” Nikitov said.
For now, Respeecher’s technology is too expensive for small studios. But the company plans to make it more affordable. That might also allow it to be used in translating foreign movies.
Instead of dubbing, Respeecher can use an original actor’s voice and make it speak a foreign language.
“We can make Brad Pitt speak perfect Ukrainian, or a Ukrainian actor speak perfect English,” Serdiuk said.
The startup is also working on other, more affordable services. It has just launched a so-called voice marketplace that sells copyrighted voices, real or artificially generated, that small companies can use in their own games or movies.
“We invest a lot in this technology,” Serdiuk said. “It will help all creative businesses to compete with big studios. So we’ll have the competition of ideas, not budgets.”
The video demonstrates Respeecher’s voice conversion tech, which allows one person to speak in the voice of another.