Speech synthesis for translation, the on-demand, artificial rendering of your speech in another language, has been around for a long time. So far, however, it has not matched the natural voice interaction of the universal translator in Star Trek. BabelOn, a San Francisco-based startup, has come up with an innovation in speech synthesis technology that lets you translate anything you say while keeping your own voice.

‘Speak Local Sound Global’ is the tagline the voice synthesis startup uses to promote its natural voice translation technology. Many devices already translate your words into another language. The reasoning goes, however, that these competing translators deliver only seven percent of the message, because communication is only seven percent about words and 93 percent about vocal, emotional, and visual elements.


BabelOn’s speech synthesis technology is designed to capture human qualities, both verbal and non-verbal, to make communication complete. It keeps the crucial non-verbal elements intact: the natural quality of the voice, the emotional dynamics, and the facial animation. Rather than translating words alone, BabelOn’s hardware and software platform carries all of these aspects across.


Imagine you are at a party in China, listening to your favourite pop star sing flawlessly in Mandarin. Or you are in an unfamiliar place, taking in the local culture and hearing people speak with their original feel, but in your native language. This is what BabelOn is meant to feel like in practical use.

How does the voice synthesis tech work?

Once the software has analysed a user’s voice attributes, the voice translation tech can be used directly in real time. The process takes place in three steps:

1. BLIP stores your audio fingerprint

First, you need to create a BabelOn Language Information Profile (BLIP). To build the profile, BabelOn’s hardware captures your voice in varied contexts. From this comprehensive linguistic sample, the tech learns your audio fingerprint with all its voice attributes. You can then use the voice synthesis system in real time whenever you need your voice rendered in another language.

2. Record the content that needs to be synthesized

Whatever the content, whether an actor performing on stage, a singer singing, or the characters in a video game, the voice synthesis platform can be incorporated into the recording process to produce the output in the desired language.

3. Voice modulation and non-verbal dynamics

To make sure that communication is complete, audio engineers use studio space to synthesize the speech and render the non-verbal attributes to match. The result is a natural-sounding interaction in another language.
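
For readers who think in code, the three steps above can be pictured as a small data flow. The Python sketch below is purely illustrative and is not based on any published BabelOn API: every name in it (Blip, Performance, SynthesizedOutput, create_blip, synthesize, and the caller-supplied translate function) is hypothetical, and the actual voice modelling and studio work are reduced to placeholders.

```python
from collections.abc import Callable
from dataclasses import dataclass, field


@dataclass
class Blip:
    """Hypothetical stand-in for a BLIP: a speaker and the contexts sampled."""
    speaker: str
    contexts: list[str] = field(default_factory=list)


@dataclass
class Performance:
    """Recorded content that should be rendered in another language."""
    text: str
    source_language: str


@dataclass
class SynthesizedOutput:
    """Translated content tagged with the speaker whose voice it should use."""
    speaker: str
    target_language: str
    text: str


def create_blip(speaker: str, samples: dict[str, bytes]) -> Blip:
    # Step 1: record which contexts (singing, conversation, ...) were captured.
    # A real profile would model the audio itself; this only keeps the labels.
    return Blip(speaker=speaker, contexts=sorted(samples))


def synthesize(
    blip: Blip,
    performance: Performance,
    target_language: str,
    translate: Callable[[str, str, str], str],
) -> SynthesizedOutput:
    # Steps 2 and 3: translate the recorded content, then attach the speaker's
    # profile so the output can be voiced and matched by audio engineers.
    if not blip.contexts:
        raise ValueError("a BLIP needs at least one voice sample")
    translated = translate(
        performance.text, performance.source_language, target_language
    )
    return SynthesizedOutput(
        speaker=blip.speaker, target_language=target_language, text=translated
    )
```

In this sketch the translation step is passed in as a function, which keeps the example independent of any particular translation service; the rendering of non-verbal attributes in step 3 is left out entirely because the article gives no detail on how it is done.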

Although the SF-based startup’s tech is not yet widely implemented, the idea for this voice synthesis process dates back to 2004. Co-founder Daisy Hamilton’s parents felt that the industry had a constant demand for dubbing but lacked a proper voice synthesis process. So they came up with BabelOn and patented it. At that point, however, no technology existed that could turn the idea into something usable in real time.

Creating your BLIP would cost around $5,000, and translating a full song would set you back around $50,000. The voice translation tech could find wide application in industries such as film and game voice-overs, artificial assistants in the phone market, medical services, and the preservation of distinctive voices for future reuse. The company plans to target the film industry, the voice-over industry, and the AI market first, where demand for the tech is currently high.