Published on 02:34 PM, March 31, 2024

OpenAI unveils Voice Engine, voice-cloning AI model

OpenAI, the company behind ChatGPT, has shared insights from a small-scale preview of Voice Engine, a voice-cloning AI model. According to OpenAI, the model uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.

According to OpenAI's blog, the company first developed Voice Engine in late 2022. Since then, it has powered the preset voices available in the text-to-speech API, as well as features like ChatGPT Voice and Read Aloud.

OpenAI has conducted small-scale tests to gather insights into the performance and capabilities of Voice Engine. Preliminary results indicate that the model can effectively replicate a speaker's voice from a brief audio sample. Voice Engine can also infuse generated speech with 'human-like' emotion and realism.

Despite Voice Engine's promise, OpenAI is taking a cautious approach to a broader release because of concerns about the potential misuse of synthetic voices. In its blog, OpenAI acknowledged the societal implications of deploying such technology at mass scale and said it wants to take a more informed approach before releasing it publicly. No release date has been announced yet.