OpenAI can recreate human voices, but won’t release the technology yet

OpenAI recognizes that the technology could cause problems if released widely, so for now it is trying to head off those problems with a set of rules. Since last year, it has been testing the technology with a select set of partner companies. For example, video synthesis company HeyGen uses the model to translate a speaker’s voice into other languages while preserving the sound of their voice.

To use Voice Engine, each partner must agree to terms of use that prohibit “impersonation of another person or organization without consent or legal right.” The terms also require partners to obtain informed consent from the people whose voices are cloned and to clearly disclose that the voices they produce are AI-generated. OpenAI also embeds a watermark in each voice sample to help trace the origin of any audio generated by its Voice Engine model.

So, as things currently stand, OpenAI is showing off its technology, but the company is not yet ready to risk the potential social chaos that wide distribution could cause. Instead, it has recalibrated its marketing to look like it is responsibly warning us all about technology that already exists.

“We are taking a cautious and informed approach to wider release due to the potential for misuse of synthetic voice,” the company said in a statement. “We hope to begin a dialogue about the responsible deployment of synthetic voices and how society can adapt to these new capabilities. Based on these conversations and the results of these small-scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

In line with its mission to deploy technology carefully, OpenAI offered three recommendations in its blog post for how society should adapt to its technology: phasing out voice authentication for bank accounts, educating the public about “the possibility of deceptive AI content,” and accelerating the development of techniques for tracing the origin of audio content, “so that it’s always clear when you’re interacting with a real person and when with an AI.”

OpenAI also says that future voice-cloning technology should require verifying that the original speaker is “knowingly adding their voice to the service,” and should maintain a list of voices that are off-limits for cloning, such as those “too similar to eminent personalities.” That kind of filtering could end up excluding anyone whose voice naturally, and accidentally, sounds too close to a celebrity or a US president.

Technology developed in 2022

According to the company, OpenAI developed its Voice Engine technology in late 2022, and many people are already using a version of it with predefined (not cloned) voices in two places: the spoken conversation mode in the ChatGPT app, released in September, and OpenAI’s text-to-speech API, which debuted last November.
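Unlike Voice Engine, that text-to-speech API is publicly available. As a rough illustration, here is a minimal sketch of what a request to it might look like using OpenAI’s Python client; the helper function is hypothetical, and the `tts-1` model and `alloy` voice parameters reflect the public API documentation, not anything stated in this article.

```python
# Minimal sketch of a call to OpenAI's public text-to-speech API
# (the "tts-1" endpoint, not the partner-only Voice Engine).
# Assumes the third-party `openai` package is installed and an
# OPENAI_API_KEY environment variable is set.

def build_tts_request(text: str, voice: str = "alloy", model: str = "tts-1") -> dict:
    """Assemble the parameters for a speech-synthesis request (hypothetical helper)."""
    return {"model": model, "voice": voice, "input": text}


def synthesize(text: str, out_path: str = "speech.mp3") -> str:
    from openai import OpenAI  # third-party client, imported lazily

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.audio.speech.create(**build_tts_request(text))
    response.stream_to_file(out_path)  # write the returned audio to disk
    return out_path
```

Note that this endpoint only offers OpenAI’s stock, predefined voices; the voice-cloning capability discussed in the article is not exposed through it.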

With all the competition in voice cloning, OpenAI says Voice Engine stands out because it’s a “small” AI model (the company doesn’t say how small). But having been developed in 2022, it seems almost late to the party. And its cloning ability may not be perfect: previous user-trained text-to-speech models, like those from ElevenLabs and Microsoft, have struggled with accents that weren’t part of their training data.

For the moment, Voice Engine remains a version limited to certain partners.

This story was originally published on Ars Technica.
