A month after giving ChatGPT a voice to go with its mostly intelligent mind, OpenAI has announced that its Voice Mode will be delayed. Rather than gatekeeping the OpenAI Voice Mode release, the company had announced that the capability would be available to all users, with premium subscribers getting first access, which was exciting for those who wanted to try conversing with the AI themselves. The announcement during the Spring Update said Voice Mode would be “coming soon,” but gave no clear picture of just when it would arrive.

Just when a late June release seemed imminent, the GPT-4o Voice Mode was delayed so the company could keep “improving the model’s ability to detect and refuse certain content.” OpenAI also appears to be working on the user experience and preparing its infrastructure “to scale to millions while maintaining real-time responses.”


Image: OpenAI Logo

OpenAI Voice Mode Delay—Do We Say Bye to Sky Permanently?

News of OpenAI’s Voice Mode being delayed doesn’t come as a shock, particularly because the feature has been a source of public contention ever since its announcement. Many fans were excited for OpenAI’s Voice Mode because it gave the chatbot a remarkably lifelike quality and an unprecedented degree of responsiveness. The performances of the five voice actors sounded more natural than anything we’d heard from assistants like Alexa and Siri so far, but the most striking detail was how the assistant could react to the world around it in real time.

However, not everyone was as taken by ChatGPT’s new ability to speak. Viewers found themselves spooked by the AI, with one question on their minds: why did we need a lifelike chatbot? OpenAI has always been less about the “Why” and more about the “Why Not,” so it wasn’t dissuaded from launching the feature, but the ChatGPT voice assistant does seem to be facing some issues.

The company had planned to introduce the feature to a small group of ChatGPT Plus users in late June, but it has now asked its customers to hold on for another month. The strategy to start with a limited group remains unchanged, and the company expects to be ready to launch the feature for its wider audience in the fall. “We are also working on rolling out the new video and screen sharing capabilities we demoed separately, and will keep you posted on that timeline,” the announcement stated.

We understand the decision to delay OpenAI’s Voice Mode to ensure a complete product is released, as safety is paramount with AI advancements. Misuse of AI has long been a topic of conversation, and OpenAI has often been accused of being too lax about the implications of the technology it was developing. Even with the competition closing in, the company is better off working out all the details first rather than rushing out a feature. There is also a possibility that the postponement of the OpenAI Voice Mode release has something to do with the company’s legal conflicts over allegedly imitating an actress’s voice for the AI.


Image: An employee at OpenAI tests the GPT-4o

OpenAI News—The Conflict-Ridden Debut of the ChatGPT Voice Assistant 

Issues with the ChatGPT voice assistant grew when users wondered whether the “Sky” voice belonged to the actress Scarlett Johansson, who had voiced “Samantha,” the digital companion in the movie Her. The actress soon announced her decision to pursue legal action against the company. She claimed that despite her turning down the company’s request to voice the AI, it went ahead and found a voice actor who could replicate her voice.

OpenAI CEO Sam Altman denied the claims and any intention to imitate her, but the Sky voice was nonetheless taken down “out of respect for her concerns.” The company’s denial that the voice was meant to duplicate hers was undermined by the CEO’s decision to tweet a single word, “Her,” after the voice feature was announced.

The implied connection with the movie Her was there for everyone to see, but OpenAI doubled down on its denial with a lengthy blog post on how the voice actors were selected from more than 400 submissions, all before the company reached out to the actress. The GPT-4o Voice Mode delay could be linked to some conflict over this issue, but that feels unlikely. Given OpenAI’s prompt response in taking down the Sky voice, there is little left the company could be required to do other than provide monetary compensation.


Image: Some fans are frustrated by the delay, forced to wait longer after the initial announcement.

The OpenAI Voice Mode Delay Is Not a Permanent End

Regardless of how the ChatGPT voice assistant issues came to be, a legal battle between the company and the actress would be bound to set a strong precedent for how such cases are dealt with for years to come. This is far from the only legal issue the company is facing. Among others, the New York Times filed a copyright infringement lawsuit against the company last year, and the case is still ongoing. More recently, Elon Musk also decided to sue the AI giant for abandoning its founding mission of serving humanity in favor of profits, though that case was withdrawn quite quickly.

As AI grows more prevalent, we’ll witness more such conflicts between AI companies and the general public, which will shape how further progress is made. A legal battle will not be enough to put a permanent end to the OpenAI Voice Mode release, and fans will be able to experience the feature as soon as it is ready.

We already have evidence of things moving forward at the company, with other OpenAI news that had nothing to do with the Voice Mode delay. The company announced that a ChatGPT app for macOS was finally available to all users, paid or otherwise. Microsoft and Google have tried to capitalize on the AI market segment for desktop users, and while ChatGPT’s browser version works just fine on any device, the desktop app will help the chatbot integrate better into the desktop experience.