Breaking Silence: OpenAI gives ChatGPT voice and image capabilities
ChatGPT, the widely loved AI assistant, is leveling up! OpenAI announced today that it is adding voice and image capabilities to ChatGPT.
This incredible AI tool, which skyrocketed to popularity just nine months ago, initially helped users generate essays, poems, and summaries from text-based prompts. Now, it’s taking interactivity to a whole new level, allowing users to engage in voice conversations with ChatGPT.
This announcement coincides with Amazon’s investment of up to $4 billion in OpenAI rival Anthropic. It’s all part of a larger battle in the world of generative AI, with tech giants like Google, Meta, and Microsoft vying for supremacy.
A Conversation Revolution
Today marks a significant milestone in the generative AI world. OpenAI is merging the familiar realm of voice-based assistants with its powerful large language models (LLMs).
Imagine asking ChatGPT to craft a bedtime story on the spot with just a few vocal cues to guide it. Or simply pose a question, and ChatGPT responds in spoken words.
Additionally, ChatGPT users will soon be able to search for answers using images. You can upload a picture and ask ChatGPT to explain it or provide instructions for a specific task.
The Power of Voice
The voice feature is driven by a new text-to-speech model that creates human-like voices from text and a few seconds of recorded speech.
OpenAI collaborated with professional voice actors to develop five distinct voices. It also uses its open-source Whisper speech recognition system to transcribe users’ spoken words into text.
Spotify joins the party as a launch partner, offering a nifty feature for podcasters. It allows them to translate their shows from English into Spanish, French, or German while retaining their unique voice.
However, OpenAI is being cautious and limiting this technology to select podcasters, including Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett.
OpenAI acknowledges the incredible potential of this voice technology for creativity and accessibility but also recognizes the risks, such as impersonation and fraud.
These features will roll out to paying Plus and Enterprise subscribers over the next two weeks. To activate voice, head to the “Settings” menu, select “New Features,” and opt into voice conversations. Then tap the headphone icon in the top-right corner and choose your preferred voice.
Initially, voice will be available as an opt-in beta on the ChatGPT Android and iOS apps, while image search will become the default on all platforms.