OpenAI is rolling out a new version of ChatGPT that lets you direct the AI bot not just by typing sentences into a text box, but by speaking out loud or by uploading an image. According to OpenAI, the new features will reach ChatGPT Plus subscribers within the next two weeks; free users and everyone else will get them “soon.”
Voice chat
The voice chat part will feel quite familiar: ChatGPT will work much like Alexa, Cortana, Google Assistant, or Siri. You tell the AI what you want at the touch of a button; ChatGPT converts your speech into text, feeds it into the large language model, gets a response, converts that back into speech, and reads the answer aloud.
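To make that loop concrete, here is a minimal sketch of the same speech-to-text, language-model, text-to-speech pipeline, written against OpenAI's public Python SDK. The model names, the "alloy" voice, and the file paths are illustrative API-level assumptions, not confirmed details of how the ChatGPT app itself is wired.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Speech to text: transcribe the user's recorded audio with Whisper.
with open("user_request.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Feed the transcribed text into the large language model.
reply = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": transcript.text}],
)
answer = reply.choices[0].message.content

# 3. Text back to speech: synthesize the reply for playback.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",  # one of the SDK's preset voices (assumed here)
    input=answer,
)
speech.write_to_file("assistant_reply.mp3")
```

In the app, steps 1 through 3 happen behind a single button press; the sketch just makes the hand-offs between the three models visible.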
Expect to see this pattern everywhere before long; OpenAI is simply an early mover. Most virtual assistants appear to be getting rebuilt around large language models, and soon we will all be carrying ChatGPT-style assistants on our phones.
OpenAI’s excellent Whisper model does most of the speech-to-text work, and the company is rolling out a new text-to-speech model that it says can produce “human-like voice from just text and a few seconds of sample speech.” You’ll be able to choose ChatGPT’s voice from five options, but OpenAI thinks the model has far more potential than that. OpenAI is working with Spotify to translate podcasts into other languages, for example, while preserving the original speakers’ voices. There are many interesting uses for synthetic voices, and OpenAI could become a big part of that industry.
Voice synthesis comes with risks
Being able to create a convincing synthetic voice from just a few seconds of audio opens the door to all kinds of anxiety-provoking situations. “These capabilities also present new risks, such as the potential for malicious actors to impersonate public figures or commit fraud,” the company says in the blog post announcing the new features. For that very reason, OpenAI says the model is not suited to broad release: it will be tightly controlled and limited to specific use cases and partnerships.
Visual support
The new visual search works a bit like Google Lens. You take a photo of whatever you’re interested in, and ChatGPT tries to work out what you’re asking about and responds accordingly. You can also use the app’s drawing tool to highlight part of the image, or pair the image with written questions to clarify your query.
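The image-plus-question pattern maps onto OpenAI's chat API as well. The sketch below assumes a vision-capable model is available through the chat completions endpoint; the model name, the question, and the image URL are all placeholders, not details from the announcement.

```python
from openai import OpenAI

client = OpenAI()

# Send a written question together with a photo in one user message.
response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                # The written question that clarifies the query...
                {"type": "text", "text": "What kind of plant is this?"},
                # ...paired with the photo the user took.
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```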
Visual search has its own potential problems. One of them is what happens when you ask the chatbot about a person: OpenAI says ChatGPT deliberately limits its “ability to analyze and make direct statements about people,” both for accuracy and for privacy reasons.
With this release, OpenAI is trying to build safer artificial intelligence by deliberately limiting what its new models can do. But that approach won’t work forever. As more people use voice control and visual search, and as ChatGPT moves closer to being a truly multimodal, genuinely useful virtual assistant, keeping those guardrails in place will only get harder.