Biggest ChatGPT Update Is Here: DALL·E 3, Voice-Chat, Image Input!

When OpenAI unveiled ChatGPT, it took the world by storm. Ability to converse to an intelligent system felt almost like a dream. As time went on we got a big update in the form of GPT-4 but some of the features unveiled in the keynote were missing. But the dream features are finally here with the release of DALL-E 3, voice-chat and image inputs. are being rolled out over the next two weeks.

DALL-E 3 ChatGPT Integration

Let’s start off with one of the biggest updates that is coming to ChatGPT in the next couple of weeks and that is the direct integration of DALL-E 3, OpenAI’s next generation image model. We were expecting a major update to DALL-E 2 as other image models, such as StableDiffusion, Midjourney had easily caught up and surpassed the model from OpenAI.

From OpenAI’s blog we got a sneak peek at what are some major upgrades that are with DALL-E 3:

DALL-E 3 can significantly pick up more details and nuances compared to DALL-E 2.
Less prompt engineering required compared to other models. Making it easier to get the image you want.
Highly accurate and photorealistic images compared to the previous model. But will still need significant testing to compare to the best models out there.
Can now display text within images.
Direct integration of ChatGPT allows users to easily tailor images that are more suited to them. This is probably the biggest advantage DALL-E 3 will have compared to the competition.
Does not generate violent, hateful, adult images.
Declines request to display images of public figures.
Copyright free, allowing anyone to use, modify and sell images generated by DALL-E 3.

Below are some of the examples of image created using DALL-E 3 and their associated prompts. Thanks to Logan for providing the images.

“*A medieval fantasy landscape, rolling green fields with a bright blue sky filled with puffy white clouds. On top of one of the clouds is a castle rising from it. A knight races across the field.”*

“*Hyper realistic photo of Amsterdam from a helicopter*.”

*“A visual representation of Schrödinger’s cat paradox, where the cat is both alive and dead, inside a box made of quantum particles.*“

Voice-Chat

ChatGPT is getting voice-chat feature, allowing users to go back-and-forth without the need of constant pressing a send button. Voice-chat feature is being rolled out to ChatGPT Plus and Enterprise users in the next two weeks. This feature is exclusive to mobile devices (iOS and Android).

On your mobile device head over to Settings > New Features and enable voice conversations. This will enable you to talk to ChatGPT and you can enable a different voices. Currently there are ‘Juniper’, ‘Sky’, ‘Cove’, ‘Ember’ and ‘Breeze’.

Behind the scenes, the audio is being converted to text using Whisper (OpenAI’s own open-source speech-text model), and put through GPT-4. Which isn’t that new but the major difference is the Text-to-Speech (TTS) model, which is a very high quality. We have seen similar TTS models from ElevenLabs and Microsoft, although ElevenLabs does offer multiple language support.

Image Inputs

ChatGPT is finally getting the image inputs which were unveiled earlier this year in March. Although not a ‘new’ feature but image inputs is going to make ChatGPT a true multi-model system.

You can upload an image directly to ChatGPT and ask questions related to it. In an example provided by OpenAI (see tweet below), a user uploads an image of a bike and asks how to lower the seat. It provides a step-by-step instruction on how to do so. But the thing that makes this amazing is that you can upload image after image and each time ChatGPT will adjust to your needs based on the questions.

As the general model is still GPT-4, it is going to refuse a lot of queries and still hallucinate. So avoid asking medical related questions and blindly following it.

ChatGPT can now see, hear, and speak. Rolling out over next two weeks, Plus users will be able to have voice conversations with ChatGPT (iOS & Android) and to include images in conversations (all platforms). https://t.co/uNZjgbR5Bm pic.twitter.com/paG0hMshXb
— OpenAI (@OpenAI) September 25, 2023

These are some incredible features which are being rolled out to ChatGPT Plus and Enterprise users over the coming weeks and personally for me, I simply can’t wait to see the incredible things people are going to be creating with these.

Biggest ChatGPT Update Is Here: DALL·E 3, Voice-Chat, Image Input!

DALL-E 3 ChatGPT Integration

Voice-Chat

Image Inputs

ChatGPT-4 Turbo, Custom GPTs: The Most Advanced LLM Gets Big Upgrade!

Leave a Reply Cancel reply

DALL-E 3 ChatGPT Integration

Voice-Chat

Image Inputs

Similar Posts

Leave a Reply Cancel reply