ChatGPT sounds more human than ever.
A Monday event from ChatGPT-maker OpenAI revealed the next stage of AI: emotionally expressive technology that adapts to users’ needs.
The big announcement at the event was a new AI model called GPT-4o, which the company says can figure out how you’re feeling from a selfie, tutor kids through math problems, and talk to multiple people without lag.
It can even handle being interrupted in the middle of a sentence and carry out real-time translations.
GPT-4o makes ChatGPT sound like a friend, and a super friendly one at that. At one point during the live demonstrations, it said, “Wow, that’s a nice shirt you’re wearing,” without any text or verbal prompt.
Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN
Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
— OpenAI (@OpenAI) May 13, 2024
The new model unites text, vision, and audio in one platform and can switch seamlessly among them, as demos at the event showed.
In one live demo, ChatGPT shifted among a singing voice, a robotic voice, and a dramatic voice while talking to Mark Chen, OpenAI’s head of frontiers research.
OpenAI just announced “GPT-4o”. It can reason with voice, vision, and text.
The model is 2x faster, 50% cheaper, and has 5x higher rate limit than GPT-4 Turbo.
It will be available for free users and via the API.
The voice model can even pick up on emotion and generate… pic.twitter.com/X8zqN9bxFp
— Lior⚡ (@AlphaSignalAI) May 13, 2024
In another demo, this one led by OpenAI post-training team lead Barret Zoph, ChatGPT acted as a tutor. Zoph turned his camera around and had ChatGPT help him with a linear equation problem. The bot even explained why math matters in the real world.
“The best thing about GPT-4o is that it brings GPT-4 level intelligence to everyone, including our free users,” OpenAI CTO Mira Murati said, pointing out that more than 100 million people use ChatGPT. “We have advanced tools that have only been available to paid users, at least until now.”
Murati said that GPT-4o will roll out to free and paid users in the coming weeks. Paying users will have up to five times the capacity limit of free ones.
All users can now upload screenshots, photos, and documents to start conversations with ChatGPT. The AI will also respond more quickly in 50 different languages and can perform advanced data analysis.
“We want to be able to bring this experience to as many people as possible,” Murati said.
OpenAI CTO Mira Murati. Photographer: Philip Pacheco/Bloomberg via Getty Images
GPT-4o is an improvement on OpenAI’s previous flagship model, GPT-4 Turbo, which the company announced in November. GPT-4o is twice as fast and half as expensive as Turbo.
App developers can also use the new model through OpenAI’s API to build custom AI apps.
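For developers, calling the new model looks much the same as calling earlier GPT-4 models through OpenAI’s chat completions API; only the model name changes. Here is a minimal sketch using the official openai Python SDK, assuming the library is installed, an OPENAI_API_KEY is set in the environment, and with an illustrative prompt:

from openai import OpenAI

# A minimal sketch, not OpenAI's official sample: send one text prompt
# to GPT-4o via the chat completions API and print the reply.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello in three languages."}],
)

print(response.choices[0].message.content)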
ChatGPT is also getting a new desktop app and a simpler, refreshed look.
Murati said at the event that it is “quite challenging” to bring new technology to the public in a way that is both safe and useful.
“GPT-4o presents new challenges for us when it comes to safety because we’re dealing with real-time audio, real-time vision,” Murati said.
According to Murati, OpenAI is working with governments, the media, and other entities to deploy the technology safely in the coming weeks.
OpenAI has just demonstrated its new GPT-4o model doing real-time translations pic.twitter.com/Cl0gp9v3kN
— Tom Warren (@tomwarren) May 13, 2024
OpenAI’s spring update event on Monday occurred one day before Google’s I/O event for developers.
Ahead of the event, OpenAI CEO Sam Altman denied reports that the company would release a Google search competitor. “not gpt-5, not a search engine, but we’ve been hard at work on some new stuff we think people will love! feels like magic to me,” Altman wrote in a post on X, formerly Twitter, on Friday.