OpenAI launches GPT-5, the next-gen AI behind ChatGPT
OpenAI rolled out GPT-5 on Thursday, marking a major upgrade for ChatGPT users. It is OpenAI’s first “unified” model, combining the reasoning abilities of its o-series with the fast responses of its GPT series.
GPT-5 isn’t just smarter—it handles tasks directly, from coding full apps to managing calendars and creating research briefs. Its new real-time router picks the fastest or smartest response mode automatically, ditching manual settings for users.
OpenAI CEO Sam Altman called GPT-5 “the best model in the world” and a “significant step” toward artificial general intelligence (AGI), saying:
“Having something like GPT-5 would be pretty much unimaginable at any previous time in history.”
Starting now, all ChatGPT free users get GPT-5 by default. OpenAI’s VP of ChatGPT Nick Turley says this brings advanced AI reasoning to free users for the first time.
“This is just one of the ways that I’m excited to live the mission, making sure that this stuff actually benefits people,” Turley said.
More than 700 million people use ChatGPT weekly—about 10% of the world. GPT-5’s performance is under close watch by tech giants, investors, and regulators, given its potential impact.
GPT-5 edges out rivals in coding, science, and health
OpenAI claims GPT-5 beats models from Anthropic, Google DeepMind, and Elon Musk’s xAI on several key benchmarks, while trailing them slightly on others.
On the SWE-bench Verified coding test, GPT-5 scores 74.9%, surpassing Anthropic’s Claude Opus 4.1 at 74.5% and DeepMind’s Gemini 2.5 Pro at 59.6%.
On Humanity’s Last Exam, GPT-5 Pro scores 42% with tools, slightly below xAI’s Grok 4 Heavy at 44.4%.
GPT-5 Pro aced the GPQA Diamond science test with 89.4%, beating Claude Opus 4.1 (80.9%) and Grok 4 Heavy (88.9%).
On the HealthBench Hard Hallucinations test, GPT-5 hallucinates just 1.6% of the time, compared with 12.9% for GPT-4o and 15.8% for o3. That suggests it is less likely to give misleading answers on health topics.
Creatively, GPT-5 “responds more naturally” and has “better taste” in design and writing, says Turley:
“The vibes of this model are really good.”
GPT-5 also cuts the hallucination rate on ChatGPT prompts to 4.8%, down from between 20.6% and 22% in prior OpenAI models.
On Tau-bench, which tests real-world web navigation tasks, GPT-5 scores 63.5% on airline websites (slightly behind o3’s 64.8%) and 81.1% on retail sites (behind Claude Opus 4.1 at 82.4%).
Safety has improved, too. OpenAI safety lead Alex Beutel says GPT-5 is less deceptive and better at spotting users who try to misuse ChatGPT:
“This improves not only the safety of GPT-5, but also the user experience, creating a model that’s more ‘transparent and honest in ways users can trust.’”
User and developer upgrades
ChatGPT users now pick from four new personalities—Cynic, Robot, Listener, Nerd—that tweak responses without extra prompts.
Plus subscribers get higher usage limits for GPT-5. Pro plan users ($200/mo) get unlimited GPT-5 and access to GPT-5 Pro, which uses more compute for better answers. Enterprise, Edu, and Team plans get GPT-5 as the default starting next week.
Developers get the GPT-5 API in three sizes (gpt-5, gpt-5-mini, and gpt-5-nano), with control over response length. Pricing is $1.25 per million input tokens and $10 per million output tokens.
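To make the per-token pricing concrete, here is a rough sketch of how a developer might estimate the cost of a single request. The two rates are the ones quoted above; the function name and the token counts are hypothetical, and the sketch assumes the quoted rates apply to the request as-is, ignoring any discounts or per-size price differences the article does not cover.

```python
from decimal import Decimal

# Rates quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = Decimal("1.25")
OUTPUT_USD_PER_MILLION = Decimal("10")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not an OpenAI API)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    return (input_cost + output_cost) / MILLION


if __name__ == "__main__":
    # Illustrative numbers only: a 2,000-token prompt with a 500-token reply.
    print(f"${estimate_cost_usd(2_000, 500)}")
```

At these rates output tokens cost eight times as much as input tokens, so long responses tend to dominate the bill.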
OpenAI also released gpt-oss, a free open-weight reasoning model with performance close to o3 and o4-mini, but GPT-5 still sets the frontier standard in some areas, such as coding.
Benchmarks paint a mixed picture overall. GPT-5 pushes the limits on some tests but is only on par with other top models on others. The real test is how developers and users apply it in the wild.
Image Credits: OpenAI