With GPT-5.4, OpenAI Promises Fewer Errors, Preps for Autonomous Agents - PCMag Middle East

ChatGPT is getting another upgrade, and this time it’s moving up to GPT-5.4, just days after the release of GPT-5.3 Instant. OpenAI says this new release brings together its “recent advances in reasoning, coding, and agentic workflows into a single frontier model.”
Some of GPT-5.4’s biggest updates are coming to professional tools, such as improvements to AI-generated spreadsheets, documents, and presentations, but it also sees improvements across search and how you can interact with the chatbot.
One big change is OpenAI’s next step toward fully autonomous, agentic technologies. The model can use a computer natively, which lets tools built on it complete complex tasks across applications. GPT-5.4 can write code to operate computers, and it can issue mouse and keyboard commands in response to what it sees in a screenshot. This means developers can use GPT-5.4 to build agents that operate other services with limited human interaction.
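To make the screenshot-driven loop concrete, here is a minimal Python sketch of how such an agent could be structured. It does not use OpenAI's API: `run_agent`, `Action`, `FakeDesktop`, and `fake_model` are hypothetical names invented for illustration, and the "model" is a hard-coded stand-in rather than GPT-5.4.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "type", "click", or "done" (hypothetical action set)
    payload: str = ""  # text to type or element to click


def run_agent(observe: Callable[[], str],
              decide: Callable[[str], Action],
              act: Callable[[Action], None],
              max_steps: int = 10) -> List[Action]:
    """Capture the screen, ask the model for the next action, perform it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = observe()        # stand-in for a real screenshot
        action = decide(screenshot)   # stand-in for a model call
        history.append(action)
        if action.kind == "done":
            break
        act(action)
    return history


class FakeDesktop:
    """Toy environment: the 'screenshot' is just a text label."""

    def __init__(self) -> None:
        self.screen = "login form"
        self.typed: List[str] = []

    def observe(self) -> str:
        return self.screen

    def act(self, action: Action) -> None:
        if action.kind == "type":
            self.typed.append(action.payload)
            self.screen = "form filled"
        elif action.kind == "click":
            self.screen = "submitted"


def fake_model(screenshot: str) -> Action:
    """Hard-coded policy standing in for the model's decision."""
    if screenshot == "login form":
        return Action("type", "alice")
    if screenshot == "form filled":
        return Action("click", "submit")
    return Action("done")


desktop = FakeDesktop()
history = run_agent(desktop.observe, fake_model, desktop.act)
print([a.kind for a in history])  # ['type', 'click', 'done']
```

The `max_steps` cap is a design choice in this sketch: it keeps a loop that runs with limited human oversight from continuing indefinitely if the model never signals that it is done.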
On OSWorld-Verified, a benchmark designed to measure an AI’s ability to navigate desktop environments, GPT-5.4 scored 75%, up from 47.3% for GPT-5.2. That also beats the average human result of 72.4% on the same benchmark.
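For a sense of scale, the reported OSWorld-Verified figures can be compared with a few lines of Python. The three scores come from the article; the helper names `point_gap` and `relative_gain` are made up for this sketch.

```python
# Scores as reported in the article (percent of tasks completed).
scores = {"GPT-5.4": 75.0, "GPT-5.2": 47.3, "human average": 72.4}


def point_gap(a: str, b: str) -> float:
    """Difference between two scores, in percentage points."""
    return round(scores[a] - scores[b], 1)


def relative_gain(a: str, b: str) -> float:
    """Improvement of a over b, as a percentage of b's score."""
    return round((scores[a] - scores[b]) / scores[b] * 100, 1)


print(point_gap("GPT-5.4", "GPT-5.2"))        # 27.7 percentage points
print(relative_gain("GPT-5.4", "GPT-5.2"))    # 58.6 percent relative gain
print(point_gap("GPT-5.4", "human average"))  # 2.6 points above the human average
```

The jump over GPT-5.2 is large, while the margin over the human average is only a few points.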
Inside OpenAI’s chatbot, you’ll get the model as GPT-5.4 Thinking, which now lets you adjust an answer while it’s still generating. If it misunderstands your question or you change your mind about the direction of a query, you can now interrupt and steer it toward the answer you want without having to start fresh. This feature is available now on Android and ChatGPT’s website, and it’s coming soon to the iPhone app.
GPT-5.4 also allows for deeper web research, which OpenAI says is particularly helpful for “highly specific queries, while better maintaining context,” especially when answering longer questions.
For professional tasks, OpenAI says it has finessed the model’s ability to create and edit documents, making the files it generates easier to read. Its internal tests found that spreadsheets generated to emulate the work of a junior investment banking analyst achieved a mean success rate of 87.3% when scored by human raters.
OpenAI also says there will be fewer errors and hallucinations with the new model, calling GPT-5.4 its most factual model yet. It says, “On a set of de-identified prompts where users flagged factual errors, GPT‑5.4’s individual claims are 33% less likely to be false and its full responses are 18% less likely to contain any errors, relative to GPT‑5.2.”
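Those percentages are relative reductions, not absolute error rates, and OpenAI's quote does not give the underlying baselines. The short Python sketch below shows how a relative reduction translates into an absolute rate. The GPT-5.2 baseline rates used here (10% and 30%) are purely hypothetical placeholders, not figures from OpenAI.

```python
def reduced_rate(baseline: float, relative_reduction: float) -> float:
    """Apply a relative reduction (for example 0.33 for 33%) to a baseline rate."""
    return baseline * (1 - relative_reduction)


# HYPOTHETICAL baselines, chosen only to illustrate the arithmetic.
baseline_false_claim_rate = 0.10     # assumed share of false individual claims
baseline_response_error_rate = 0.30  # assumed share of responses with any error

new_claim_rate = reduced_rate(baseline_false_claim_rate, 0.33)
new_response_rate = reduced_rate(baseline_response_error_rate, 0.18)

print(f"{new_claim_rate:.1%}")     # 6.7%
print(f"{new_response_rate:.1%}")  # 24.6%
```

Under these assumed baselines, a 33% relative drop takes a 10% false-claim rate to 6.7%, not to zero, so the quoted improvements reduce errors rather than eliminate them.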
GPT-5.4 Thinking is rolling out now for Plus, Pro, and Team subscribers, replacing the GPT-5.2 Thinking model. The existing version won’t go away immediately; it will be moved to Legacy Models and then removed on June 5. There’s also a GPT-5.4 Pro option in the brand’s API for those with Pro and Enterprise plans. So far, there’s no word on whether free users will be able to use GPT-5.4 Thinking.
Separately, OpenAI has also introduced a dedicated tool, ChatGPT for Excel, to make it easier to plug your spreadsheets into its models. OpenAI says it’ll help you use Excel data from workbooks to “run scenarios, and generate outputs based on cells and formulas.”
Disclosure: Ziff Davis, PCMag’s parent company, filed a lawsuit against OpenAI in April 2025, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.

I’ve worked at TechRadar, Android Police, T3, and more, where I broke many tech stories you may have read, including the return of the Motorola Razr when it first became a foldable phone. Based near London, I’ve appeared on BBC News, Al Jazeera, and other TV networks, podcasts, and radio shows as an expert on the latest tech stories and trends.

I’ve been a journalist for over a decade after getting my start in tech reporting back in 2013. I joined PCMag in 2025, where I cover the latest developments across the tech sphere, writing about the gadgets and services you use every day. Be sure to send me any tips you think PCMag would be interested in.
