OpenAI's new o3 and o4-mini AI models can now “think with images”

by admin
OpenAI CEO Sam Altman. Image: Creative Commons

OpenAI has released two new AI models, o3 and o4-mini, which can literally “think with images,” marking a major step forward in how machines understand visual content. These models, announced in an OpenAI press release, can reason about images the same way they reason about text, cropping, zooming, and rotating photos as part of their internal thought process.

At the heart of this update is the ability to blend visual and verbal reasoning.

“OpenAI o3 and o4-mini represent a significant breakthrough in visual perception by reasoning with images in their chain of thought,” the company said in its press release. Unlike past versions, these models do not rely on separate vision systems; instead, they blend image tools and text tools for richer, more precise responses.

How does “thinking with images” work?

The models can crop, zoom, or rotate an image as part of their reasoning process, much like humans do. They don't just recognize what is in a photo; they work with it to draw conclusions.

The company notes that “ChatGPT's improved visual intelligence helps you solve tougher problems by analyzing images more thoroughly, accurately, and reliably than ever.”

This means that if you upload a photo of a handwritten math problem, a blurry sign, or a complicated chart, the model can not only understand it but also break it down step by step, perhaps even better than before.
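The article describes this from the ChatGPT user's perspective, but the same image-plus-question workflow is available programmatically. A minimal sketch, assuming the OpenAI Chat Completions API's image-message format and access to the `o4-mini` model (the file name and prompt below are purely illustrative):

```python
import base64

def build_image_question(image_bytes: bytes, question: str) -> list:
    """Build a Chat Completions 'messages' list that pairs a text
    question with a base64-encoded image sent as a data URL, the
    format OpenAI's vision-capable models accept."""
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{b64}"},
                },
            ],
        }
    ]

# Sending the payload would then use the official client (requires an
# API key; shown as a comment since it performs a network call):
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   resp = client.chat.completions.create(
#       model="o4-mini",
#       messages=build_image_question(
#           open("math_problem.png", "rb").read(),
#           "Solve this handwritten problem step by step.",
#       ),
#   )
#   print(resp.choices[0].message.content)
```

The model then applies its image tools (crop, zoom, rotate) internally while reasoning; no extra parameters are needed to enable that behavior.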

Outperforming previous models on key benchmarks

These new capabilities aren't just impressive in theory; OpenAI says both models outperform their predecessors on leading academic and AI benchmarks.

“Our models set new state-of-the-art performance in STEM question-answering (MMMU, MathVista), chart reading and reasoning (CharXiv), perception primitives (VLMs are Blind), and visual search (V*),” the company noted in a press release. “On V*, our visual reasoning approach achieves 95.7% accuracy, largely solving the benchmark.”

But the models are not perfect. OpenAI admits they can sometimes overthink, leading to prolonged and unnecessary image manipulation. There are also cases where the AI may misinterpret what it sees, despite using the image-analysis tools correctly. The company also warned of reliability issues when the same task is attempted multiple times.

Who can use OpenAI o3 and o4-mini?

As of April 16, o3 and o4-mini are available to ChatGPT Plus, Pro, and Team users, replacing older models such as o1 and o3-mini. Enterprise and Education users will get access next week, and free users can try o4-mini via a new “Think” feature.


