- The Neuron
OpenAI releases their best models yet
PLUS: NVIDIA's loss and OpenAI's new buy?!


Welcome, humans.
Yesterday was Day 3 of OpenAI's ship week, and TBH, we're actually shocked OpenAI released o3 and o4-mini (more on that below) because for a minute there, this meme was starting to define all their recent releases:
IDK if y'all noticed, but this rampant personality injection has even made its way into the feedback. Example: Our editor Corey spotted new questions ranking OpenAI's "personality" instead of whether or not a result was good or bad...

That's why it's nice to see OpenAI pop out and show people they still got it with today's launch. We were getting worried we'd have to start promoting "Save the AI," a hilariously dark PETA-but-for-AI organization that calls out our reckless human habits for literally killing AI.
Every glass of water you drink? That's precious coolant stolen from data centers! Those long hot showers? You're basically waterboarding ChatGPT! And don't even think about turning on lights: AI needs that electricity to think deep thoughts about your cat photos.
The solution? Simple! Just sit thirsty in the dark with your dead phone. AI needs those resources more than you do, you selfish human!
Here's what you need to know about AI today:
OpenAI released o3 and o4-mini, their best reasoning models yet.
Google used AI to stop ad scammers.
OpenAI might buy Windsurf.
NVIDIA lost $5.5B to export restrictions on China.

OpenAI releases the full o3 and new o4-mini... their best models yet.
OpenAI's newest models, o3 and o4-mini, just launched yesterday, and they're quite the pair, combining strategic reasoning with powerful tool integration.
Unlike previous AIs that simply pattern-match, OpenAI says these systems actively strategize to solve complex problems. And best of all, they're actually more efficient.
We put together a comprehensive analysis of the launch here.
Here are the highlights:
Available now to ChatGPT Plus, Pro, and Team users and through the API, with Enterprise access next week.
Both models outperform predecessors at lower costs.
Safety evaluations show improved refusal capabilities while maintaining helpfulness.
The quality improvements extend to multiple areas:
Much better vision capabilities than the o1 teaser (e.g., research w/ vision).
Exceptional writing that's casual, but not too casual, so perfect for work.
First models with access to ALL OpenAI tools (web search, Python, image analysis, file search), which represents a major step towards model "unification" (a.k.a. GPT-5).
The benchmark results speak volumes:
98.4% accuracy on AIME 2025 math competition (o3 with Python).
99.5% accuracy on the same test (o4-mini with Python).
2700+ Codeforces Elo rating (both models), placing them among the world's top 200 competitive programmers.

Early testers report genuinely impressive applications: the models are not just smart but practical, fast enough for daily use and impressively accurate with facts. While some testers reported hallucinations, many find o3 more reliable than GPT-4o and o1 for similar tasks. For example, Dan Shipper used o3 to:
Flag conflict-avoidance patterns in meeting transcripts.
Create custom AI courses with daily reminders.
Analyze org charts to predict team strengths and weaknesses.
...And a ton more. Ethan Mollick also got early access, and used o3 to crack a business case he teaches at Wharton, create SVG images through code alone, and write a hard sci-fi space battle.
But there are some red flags, too. Transluce found that OpenAI's o3 model frequently fabricates actions it never performed (especially claiming to run code it cannot execute) and elaborately justifies these fabrications when questioned (good X thread on this).
Our take? On net, this entire week is a bullish signal for OpenAI finally releasing their next agent (A-SWE, or the research one, if not both), as these models are likely what's going to power it under the hood. It's also clear the crew is setting the stage for GPT-5.
As a reminder, by GPT-5 we mean the model that'll finally clean up this mess:

As for our full take, we still need a bit more time to test them ourselves to compare, but it looks like Gemini 2.5 finally has a real competitor... read more on the site!
Oh, and btw, we also decided it was worth giving o3 a sit-down interview... because why not? It's about time somebody did. Check it out here.

FROM OUR PARTNERS
Zero to success: What we got wrong before we got it right (live event)
Starting a company is one thing. Keeping it ALIVE is another. Vanta is hosting a webinar on April 24th, where you can join a candid conversation with...
Shaan Puri (Co-Host of My First Million).
Chase Lee (Founder and CEO of Trustpage, acquired by Vanta).
Travis Good (Co-founder and CEO of Workstreet).
In this session they'll cover:
The mistakes they made and how they'd avoid them today
Tactical advice for navigating building, hiring, and investor conversations
What technical and non-technical co-founders need to get right in the first few hires
Whether you're in your first year or gearing up for growth, this live webinar will be packed with real talk from founders who've been there.

Prompt Tip of the Day
IDK how many of you reading this are trying to go to grad school, but this is GREAT prompt advice from an ACTUAL grad admissions person on how to use AI to help you write better and actually get in. Rough draft w/ AI, then personalize it to keep it real.
For everyone else, if you missed OpenAI's 4.1 prompt cookbook from yesterday, plz check it out!

Treats To Try.
Kling 2.0 lets you control gen video using images and video clips instead of just text, with new editing features for adding, removing, or replacing elements (great demo in this video on using GPT for a base image + Kling to animate).
Mailgo automates finding and emailing prospects with AI, ensuring your messages actually reach inboxes instead of spam folders (free tier, then $15.20/month).
Cohere released Embed 4, which finds information in your business documents, no matter if they contain text, images, or tables, in any of 100+ languages.
Claude Research is like Claude's Deep Research; it finds information across your internal documents and the web, automatically conducting follow-up searches to answer your questions completely (available to Max, Team, and Enterprise plans in the US, Japan, and Brazil).
IBM's new Granite 3.3 model transcribes speech with industry-leading accuracy, solves complex math problems better than competitors, and includes special adapters that detect hallucinations in AI search results; it's free under the Apache 2.0 license (download here, try it here).
OpenAI also released the new open-source Codex CLI, which offers a lightweight terminal agent with robust security controls (watch this for more).
Infinite Reality builds immersive 3D websites and experiences that help your brand tell stories and engage audiences with no code tools.
Simular automates digital tasks on your computer through their benchmark-leading agents; check out S2 in particular (free to try).

Around the Horn.

Here's a helpful chart that compares price versus performance on the LiveBench leaderboard for non-reasoning models: Google is on the "Pareto front", where "no other model is both cheaper and better at the same time."
OpenAI could be about to buy Windsurf (the popular AI coding tool) for $3B.
Microsoft released BitNet b1.58 2B4T, an open-source 1-bit AI model that runs efficiently on CPUs instead of GPUs.
NVIDIA says it lost $5.5B from canceled China chip orders and pledged compliance with U.S. export laws amid a congressional investigation.
Related: DeepSeek could also be specifically targeted.
Google suspended 39M+ advertiser accounts in 2024 (more than 3x the previous year), removed 5B+ bad ads, and used AI-powered systems to detect fraud signals at account setup before ads were served, resulting in a 90% decrease in reported deepfake ads.
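
For the curious: the "Pareto front" idea from the LiveBench chart above can be sketched in a few lines of Python. This is only a rough illustration. The `Model` type, the `pareto_front` helper, and every number below are made up for the example and are not LiveBench data. It uses the standard dominance rule, which also treats a model as beaten when a rival ties it on one axis and wins on the other.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    price: float  # hypothetical cost, lower is better
    score: float  # hypothetical benchmark score, higher is better


def pareto_front(models: list[Model]) -> list[Model]:
    """Return the models that no other model beats on price and score."""
    front = []
    for m in models:
        # m is dominated if some other model is at least as cheap and at
        # least as good, and strictly better on one of the two axes.
        dominated = any(
            o.price <= m.price
            and o.score >= m.score
            and (o.price < m.price or o.score > m.score)
            for o in models
        )
        if not dominated:
            front.append(m)
    return sorted(front, key=lambda m: m.price)


# Made-up numbers for illustration only.
models = [
    Model("model-a", price=1.0, score=50.0),
    Model("model-b", price=2.0, score=70.0),
    Model("model-c", price=3.0, score=65.0),  # pricier AND worse than model-b
    Model("model-d", price=5.0, score=80.0),
]

print([m.name for m in pareto_front(models)])
# prints ['model-a', 'model-b', 'model-d']
```

In this toy data, model-c drops off the front because model-b is both cheaper and better. Everything left is a genuine trade-off: you only get a higher score by paying more.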

FROM OUR PARTNERS
Get 2x more done with AI-native email
Your inbox isn't just crowded, it's stealing time from future you. While you're drowning in emails, Superhuman users are saving 4+ hours weekly with emails that write themselves and intelligent filtering that actually works. Used by 60% of Forbes AI 50 companies for a reason.

Thursday Trivia
One is real, and one is AI. Which is which? (vote below!)
A.

B.

Which is AI? The answer is below, but place your vote to see how your guess compares to everyone else (no cheating now!)
Here are the results from last week's poll:

The robots took this one, y'all.
Here's what you said:
A.P. chose A: "Looks just like the cartoon, but plug is not right at the wall/panel."
S.F. chose B: "That's not what Tom & Jerry looked like." (I know, but old cartoons are weird).
G.W. chose A: "A has Tom and Jerry looking great, but the background is too modern for the show and the outlet/plug is off. B is an old school version." Nailed it!

A Cat's Commentary.


Trivia Answer: B is AI, and A is real.
That's all for today. For more AI treats, check out our website. The best way to support us is by checking out our sponsors; today's are Vanta and Superhuman.
