The Neuron
😺GPT Deep Research, tested

PLUS: o3-mini prompt tips from o3-mini...

Grant Harvey
February 04, 2025

Welcome, humans.

It looks like OpenAI’s money-hungry days might be behind it—SoftBank is willing to spend $3B per year on OpenAI products for itself and all its subsidiaries through a joint venture called SB OpenAI Japan.

The deal basically created a new tier of ChatGPT that SoftBank CEO Masayoshi Son called “Cristal Intelligence.” You get Operator, you get Deep Research, ChatGPT Enterprise, access to the API, and even custom models.

Interested? Just convince your CFO to drop $3B on a bunch of stuff you can basically get in ChatGPT Pro. Good luck!

Here’s what you need to know about AI today:

We tested Deep Research and share the results.
Meta outlined what AI systems it won’t open-source.
An AI-enhanced Beatles recording won a Grammy.
Code tweak could cut data center energy by 30%.

Who researched it better…ChatGPT or Gemini?

On Sunday, OpenAI launched Deep Research (DR for short), but we barely had enough time to cover the announcement, let alone try it out. So today, we’re putting GPT Deep Research to the test in a head-to-head battle against Gemini Deep Research across 3 key prompts—and you can vote for the winner.

These prompts were hand-picked by you—and whenever possible, we ran them exactly as you submitted them.

In our best WWE voice: “Arrrre yoooou ready to reseaaarrrch ruuuumbbbble??!”

First prompt—“the product review”: “Research the best car I could buy for 20k.”

GPT DR’s result: Very thorough. It covered most aspects of what we suggested, but it was too long to read, so we asked it “which is the best” at the end.
Gemini DR’s result: Apparently Gemini researched 481 websites—the most we’ve EVER seen Gemini comb through—to get this answer.

Oh, and here’s a lifetime ownership cost analysis between a Tesla Model 3, Toyota Prius, and a Camry—in case you were curious.

Who researched the $20k car better?

Pick your favorite of the two answers below.

Next prompt—“the how to guide”: “Best ways to start and write cold emails in times of mailbox spams and repetitive, boring intros.”

GPT DR’s take: at first, it seemed like it was working—until we realized it was totally gaslighting us. After a bit of prompting, we got it to ACTUALLY work, and when we ran the prompt a second time, it went much better.
Gemini DR: this was pretty thorough, decent advice; although I will say the templates were pretty generic, and that doesn’t really address the pain point.

Btw, you could totally upload these answers into GPT and ask it to create custom templates for your specific needs—just saying!

Who researched cold sales email better?

Pick your favorite below.

Prompt #3—“the rabbit hole”: “Research multiverse theories and draw a conclusion about whether the current state of research and theory suggests it may be true or not.”

GPT DR’s answer: Mind blown—that’s all we’ll say. Well, that and it sounds like GPT DR is leaning towards “multiverse = true.”
Gemini DR: A bit less depth, but still interesting—seems like Gemini leans towards “not true.” Fascinating!

Who researched the multiverse better?

Pick your favorite below!

Now, there were a few other searches we tried that weren’t so clear-cut…and some that Gemini just wouldn’t answer. Here’s a quick round-up of a few of those:

“What are the main strategies for reducing recidivism in the UK” (GPT’s answer, Gemini’s answer).
“How will Trump’s Presidency + DeepSeek impact AI advances” (GPT’s answer; Gemini wouldn’t touch this one).
“Explain a technical paper and do a sample implementation with sample data” (GPT’s answer—if you’re an AI researcher reading this, let us know if it worked! Here’s Gemini’s answer…)

BONUS: Here are a few GPT DR results we found from around the web:

Act like a financial analyst and write an in-depth report on NVIDIA. Not financial advice 😉
Generate a report on how to use AI as a freelance consultant.
Under what condition can you get a visa to work in Spain?

GPT DR seems to really work. It produces a ton of content in its reports; the question now is: does it actually write TOO much?

FROM OUR PARTNERS

Zep / AI That Thinks Beyond Static Data

Zep is transforming how AI agents remember and learn. Traditional systems retrieve static documents, but Zep’s temporal knowledge graph continuously evolves, tracking conversations and structured business data over time. This means more accurate responses, faster performance, and a deeper understanding of context.

With 94.8% DMR accuracy—surpassing MemGPT—Zep not only remembers across sessions but reasons over time. It also reduces token costs, making it scalable and enterprise-ready.

Say goodbye to fragmented AI memory—Zep delivers real intelligence that adapts as your data changes. Learn more

Prompt Tip of the Day

o3-mini is available to free users, which makes it an incredibly powerful tool that everybody can take advantage of (unlike Deep Research, Operator, and other tools limited to the Pro tier).

Here’s a prompt template we came up with using o3-mini to search for prompt advice on o3-mini.

And here’s what happened when we used Deep Research to ask the same thing.

We’re not entirely sure these will work for your needs, but try them out, and let us know what you think of the end result with the poll below.

Did the prompt template work?

Respond yes or no and tell us what happened.

Treats To Try.

Anthropic has a new system called a “constitutional classifier” that adds a constitution of rules over a language model to stop jailbreaks—try it out here.
Tana transforms your meetings and notes into organized, searchable tasks and projects that update automatically as you work (raised $25M).
Sonofa turns your reading materials into podcasts you can listen to anywhere.
Skyvern handles your tedious web tasks by automatically filling forms, downloading documents, and checking out purchases across any website.
Chat Thing turns your business content into 24/7 customer support bots that work across your website, Slack, and Discord.
Presentation 2.0 turns any conversation into a ready-to-share presentation, complete with slides and narration.

See our top 51 AI Tools for Business here!

Around the Horn.

Meta’s new policy document (known as the Frontier AI Framework) outlines what types of AI it won’t open-source: “high risk” systems that make cyber-crimes and biological attacks easier to carry out, and “critical risk” systems that basically guarantee a catastrophic outcome.
The Beatles won a Grammy thanks to a noise-reduction AI that helped Paul McCartney clean up an old piano demo.
Researchers in Canada showed that data centers can reduce their energy usage by 30% just from altering ~30 lines of code.
Speaking of head-to-head showdowns—check out this match-up comparing DeepSeek R1 and Qwen 2.5 across 7 prompts.
Fun fact: if you want to filter AI-generated results out of your regular internet searches, try swearing—it’s not a silver bullet for better content, but it’ll help you avoid a good chunk of AI slop!

FROM OUR PARTNERS

The Ultimate Guide to AI Agents

Build, optimize, and deploy AI agents with confidence using Galileo's comprehensive eBook. With this 100-page guide, you’ll learn how to:

Select the right agentic framework for your needs.
Evaluate and improve agent performance with proven techniques.
Identify and resolve failure points before they impact production.

Get the eBook.

Tuesday Ticker

Over the weekend, we asked what features you’d want to use OpenAI’s Operator agent for. Here are some of the top responses:

M.M: “I would like to see Operator analyze financial information for investors.”
B.S: “Tariff assessment and free trade agreements.”
J.N: “Identify the operator of a website that has no contact information published on it. Then, do lead research on that company.”
C.S: “To summarize my work every week from my transcripts from my meeting transcript AI system. It does not have a way to download them so I need it to click the ... and download each one and then I want a summary for each day by topics and then one for the week that I can send to my boss :P”