- The Neuron
GPT Deep Research, tested
PLUS: o3-mini prompt tips from o3-mini...
Welcome, humans.
It looks like OpenAI's money-hungry days might be behind it: SoftBank is willing to spend $3B per year on OpenAI products for itself and all its subsidiaries through a joint venture called SB OpenAI Japan.
The deal basically created a new tier of ChatGPT that SoftBank CEO Masayoshi Son called "Cristal Intelligence." You get Operator, you get Deep Research, ChatGPT Enterprise, access to the API, and even custom models.
Interested? Just convince your CFO to drop $3B on a bunch of stuff you can basically get in ChatGPT Pro. Good luck!
Here's what you need to know about AI today:
We tested Deep Research and share the results.
Meta outlined what AI systems it won't open-source.
An AI-enhanced Beatles recording won a Grammy.
Code tweak could cut data center energy by 30%.
Who researched it better... ChatGPT or Gemini?
On Sunday, OpenAI launched Deep Research (DR for short), but we barely had enough time to cover the announcement, let alone try it out. So today, we're putting GPT Deep Research to the test in a head-to-head battle against Gemini Deep Research across 3 key prompts, and you can vote for the winner.
These prompts were hand-picked by you, and whenever possible, we ran them exactly as you submitted them.
In our best WWE voice: "Arrrre yoooou ready to reseaaarrrch ruuuumbbbble??!"
First prompt, "the product review": "Research the best car I could buy for 20k."
GPT DR's result: Very thorough. It covered most aspects of what we suggested, but it was too long to read, so we asked it "which is the best" at the end.
Gemini DR's result: Apparently Gemini researched 481 websites (the most we've EVER seen Gemini comb through) to get this answer.
Oh, and here's a lifetime ownership cost analysis between a Tesla Model 3, Toyota Prius, and a Camry, in case you were curious.
Who researched the $20k car better? Pick your favorite of the two answers below.
Next prompt, "the how to guide": "Best ways to start and write cold emails in times of mailbox spams and repetitive, boring intros."
GPT DR's take: at first, it seemed like it was working, until we realized it was totally gaslighting us. After a bit of prompting, we got it to ACTUALLY work, and when we prompted it a second time, it went much better.
Gemini DR: this was pretty thorough, with decent advice, although I will say the templates were pretty generic, and that doesn't really address the pain point.
Btw, you could totally upload these answers into GPT and ask it to create custom templates for your specific needs. Just saying!
Who researched cold sales email better? Pick your favorite below.
Prompt #3, "the rabbit hole": "Research multiverse theories and draw a conclusion about whether the current state of research and theory suggests it may be true or not."
GPT DR's answer: Mind blown. That's all we'll say. Well, that and it sounds like GPT DR is leaning towards "multiverse = true."
Gemini DR: A bit less depth, but still interesting. Seems like Gemini leans towards "not true." Fascinating!
Who researched the multiverse better? Pick your favorite below!
Now, there were a few other searches we tried that weren't so clear cut... and some Gemini just wouldn't answer. Here's a quick round-up of a few of those:
"What are the main strategies for reducing recidivism in the UK" (GPT's answer, Gemini's answer).
"How will Trump's Presidency + DeepSeek impact AI advances" (GPT's answer; Gemini wouldn't touch this one).
"Explain a technical paper and do a sample implementation with sample data" (GPT's answer; if you're an AI researcher reading this, let us know if it worked! Here's Gemini's answer...)
BONUS: Here are a few GPT DR results we found from around the web:
Act like a financial analyst and write an in-depth report on NVIDIA. Not financial advice.
Generate a report on how to use AI as a freelance consultant.
Under what condition can you get a visa to work in Spain?
GPT DR seems to really work. It produces a ton of content in its reports; the question now is: does it actually write TOO much?
FROM OUR PARTNERS
Zep / AI That Thinks Beyond Static Data
Zep is transforming how AI agents remember and learn. Traditional systems retrieve static documents, but Zep's temporal knowledge graph continuously evolves, tracking conversations and structured business data over time. This means more accurate responses, faster performance, and a deeper understanding of context.
With 94.8% DMR accuracy, surpassing MemGPT, Zep not only remembers across sessions but reasons over time. It also reduces token costs, making it scalable and enterprise-ready.
Say goodbye to fragmented AI memory: Zep delivers real intelligence that adapts as your data changes. Learn more
Prompt Tip of the Day
Since o3-mini is available to free users, it's an incredibly powerful tool that everybody can take advantage of (unlike Deep Research, Operator, and other tools limited to the Pro tier).
Here's a prompt template we came up with using o3-mini to search for prompt advice on o3-mini.
And here's what happened when we used Deep Research to ask the same thing.
We're not entirely sure these will work for your needs, but try them out, and let us know what you think of the end result with the poll below.
Did the prompt template work? Respond yes or no and tell us what happened.
Treats To Try.
Anthropic has a new system called a "constitutional classifier" that adds a constitution of rules over a language model to stop jailbreaks. Try it out here.
Tana transforms your meetings and notes into organized, searchable tasks and projects that update automatically as you work (raised $25M).
Sonofa turns your reading materials into podcasts you can listen to anywhere.
Skyvern handles your tedious web tasks by automatically filling forms, downloading documents, and checking out purchases across any website.
Chat Thing turns your business content into 24/7 customer support bots that work across your website, Slack, and Discord.
Presentation 2.0 turns any conversation into a ready-to-share presentation, complete with slides and narration.
Around the Horn.
Meta's new policy document (known as the Frontier AI Framework) outlines what types of AI it won't open-source: "high risk" systems that make cyber-crimes and biological attacks easier to carry out, and "critical risk" systems that basically guarantee a catastrophic outcome.
The Beatles won a Grammy thanks to a noise-reduction AI that helped Paul McCartney clean up an old piano demo.
Researchers in Canada showed that data centers can reduce their energy usage by 30% just from altering ~30 lines of code.
Speaking of head-to-head showdowns, check out this match-up comparing DeepSeek R1 and Qwen 2.5 across 7 prompts.
Fun fact: if you want to filter a good chunk of AI-generated results out of your regular internet searches, try swearing. It's not a silver bullet for better content, but it'll help you avoid a lot of AI slop!
FROM OUR PARTNERS
The Ultimate Guide to AI Agents
Build, optimize, and deploy AI agents with confidence using Galileo's comprehensive eBook. With this 100-page guide, you'll learn how to:
Select the right agentic framework for your needs.
Evaluate and improve agent performance with proven techniques.
Identify and resolve failure points before they impact production.
Tuesday Ticker
Over the weekend, we asked what you'd want to use OpenAI's Operator agent for. Here are some of the top responses:
M.M: "I would like to see Operator analyze financial information for investors."
B.S: "Tariff assessment and free trade agreements."
J.N: "Identify the operator of a website that has no contact information published on it. Then, do lead research on that company."
C.S: "To summarize my work every week from my transcripts from my meeting transcript AI system. It does not have a way to download them so I need it to click the ... and download each one and then I want a summary for each day by topics and then one for the week that I can send to my boss :P"
A Cat's Commentary.
That's all for today. For more AI treats, check out our website. See you cool cats on Twitter: @noahedelman02