• The Neuron
  • Posts
  • šŸ˜ŗ How to pick the best AI model for what you ACTUALLY need...

šŸ˜ŗ How to pick the best AI model for what you ACTUALLY need...

PLUS: AI's are doing AI research now?!

Welcome, humans.

People are getting early access to Google Searchā€™s AI Mode, and itā€™s really interesting to watch in action.

Some are saying this is basically Googleā€™s Perplexity killer. And if you add this to the success of Gemini 2.5 Pro, which Google is giving away for free rn, it looks like Google is finally becoming the threat OpenAI was created to preventā€¦

Hereā€™s what you need to know about AI today:

  • We break down how to pick the best AI model.

  • OpenAI launched PaperBench to test AI research replication.

  • Google released AGI Safety report predicting AGI by 2030.

  • Wikimedia traffic rose 50% since January from AI scraping.

Here's how to pick the best AI model for what you actually need

Tired of playing AI model musical chairs? One week Claude's the best, then it's ChatGPT, then suddenly Gemini's crushing benchmarks (welcome to the wild world of AI, folksā€”or as we call it, ā€œTuesdayā€).

With all the constant changes, how do you know which AI to use when? We actually just watched Tina Huang's hour long interview with Louie Peters (CEO of Towards AI) where they tackled this exact question, and the advice was pretty solid. 

First things firstā€”it all depends on what you're trying to accomplish:

  • Solving complex reasoning problems that need high accuracy?

  • Processing massive documents with over 700K words?

  • Just chatting casually and need something fast and cheap?

  • Building enterprise solutions that need self-hosting?

These all require different AI strengths. Here's a few of Louie's pro tips on how to pick the right model for the job:

  • Match functionality to your needs: Choose models with capabilities (images, audio, etc.) that fit your specific tasksā€”for beginners, start with ChatGPT 4o. 

  • Check context window size: For long documents or complex instructions, models like Gemini 2.5 Pro offer up to 1M tokens.

  • Use benchmarks wisely: Check the metrics most relevant to your use case (math, coding, writing)ā€”more on that below.

  • Calculate your ROI: Expensive models (like o1 Pro) are worth it only when reliability saves more time than they cost.

  • Experiment regularly: Build your own intuition by testing multiple modelsā€”Louie personally uses 5-6 different models 20-30 times a day.

Louie also shared his own breakdown of which models are his favorite atm:

So what tools can help you actually implement this advice?

You could test every model individually (time-consuming but thorough)ā€”or you can use OpenRouter, which lets you test multiple models with the same prompt at once. Just sign up, add funds for premium models, and start comparing results side-by-side.

Another option is checking benchmarks like Live Bench, but remember that AI companies know how to game these tests.

Our favorite approach? Use the site Artificial Analysis, which puts every AI model through standardized tests covering intelligence, speed, cost, and specialized skills.

Their latest rankings show:

Fun fact: they also rank speech to text, image, and video models, too!

Now, the above ranking could change. Like, tomorrow. So ultimately, we recommend you go with whichever one consistently works the best for you.

You donā€™t always need the smartest modelā€”you just need the one that gets the job done.

After all, choosing an AI is surprisingly personalā€”itā€™s not unlike choosing your friends (or more appropriately, your cybernetic coworker). After all, if you're going to spend a good chunk of your day ā€œchattingā€ with something, the vibes do kinda matter. 

FROM OUR PARTNERS

When your AI needs the best ears in the business... šŸ‘‚

Frustrated when voice AI constantly misunderstands you? Speechmatics fixed that.

While others rush to make AI talk, Speechmatics has solved what matters first: making it truly listen. 

Their real-time speech tech delivers 90%+ accuracy in under one second across 55+ languages, diverse accents, and dialects ā€“ a full 25% more accurate than competitors, even in noisy environments.

Whether it's AI assistants, customer service, or medical transcription, Speechmatics ensures AI catches every word the first time.

No more ā€œcan you repeat that?ā€ā€”just AI that keeps up, not catches up.

Prompt Tip of the Day

When youā€™re trying to condense something, try this prompt: ā€œFirst, give me a shortened version, in <short version>, keeping all the same specificity and context of the original. When thatā€™s done, write an even shorter version, in <even shorter>.ā€

Itā€™s sort of like adding a built-in editor for your AI writing (demo).

Another helpful tip? If you want the AI to write more visually, try: ā€œmake it more concrete (show, donā€™t tell).ā€ Also, you can ask it to use more ā€œimage wordsā€ā€”but you might want to add something like: ā€œDon't use metaphors, just use picture words that the user can see.ā€ (demoā€¦maybe I probably shouldā€™ve used that version, huh?).

Treats To Try.

  1. Claude for Education is a new resource for schools that helps you enhance teaching and learning with specialized features for Claude like Learning mode that guides student reasoning rather than giving answers outright (more).

  2. Actively AI researches, understands, and reasons about potential customers to maximize revenue quality and pipeline growth (raised $22M).

  3. GenSpark is a new agent out of China that completes tasks for you through a mixture-of-agents system with fewer hallucinations than competitorsā€”demo.

  4. DeepSite is a totally free vibe-coding app you can use to help code a website (powered by DeepSeek)ā€”we used it to make this.

  5. Subscription Day tracks all your subscription payments in your menu bar, showing upcoming charges on a calendar and alerting you before payments are due (Mac only rn).

  6. Recall connects what you're currently reading with content you've previously saved, instantly showing you where you've seen similar information before.

  7. ElevenLabs now has a text to bark model for dogsā€¦ whereā€™s the cat one, huh??

Around the Horn.

  • Google replaced the current leader of its consumer AI apps with the leader of Google Labs and helped launch the viral AI research tool NotebookLM.

  • OpenAI released PaperBench, a benchmark that evaluates AI agents' ability to replicate state-of-the-art AI research papersā€”so far, the best agent tested only achieved 21% replication accuracy.

  • Google published a 145 page report on the companyā€™s approach to ā€œAGI Safetyā€ and predicts AGI could arrive by 2030.

  • Wikimedia traffic surged 50% since January 2024 due to AI crawlers scraping content.

  • Researchers from Hong Kong introduced Dream 7B, ā€œthe most powerfulā€ open diffusion model (which means it generates text sorta like painting).

FROM OUR PARTNERS

On-device AI. No cloud. No GPUs.

Mirai is building the infrastructure for on-device AI, enabling dev teams to run small language models directly on iOS.

Locally. Fast. Private.

Their engine supports a wide range of architectures, including Llama, Gemma, Qwen, VLMs, and RL over LLMsā€”making advanced AI capabilities accessible on mobile devices. Yeah, pretty cool. 

Thursday Trivia

One is real, and one is AI. Which is which? (vote below!)

A.

B.

Which is AI?

The answer is below, but place your vote to see how your guess compares to everyone else (no cheating now!)

Login or Subscribe to participate in polls.

Here are the results from last weekā€™s trivia (A was AI):

Hereā€™s what you said:

  • L.G. chose A: ā€œA. is AI - it's almost perfect - but the logo isn't 100% on point.ā€

  • T.R. chose B: ā€œThe gibberish letters on the cap lead me to think B is AI-generated, given its historical difficulty with text in images.ā€

  • D.F chose A: ā€œShallow depth of field gives it away... It's very common in AI image generation.ā€

A Cat's Commentary.

Trivia answer: B is AIā€¦

Thatā€™s all for today, for more AI treats, check out our website.

The best way to support us is by checking out our sponsorsā€”todayā€™s are Speechmatics and Mirai.

What'd you think of today's email?

Login or Subscribe to participate in polls.