Start With Why

Plus: Gemini beats GPT-4o and Claude 3.5 Sonnet, Google's monopoly and much more...

Luca Bersella
August 11, 2024

Hey everyone,

This is your Sunday Space, where I serve up the best ideas, tools and resources I’ve found each week as we explore the technology shaping the future.

If you find something thought-provoking, forward it to a friend.

IDEAS
Start With Why

Source: Kingsmaker

Last week, I wrote about how successful people focus on asking the right questions.

Focusing on the question over the answer is all well and good. But where do we start? How do we know how to ask the right questions?

Well, there’s a simple framework I use, and it happens to steal the title of a famous book. Sorry, Simon:

Start With Why.

All great questions start (literally) with the word “why.”

One can imagine Steve Jobs, while designing the iPhone, asking questions like:

Why do mobile phones have to have physical keyboards?
Why do all these phones have to flip up and down?
Why shouldn’t we make it fully touchscreen?

To create something new, challenge the status quo or overcome a seemingly insurmountable obstacle, we must be cunning, creative and conniving.

The way we do that is by starting our questions with “why.”

(Click here to share this idea on X/Twitter.)

RESEARCH
Gemini 1.5 Pro > GPT-4o

In recent weeks, Gemini 1.5 Pro (Google’s frontier LLM) has leapfrogged GPT-4o and Claude 3.5 Sonnet to number one on the LMSYS leaderboard.

Source: LMSYS Chatbot Arena Leaderboard

I find this incredibly surprising, given that my own experiences with Gemini 1.5 Pro were horrendous.

Constant hallucinations and being told it can’t produce images (even though it’s supposedly multi-modal) were disappointing, to say the least.

However, I’m not against changing my mind.

So when I came across Ruben Hassid, a popular creator on X, pitting GPT-4o against Gemini 1.5 Pro to see which performed best, I was intrigued.

Ruben put the two models through five tests, focussed mainly on content creation:

Generating threads for X based on an initial post.
Generating content calendar ideas for a makeup brand.
Brainstorming content ideas.
Generating the perfect hook.
Creating a storytelling template.

I’m shocked to say that after looking at the outputs, I agree with his verdict. It was a landslide for Gemini.

I highly recommend having a look at the outputs to judge for yourself.

Even though I’ve doubted Gemini and had bad experiences, I’m reconsidering my opinion…

I plan to do some tests of my own between GPT-4o, Gemini, and Claude 3.5 Sonnet (see below) in the coming weeks.

And, of course, you’ll be the first to hear the results of the deep dive.

AI Word of the Day

Multi-modal

Created with Midjourney

This is the ability of AI models to process and generate several types of data, such as text, images, audio, and video, unlike traditional models that only handle text.

This is achieved through different modality encoders that convert various data into a unified format. This enables the model to analyse and produce outputs considering all inputs together, expanding its applications and capabilities.
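For the curious, here is a minimal Python sketch of that idea. It is a toy illustration, not how any real model works: the encoder functions, the `EMBED_DIM` size and the `fuse` helper are all hypothetical names I've made up, and real encoders are learned neural networks rather than simple arithmetic. The point is only to show two different input types landing in the same fixed-length format so they can be considered together.

```python
from typing import List

EMBED_DIM = 4  # hypothetical size of the shared embedding space


def encode_text(text: str) -> List[float]:
    """Toy text encoder: folds character codes into EMBED_DIM buckets."""
    vec = [0.0] * EMBED_DIM
    for i, ch in enumerate(text):
        vec[i % EMBED_DIM] += ord(ch) / 1000.0
    return vec


def encode_image(pixels: List[List[int]]) -> List[float]:
    """Toy image encoder: folds grayscale pixel values (0-255) into EMBED_DIM buckets."""
    vec = [0.0] * EMBED_DIM
    flat = [p for row in pixels for p in row]
    for i, p in enumerate(flat):
        vec[i % EMBED_DIM] += p / 255.0
    return vec


def fuse(embeddings: List[List[float]]) -> List[List[float]]:
    """Stack per-modality embeddings into one sequence of same-sized vectors."""
    if any(len(e) != EMBED_DIM for e in embeddings):
        raise ValueError("every embedding must have EMBED_DIM entries")
    return list(embeddings)


# A caption and a tiny 2x2 grayscale image end up in the same format.
tokens = fuse([encode_text("a cat"), encode_image([[0, 255], [128, 64]])])
```

Once both inputs share a format, a downstream model can treat the text vector and the image vector as parts of one input, which is the "considering all inputs together" step described above.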

INSIGHTS
1 Article

Google Illegally Monopolised Search

Source: Bloomberg

This week, a judge ruled that Google illegally monopolised the search market by paying $26 billion to the likes of Apple, ensuring it remained the default search engine on smartphones. This ruling is the US government’s first major win in an antitrust case against a tech giant in over two decades. Further trials will determine the remedies, which could include breaking Google up.

Luca’s take: This ruling could be as pivotal as the Microsoft antitrust case in 2000, which caused the company to miss not one but two technology waves—social and mobile. It’s very possible Google could be similarly restrained.

AI will inevitably transform search; the question is whether Google will have a say in it. Only time will tell.

1 Post

Current situation:
1. Stocks are falling like we are entering a recession
2. Gold prices are falling like nothing is wrong
3. Bonds are rising like rate cuts are on the way
4. Oil prices are rising like rate cuts got cancelled
5. Crypto is falling like we are in a bear… x.com/i/web/status/1…
— The Kobeissi Letter (@KobeissiLetter)
3:14 PM • Aug 2, 2024

1 Video

TOOLS
Claude 3.5 Sonnet

Source: Anthropic

Regulars here will know that I’m a ChatGPT die-hard. It was my first love when it came to LLMs.

However, I have a confession to make—I’ve been seeing other bots.

Particularly since X has been abuzz with claims that Claude 3.5 Sonnet, Anthropic’s most intelligent model, is far superior to GPT-4o.

Naturally, I was intrigued. So, I’ve started diverting some of my workflows to Claude to see if the hype is real.

While I can’t say for sure yet which model is more capable, I can say that Anthropic is nailing the AI user experience.

Firstly, it’s clear Anthropic has leaned into prompt engineering, and that comes through in their example suggestions and prompt builder in the console.

Their suggested prompts are awesome.

Claude.ai

Secondly, my personal favourite feature is what Anthropic calls “Artifacts.” These are standalone pieces of content Claude generates, such as documents or code snippets, that can be edited, shared, and reused separately from the main conversation.

Source: Anthropic

Now, this is what I call intuitive AI UX. It's another personal “wow” moment from tinkering with LLMs.

I highly recommend trying Claude 3.5 Sonnet for your AI workflows and projects. You can use it for free here.

THOUGHTS
Quote I’m Pondering

David Sacks, co-founder of Glue.ai and bestie on the All-In podcast, on innovation in the era of LLMs:

“Once you figure out where the AI model’s innovation ends, you’ll know where your product’s innovation begins.”

(Click here to share this on X/Twitter.)

What did you think of today's edition?

How can I improve? What topics interest you the most?

Was this email forwarded to you? If you liked it, sign up here.

If you loved it, forward it along and share the love.

Thanks for reading,

— Luca

P.S. Olympia Lightning Bolt…

*These are affiliate links—we may earn a commission if you purchase through them.
