LLMs are like pizza, and that's OK

Let's get pizza. Cheese? Meat lovers? Vegetarian? Want pineapple? Deep dish, giant slice, or fully loaded with the lot? There are enough takes on pizza to please every taste, and the existence of one type of pizza doesn't devalue any other. Pizza is a food genre.

Say you want a car. Big SUV? Small hatchback? There's a company that will sell you one of those. What a car is is well defined, constrained by roads and by other cars.

Want to take a flight somewhere? All the airlines are substantially similar, and so are the planes. What an airline is, what a plane is, who supplies them, what they cost, and what a ticket will cost you have all converged around a few conventions that work for most of us.

Each is an example of a market so massive, with demands so diverse, that it is very hard for one player to gain a monopoly, take over all competitors, and serve everyone. The commoditization is a strength: anyone is free to set up in competition, and customers already understand the product, so newcomers need not explain what they're doing - only why their product is superior.

By contrast, software generally favors winner-take-all dynamics: the Internet offers global distribution (very hard for a pizza store, or even for a chain like Pizza Hut). Software weighs nothing and takes up less and less space every year, so a single product can pile on enough features to genuinely please everyone. This has led to an assumption that every category of software must be winner-take-all.

The winner-take-all dynamic does not hold for Large Language Models.

An LLM is a tool, used as part of a bigger whole - either part of an AI product or part of your workflow. That makes it sticky - hard to displace once it's there. Every LLM has different vibes; you get used to what you're used to and it's hard to change. So, you tend to stay with what you know.

Most LLM subscribers are individuals - especially for ChatGPT. People are creatures of habit. Millions of subscribers are paying; it's a quiet phenomenon, and it's bigger than you might think. It does make intuitive sense: once you realize how much you can do with a tool that costs $20 per month, you're hooked - you're not giving that up. Millions of subscribers can cover the massive capital outlay being used to train these models, without the risk of depending on a single large customer.

And when you build a product on a specific LLM, a new version of a model may be worse in ways that matter to you - and the evals will show it's not worth switching. So you stay put. Hence, sticky.
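
What does "the evals show it's not worth switching" look like in practice? Here's a minimal sketch in Python - the model names and the call_model() stub are hypothetical stand-ins for whatever client and test cases your product really uses:

```python
# A minimal sketch of a "switch or stay" eval, not a real harness.
# call_model() is hypothetical: in a real product it would wrap whatever
# LLM client you actually use; here it returns canned answers so the
# script runs standalone.

CASES = [
    # (prompt, keyword that must appear in a passing answer)
    ("What is the capital of France?", "paris"),
    ("Answer yes or no: is pineapple a valid pizza topping?", "yes"),
]

def call_model(model_name: str, prompt: str) -> str:
    canned = {
        "incumbent-model": "Paris. Yes, pineapple is valid.",
        "candidate-model": "The capital is Paris. No.",
    }
    return canned[model_name]

def pass_rate(model_name: str) -> float:
    # Fraction of cases where the expected keyword shows up in the output.
    hits = sum(
        keyword in call_model(model_name, prompt).lower()
        for prompt, keyword in CASES
    )
    return hits / len(CASES)

incumbent = pass_rate("incumbent-model")
candidate = pass_rate("candidate-model")
print(f"incumbent: {incumbent:.0%}, candidate: {candidate:.0%}")
if candidate <= incumbent:
    print("Candidate is no better on what we care about - stay put.")
```

If the shiny new model loses on even a couple of the cases your users actually hit, the switch rarely feels worth it - which is exactly the stickiness at work.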

So all this is to say LLMs are:

  • Like pizza: everyone has a favorite flavor
  • Like cars: there's a narrow set of categories, constrained by what the technology makes possible - for LLMs it's pretty much voice interaction, image and video understanding, and of course text; that's all there really is under the hood
  • Like airplanes and airlines: hyper-refined, hyper-efficient, and commoditized, with every provider racing to give you everything they can to keep you subscribed, through a similar interface that follows a set of conventions, in a world of minimal differentiation

The arms race we've seen has been insanely good for consumers. It shows no sign of ending, either - short of a breakthrough that makes current architectures obsolete.

And since no one wants their product to be a commodity, big money and energy (figuratively and literally) are being spent by OpenAI and Anthropic to differentiate. Both are racing "up the stack" - moving beyond "just a chatbox" and adding better and better features (Claude Artifacts, Projects, and Tool Use; ChatGPT Canvas, o1, and Structured Outputs) - and they are just getting started. Fierce competition with other providers - but mainly with open-source models and the communities building tools like ollama - is driving this acceleration. Another way commoditization has been insanely good for us, the users!
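
If you want to feel how commoditized the open-source side already is, here's a minimal sketch of asking a locally run model a question through ollama's HTTP API - assuming an ollama server is running on its default port and a model has been pulled ("llama3" here is just an illustrative name):

```python
# Minimal sketch: one question to a locally served open-source model
# via ollama's HTTP API. Assumes `ollama serve` is running on its
# default port (11434) and a model has already been pulled with
# `ollama pull llama3` - the model name is illustrative.
import json
import urllib.request

payload = json.dumps({
    "model": "llama3",
    "prompt": "In one sentence: in what ways are LLMs like pizza?",
    "stream": False,  # ask for a single JSON reply instead of a stream
}).encode()

request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read())["response"])
```

A weekend tinkerer can stand this up for free, which is precisely the pressure keeping the big providers racing up the stack.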

The enormous size of the AI market and the diversity of vibes make it very hard for one player to fully cater to everyone - and even if we end up with a duopoly, Meta seems keen on keeping the big players honest with its Llama releases, and other labs, like the Qwen team, are never far behind. May the commoditization continue!

To close, here's Claude's take:

Me: In what ways are LLMs like pizza?
Claude: Let me think about this creative analogy!

Both LLMs and pizza have:

1. Layers that work together: Pizza has crust, sauce, cheese, and toppings; LLMs have layers of neural networks processing language.

2. Quality depends on ingredients/training data: A pizza is only as good as its ingredients; LLMs can only be as reliable as their training data.

3. Come in different varieties for different needs: You have thin crust, deep dish, Neapolitan, etc.; there are different LLM architectures and sizes (like GPT, PaLM, etc.)

4. Work best when there's good balance: A pizza needs balanced toppings and proportions; LLMs need balanced training and proper parameters.

5. Can be reheated/fine-tuned: Pizza can be reheated for later use; LLMs can be fine-tuned for specific tasks.

And ChatGPT's answer for the same question:

Me: In what ways are LLMs like pizza?
ChatGPT: LLMs (like ChatGPT) and pizza actually share some fun, quirky similarities:

1. Customizable Ingredients: Just as you can pick toppings for a pizza, LLMs can be fine-tuned with different "ingredients" (data sources, parameters, training techniques) to cater to specific needs. Want an LLM specialized in medical information? That’s like a veggie pizza for a health-conscious crowd. Or maybe one for coding? Think of it as a loaded meat lover's pizza, rich with all kinds of programming "toppings."

2. Diverse Flavors (Models): Pizza comes in different styles—New York, Neapolitan, Chicago deep-dish—and similarly, LLMs come in various architectures and styles (e.g., GPT, BERT, LLaMA). Each has its own strengths, "flavors," and suited applications.

3. Complexity in Simplicity: At its core, pizza is just dough, sauce, and cheese, yet there’s an endless variety in how it’s made. Similarly, LLMs are fundamentally built on the principles of language modeling, but the layers of training, parameter tweaking, and tuning create vastly different outputs.

4. Universal Appeal with Regional Variants: Like how pizza is loved worldwide but adapted regionally (think pizza Margherita in Italy, deep dish in Chicago, and sushi pizza in Japan), LLMs are deployed globally, with tweaks and adjustments to handle different languages, cultures, and preferences.

5. Easily Served and Shared: A pizza is meant to be shared with others, and similarly, LLMs are increasingly designed to be accessible and available to all, from individuals to enterprises, democratizing AI knowledge much like pizza brings people together.

6. Sometimes Controversial Toppings: Just as pineapple on pizza divides opinions, some LLM outputs or applications spark debate. Ethical concerns, content moderation, or even the choice of data can make people feel strongly, just like the "Is pineapple okay on pizza?" debate.

7. Both Require the Right Oven (or Server) Conditions: Pizza needs a hot oven to cook properly, while LLMs require powerful servers and GPUs to run efficiently and provide smooth responses.

8. Endlessly Satisfying, Yet Open to Innovation: No matter how long pizza has been around, there’s always a new twist (stuffed crusts, cauliflower bases, etc.). Similarly, LLMs keep evolving, with updates in capabilities, user experiences, and use cases.

So, just like pizza, LLMs have something for everyone and can always surprise you with a new "flavor"!

It's wonderful that pizza isn't patented and can be remixed and reinvented; and it's wonderful that the market for AI and LLMs (currently) works the same way, even though LLMs are a far less mature product category.

I love chicken tikka masala pizza, which was absolutely never gonna happen unless pizza was remixable by anyone. It'll be exciting to see what the LLM race gives us as the future unfolds.