Happy Sunday and welcome to InsideAI weekend! I'm Rob May, CTO at Dianthus and an active AI angel investor. If you have an interesting AI startup, please send it my way. Also, I'm moving over soon to a new newsletter, Investing in AI, on Substack. There are a few posts already written there, but I won't start a regular cadence until I'm done here at the end of the year. The goal is to do something more long-form and insightful: now that AI is much better understood, the shorter-form insights I've provided here over the years are less necessary.
Also, I've finally restarted my podcast. Check out this episode with Mady Mantha from Happy Pillar on how they use NLP to help parents interact better with their kids.
This week I want to talk briefly about foundation models. One way to think about foundation models is as key pieces of AI infrastructure. Just as most software applications are built on top of a database, many AI applications (particularly in NLP, at the moment) are built on top of a foundation model. (Think of this as a structural explanation, not a functional one - foundation models are nothing like databases in how they actually work.)
Foundation models are things like GPT-3. They are trained on large datasets and can perform a wide range of tasks, and that range keeps growing. I wrote about GPT-3 in particular about 18 months ago, before the term "foundation models" was coined. But I saw it coming (which is why you read this newsletter, right?).
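To make the database analogy concrete, here's a rough sketch of what "building on top of a foundation model" looks like in code. This is purely illustrative, not any particular startup's stack: it assumes the GPT-3-era OpenAI completions endpoint and model name (these vary by API version), and the helper functions are hypothetical. The point it shows is that two very different application features can sit on the same underlying model, with only the prompt changing.

```python
import os
import requests

# Illustrative only: endpoint and model name assume the GPT-3-era
# OpenAI completions API and may differ across API versions.
API_URL = "https://api.openai.com/v1/completions"
API_KEY = os.environ["OPENAI_API_KEY"]

def complete(prompt: str, max_tokens: int = 100) -> str:
    """Send a prompt to the foundation model and return its completion."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "text-davinci-002",
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": 0.0,
        },
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"].strip()

# Two different "applications," one model: only the prompt changes.
# This is what makes the model feel like infrastructure, the way a
# database underlies many otherwise unrelated applications.
def summarize(document: str) -> str:
    return complete(f"Summarize the following in one sentence:\n\n{document}")

def classify_sentiment(review: str) -> str:
    return complete(
        f"Label the sentiment of this review as Positive or Negative:\n\n"
        f"{review}\n\nSentiment:",
        max_tokens=5,
    )
```

Note that neither "application" involves training a model at all - the expensive part is owned by whoever built the foundation model, and the application developer just rents it, which is exactly the dynamic discussed below.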
When GPT-3 was released, some people thought it was game over for NLP startups. In my opinion, though, the right way to think about GPT-3 is as a piece of platform infrastructure, akin to AWS. That isn't a perfect analogy, but I'll explain below why I see GPT-3 as the real beginning of opportunities in NLP, not the end.
The question is - how far does this go? Will there be many more foundation models, or few? For key areas like NLP and machine vision, the market will most likely be an oligopoly owned by the big tech companies, because these models are expensive to train and require data collection capabilities beyond the reach of most organizations.
Where else will we see foundation models? Does AlphaFold count as one? Will there be one (or more) for AI-powered chemistry? One for accounting, finance, and taxes? Will we see foundation models in areas of healthcare or law in some way?
I'll start with the entrepreneur perspective and say that having the data to train a large model in a specific vertical doesn't mean your business should be making a foundation model. If you have expertise in an industry, and access to, and understanding of, specific data in that industry that isn't common, it may be more profitable to build a full-stack application and use your model as part of it. The time to make the business a foundation model is, in my opinion, when you see the following characteristics:
- Training the model (because of data and/or compute) is prohibitively expensive for most people who would want to use it.
- There is a long, multi-year path to improving the model, and each year's improvements will have significant value.
- You can imagine many applications built on top of it - things that would be really useful, but that you would never have the bandwidth to build yourself.
I'll talk in a different post about the investment perspective. But the key point here is that foundation models are trendy, and entrepreneurs shouldn't rush into building one as if it were an obvious business model.
Thanks for reading.
@robmay