The Killer App
Filed under: #tech #ai
Today I want to take a little detour from our descent into the depths of LLMs and how software is built around them. I’d like to go up a little higher again and get a little philosophical, perhaps. As big tech and likely even non-tech Fortune 500 companies are currently trying to add LLMs to whatever they are doing, there’s an interesting question to be asked. Is anyone building The Killer App for LLMs?
I’ve seen a non-zero amount of online discussion comparing the current AI craze with the blockchain (cryptocurrency, NFT, etc.) craziness of the past several years. I categorically disagree. Blockchain in general was always a technology solution looking for a problem. AI is the reverse in many ways: it’s very easy to imagine the potential applications, but the technology today cannot quite deliver on the most promising ideas yet. But another thing is different: blockchain never saw the level of investment, and the push to deliver, from large companies the way we see with AI now. The amount of money being put on the line right now (both what we can see and what we can’t see) is STAGGERING. The most widely reported number is the $10 billion investment from Microsoft in OpenAI. And sure, this definitely bought early-mover advantage and is currently being embedded in virtually every Microsoft product. But the investment is, in my personal opinion, more interesting as a bet on any future advances coming from OpenAI. Because ChatGPT, as cool as it is, is NOT the killer app. It’s just a technology teaser. A wake-up call, surely, but mostly just a great tech demo.
As we progressed through 2023, first in awe of ChatGPT, then slapped in the face by every other tech company’s marketing materials about LLMs, research didn’t stand still. Researchers are figuring out how to get results similar to ChatGPT’s from much smaller models. Open source models, free for anyone to use, are steadily catching up. OpenAI used a massive amount of data from the open internet to train the GPT model, which is rumored to have maybe more than a trillion parameters. But research is starting to show that training a smaller model (only 70 billion parameters, or only 7% the size of GPT-4) on a smaller but more focused, higher-quality data set starts giving similar results. And more interestingly, the smaller models also start exhibiting similar “emergent behavior” (think of this as “conceptualizing” a subject so the model can answer more accurately on never-before-seen questions related to that subject). Most recently, a customer of Mistral, the AI company out of France, leaked a prerelease of their next open source model, which supposedly is nearing GPT-4’s performance in most benchmarks (their previous version was already close to or marginally better than GPT-3.5).
What I’m trying to say is that what we see today with GPT-4-based applications will soon no longer be available only to the Microsofts, Googles and Facebooks of the world. These smaller models will perform similarly enough, and they can run on consumer hardware instead of the extremely expensive hardware that only cloud datacenters can realistically provide. Rumor has it this is also what Apple is working on, and that is why we haven’t heard any big announcements from them (yet). I’m not talking about the future here; I’m saying smaller language models are virtually here already.
- Textbooks are all you need
- The surprising power of small language models
- Mistral makes waves by matching GPT 3.5 on benchmarks
- Mistral CEO confirms leak of new model nearing GPT-4 performance
Today’s crop of applications embedding LLMs isn’t very innovative, per se. As remarkable an advance as it is to get your search results in the form of a poem, or to have your email client summarize the long email thread you were just added to… these are just native features of the large language models. Anyone can use a foundation model from OpenAI or Google and embed this sort of feature in their product. The other applications of LLMs you see are largely reminiscent of previous technology jumps. When PCs became a thing, many applications were “X, but on a computer”. When the web came around, we went to “X, but on a website” (there was also “X, but in the cloud” and “X, but on a phone”). Not that those weren’t major advances that made big societal changes - they definitely were. And even if AI doesn’t make another jump as big as the ChatGPT one, the current language model tech will definitely change the way we interact with our computers. “X, but using AI” will become a thing. And the race to be the dominant “platform” has already begun - just like the personal computer race, the browser wars, the cloud provider race and the smartphone race (which, listing those out, shows why Microsoft is very keen on not being late to another major platform shift).
But what’s the killer app? What can or will we be able to do that we were NOT able to do before? I say the killer app for LLMs has not been built yet. And it will likely require the next AI advancement - not ‘just’ another upgraded GPT model. In the meantime, smaller language models can take over today’s LLM duties at a fraction of the compute and energy cost.
There is no comment section here, but I would love to hear your thoughts! Get in touch!