I'm Joris "Interface" de Gruyter. Welcome To My Code Crib

CodeCrib Blog


Sep 16, 2024 - GPT4-o1 Test Results

Filed under: #tech #ai

I hadn’t actually planned on writing anything. Models come and go; some are cool, some are meh. But the release of GPT4-o1 has repeatedly spawned click-baity headlines, because the system card describes tests in which the model was asked to complete hacking challenges (CTF). I’d like to explain what actually happened in this hacking test, and for good measure throw in some quotes from the same report (and the same section) that you will not see in the clickbait titles.

  Read more...

Jun 19, 2024 - Small Language Models

Filed under: #tech #ai

A lot of the talk in the news is about the big players in the LLM space: OpenAI with ChatGPT, Microsoft Copilot (which uses OpenAI’s models), and Google’s Gemini, which powers their big AI features. At this point, those models have grown so large in the AI race that the companies no longer publish their actual sizes, so we don’t really know how big they have become. This matters, because size gives an indication of the power consumption and cost of running these models at scale. It’s estimated that generating one image uses about as much power as fully charging a cell phone once. That may not sound like much at first, but consider how many images you might generate before finding one you like, and then multiply that by who-knows-how-many users.
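
To put a rough number on that, here’s a back-of-the-envelope sketch. The per-charge figure (a typical ~12 Wh smartphone battery) and the user counts are my own illustrative assumptions, not measured data:

```python
# Back-of-the-envelope: what "one image = one phone charge" adds up to.
PHONE_CHARGE_KWH = 0.012  # assumed: a full charge of a ~12 Wh battery

def total_energy_kwh(images_per_user: int, users: int) -> float:
    """Aggregate energy if every generated image costs one phone charge."""
    return images_per_user * users * PHONE_CHARGE_KWH

# Ten attempts per user before finding a keeper, across a million users:
print(f"{total_energy_kwh(10, 1_000_000):,.0f} kWh")  # 120,000 kWh
```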

  Read more...

Jun 17, 2024 - Orchestration and Function Calling

Filed under: #tech #ai

In From Text Prediction to Action we talked about the concept of using language models to convert between natural language and structured data such as CSV or JSON. The main goal there is that “translating” natural language to JSON means we can use the JSON as input to traditional software. At the end, I mentioned a few frameworks, including Semantic Kernel. Let’s look into what that means.
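
To make the idea concrete before we dive in, here’s a minimal sketch of function calling in plain Python. This is not Semantic Kernel’s actual API; the tool name and JSON shape are invented for illustration:

```python
import json

# A hypothetical "tool" the orchestrator exposes to the model.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stubbed; a real tool would call a weather API

REGISTRY = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse the model's JSON 'function call' and invoke the matching tool."""
    call = json.loads(model_output)  # e.g. {"name": ..., "arguments": {...}}
    return REGISTRY[call["name"]](**call["arguments"])

# Pretend the LLM produced this JSON after reading the tool descriptions:
print(dispatch('{"name": "get_weather", "arguments": {"city": "Antwerp"}}'))
```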

  Read more...

Feb 7, 2024 - From Text Prediction to Action

Filed under: #tech #ai

A benefit of training a text prediction model on text from the internet is that it learns more than just language. It also learns from the many other kinds of text people write and talk about on the internet, like programming code or text file formats. And so we can use the text predictor to generate traditional, structured data, which the software in charge can then use to perform actions.
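
As a minimal sketch of that idea (the JSON shape and the light-switch example are invented for illustration; the model call is faked):

```python
import json

user_request = "turn off the kitchen lights"

# Stand-in for a real LLM call that was asked to "translate" the request
# into JSON. In a deployed system this string would come from the model.
model_output = '{"action": "lights_off", "target": "kitchen"}'

command = json.loads(model_output)  # structured data instead of prose
if command["action"] == "lights_off":
    print(f"Switching off the lights in: {command['target']}")
```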

  Read more...

Feb 5, 2024 - The Killer App

Filed under: #tech #ai

Today I want to take a little detour from our descent into the depths of LLMs and how software is built around them. I’d like to go up a little higher again and get a little philosophical, perhaps. As big tech, and likely even non-tech Fortune 500 companies, are currently trying to get LLMs added to whatever they are doing, there’s an interesting question to be asked: is anyone building The Killer App for LLMs?

  Read more...

Jan 31, 2024 - The AI Software Pipeline

Filed under: #tech #ai

As I explained in my previous post, the software around a language model is really what makes it all work as an AI “system”. To dig a little deeper into it, let’s look at how software leveraging AI is typically built as a pipeline. Note that some or most parts of such a pipeline come from existing APIs or frameworks, which you can (and probably should) use, but it’s important to have an idea of how these things are built.
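
As a rough sketch of the shape of such a pipeline (every stage name here is hypothetical and stubbed; real frameworks supply most of these pieces):

```python
def moderate(text: str) -> bool:
    """Input filter: reject disallowed requests (stubbed)."""
    return "forbidden" not in text

def retrieve_context(question: str) -> str:
    """Fetch relevant documents to ground the answer (stubbed)."""
    return "A relevant snippet from a document store."

def call_model(prompt: str) -> str:
    """Stand-in for the actual LLM API call."""
    return "A generated answer based on the provided context."

def pipeline(question: str) -> str:
    if not moderate(question):
        return "Request declined."
    prompt = f"Context:\n{retrieve_context(question)}\n\nQuestion: {question}\nAnswer:"
    # A real pipeline would also filter and validate the model's output here.
    return call_model(prompt)

print(pipeline("How is AI software typically structured?"))
```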

  Read more...

Jan 29, 2024 - The Software In Charge of AI

Filed under: #tech #ai

In today’s (January 2024) generative AI landscape, which came on in a flash, there’s not much broad understanding of the architecture of the AI software we use. So I wanted to explain why The Great Text Predictor is just a cog in the AI machinery you use today: it is still only completing sentences, not reading your emails or searching the web. That’s the role of the software “controlling” the neural net.

  Read more...

Jan 24, 2024 - From Text Predictor to Chatbot

Filed under: #tech #ai

Previously, we talked about LLMs basically being nifty text generators, predicting the next word given a bunch of previous text (“context”). We also found that, as a by-product, these enormous neural nets of statistical word correlations hide clusters of knowledge that are very interesting but that we may not really be able to count on. But how does fancy word prediction software get to the point of actually having a coherent back-and-forth conversation?
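
The short version, as a sketch (the role labels and transcript are invented for illustration): a “conversation” is just the whole transcript, re-fed to the text predictor as context, with the model cued to continue after the assistant label.

```python
history = [
    ("User", "What is the capital of Belgium?"),
    ("Assistant", "Brussels."),
    ("User", "How many people live there?"),
]

def to_prompt(turns):
    """Flatten the chat so far into one block of text to complete."""
    lines = [f"{role}: {text}" for role, text in turns]
    lines.append("Assistant:")  # cue the model to predict the next reply
    return "\n".join(lines)

print(to_prompt(history))
# The "chatbot" reply is simply the predicted continuation after "Assistant:".
```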

  Read more...

Jan 23, 2024 - Hallucinations are a feature, not a bug

Filed under: #tech #ai

In my blog article The Great Text Predictor I talked about scaling up text prediction: from predicting the next word based on the previous word, to basing it on the previous words in the sentence, all the way to basing it on whole previous paragraphs. This is what is called “context”. Sometimes, statistically, the LLM has not seen enough data, or it has seen contradictory data, or it has somehow lost or never “figured out” specific correlations. Yet its calculations still spit out a next word - however unlikely that word may appear to us. As people like to say, it can be “confidently wrong”. Or, as the researchers have taught us to call it: hallucinations.
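
You can see why in a toy sampler (the words and probabilities are made up for illustration): whether the model’s statistics are strong or nearly flat, sampling always returns exactly one word, stated just as fluently either way.

```python
import random

def sample(probs):
    """Pick a next word according to the model's probabilities."""
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

# Strong statistics: plenty of consistent training data.
confident = {"Brussels": 0.95, "Antwerp": 0.03, "Ghent": 0.02}

# Weak or contradictory statistics: a nearly flat distribution.
uncertain = {"Brussels": 0.34, "Paris": 0.33, "Vienna": 0.33}

print(sample(confident))  # almost always "Brussels"
print(sample(uncertain))  # one of three, picked near-randomly
```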

  Read more...

Jan 22, 2024 - The Great Text Predictor

Filed under: #tech #ai

When texting became a more popular sport, we were all still using physical keys. In fact, the keys were the number keys you dialed phone numbers with, and to type letters each number key was assigned three or four letters. You’d press a key once, you’d get the first letter. Press the key twice in quick succession and you’d get the second letter, three times for the third. As many of my age who used the ever-popular Nokia 3210 at the time can attest, there was a certain level of pride in how fast one could spell out words. Ah, those young ’uns and their texting!
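
For those who never lived it, here’s that multi-tap scheme as a few lines of Python (a sketch of the classic keypad layout):

```python
# The classic multi-tap keypad: pressing a key N times in quick succession
# cycles through the letters printed on it.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def multitap(runs):
    """Each entry is one run of identical presses, e.g. '44' -> 'h'."""
    return "".join(KEYPAD[run[0]][len(run) - 1] for run in runs)

# "hello" the Nokia way: h=44, e=33, l=555, l=555, o=666
print(multitap(["44", "33", "555", "555", "666"]))  # -> hello
```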

  Read more...

 
