From the course: Introduction to Generative AI with GPT

What is GPT?

- GPT stands for generative pre-trained transformer. A name like that was obviously created by people who know nothing about marketing. It's certainly a mouthful. Joking aside, GPT uses complex algorithms (sets of rules for problem-solving that computers follow) and lots of data to generate original text and other media types. The text that Version 4 of GPT produces is high quality and hard to distinguish from human-generated text. With some prompting from a user, GPT can create, for example, meaningful stories, poems, emails, chatbot responses, and software code. In addition, unlike previous versions of GPT, Version 4 is multimodal in that it also supports visual inputs, not just text.

GPT is based on a generative model of language. Generative models use existing knowledge of language to predict what words may come next based on a series of previous words and their context. This is where we get the word generative, the G in GPT. As a simple example, if you type, "Once upon a," the model will predict the word time as the next word, based on its analysis of a large corpus of data that is managed via a large language model, or LLM (there's a toy sketch of this at the end of this section). The LLM is not the data itself; it is a complex model whose AI-relevant parameters are derived from the massive volume of data it is trained on. This data comes from a variety of public sources, including Wikipedia and Common Crawl, a dataset of billions of web pages. To make the system smart, so to speak, text is randomly removed from the acquired content and the software is trained to fill it in with the correct missing words (also sketched below). This is where we get the word pre-trained, the P in GPT.

Powering all of this is a type of AI called deep learning, which is based on 70 years of research in neural networks. As an example, for an AI-powered recognition system to learn what a bicycle is and identify it in a picture, the AI must analyze large volumes of existing bicycle pictures. It's called a neural network because it loosely mimics the function of the brain. The network consists of a web of connected nodes, or data points. The type of neural network used in GPT is called a transformer. That's the T in GPT. It's particularly good at taking text and reusing it in another context or word sequence while maintaining meaning. In a neural network, data moves through nodes based on certain criteria. One of these criteria is the weight, or strength, of the connections between nodes. If the criteria are not met, the data does not get processed; if they are, it moves on to another node. This repeats until the data arrives, transformed, at the end of the neural network (the last sketch below shows a tiny network like this).

Large language models, or LLMs, are at the heart of the artificial intelligence in generative AI. These LLMs contain parameters, which I'll explain in a moment. Basically, more parameters mean higher-accuracy output. In each subsequent version of GPT, the number of parameters has increased massively. Version 1 had around 117 million. Version 2 had 1.5 billion, and Version 3 had 175 billion. OpenAI has not revealed the exact number of parameters in Version 4, but some analysts estimate it at over a trillion. Even if that estimate is off by a few billion, that's some leap, and you can see now where GPT-4 gets its smarts.

So what are these parameters in a large language model? They are values that have been derived from analyzing lots of data. Think of a set of parameters like the temperature, direction, and speed of air from the vents that you control in your car.
You can adjust these until you get just the right mix of air that makes you feel comfortable. In the case of GPT, the response you get to an input prompt is the result of a series of parameters chosen along a neural network that correspond to text or other output. The prediction is that the right set of parameters will produce the correct output. Okay, I'll stop there. The concepts described here provide the basic foundation of what GPT is and how it works. That will be enough for most people, but there's a lot more under the hood if you decide you want to go deeper.
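
To make the next-word prediction idea from the transcript concrete, here is a minimal Python sketch. This is not how GPT works internally (GPT uses a neural network, not a lookup table), but it shows the same basic generative step: given the previous few words, pick the most likely next word based on text seen during training. The tiny corpus and the predict_next_word function are invented purely for illustration.

from collections import Counter, defaultdict

# A tiny, made-up "training corpus" standing in for the billions of
# web pages a real model learns from.
corpus = (
    "once upon a time there was a princess . "
    "once upon a time there lived a king . "
    "once upon a midnight dreary ."
).split()

# Count which word follows each three-word context in the corpus.
next_word_counts = defaultdict(Counter)
for i in range(len(corpus) - 3):
    context = tuple(corpus[i:i + 3])
    next_word_counts[context][corpus[i + 3]] += 1

def predict_next_word(prompt: str) -> str:
    """Return the word most often seen after the last three words of the prompt."""
    context = tuple(prompt.lower().split()[-3:])
    counts = next_word_counts.get(context)
    return counts.most_common(1)[0][0] if counts else "<unknown>"

print(predict_next_word("Once upon a"))  # prints "time"

A real LLM performs the same kind of prediction, but over a vocabulary of tens of thousands of tokens, with context far longer than three words, and with the statistics stored in the network's parameters rather than in a table.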
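
The fill-in-the-blank pre-training described in the transcript can be sketched the same way. The snippet below only shows how a training example could be created by hiding a random word; the actual learning step, where the model's parameters are adjusted until it restores missing words correctly, is omitted, and the function name is invented for illustration.

import random

def make_training_example(sentence: str):
    """Hide one random word so the model can be trained to put it back."""
    words = sentence.split()
    hidden_position = random.randrange(len(words))
    target_word = words[hidden_position]   # the answer the model should learn to predict
    words[hidden_position] = "[MASK]"      # the text the model actually sees
    return " ".join(words), target_word

masked_text, answer = make_training_example("The quick brown fox jumps over the lazy dog")
print(masked_text)  # e.g. "The quick brown fox jumps over the [MASK] dog"
print(answer)       # e.g. "lazy"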
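
Finally, here is a minimal sketch of data moving through a tiny neural network, as described in the transcript: values flow from node to node, and the strength of each connection is a weight. Every weight below is a parameter, and all of the numbers are invented so the example can run on its own; GPT-3 has roughly 175 billion of them.

import math

# A toy network: 3 input nodes -> 2 hidden nodes -> 1 output node.
weights_input_to_hidden = [
    [0.2, -0.5, 0.8],   # connection weights feeding hidden node 0
    [0.7,  0.1, -0.3],  # connection weights feeding hidden node 1
]
weights_hidden_to_output = [0.6, -0.4]

def sigmoid(x: float) -> float:
    """Squash a value into the range 0..1 (the node's activation)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs: list[float]) -> float:
    """Push data through the network: weight each input, sum, squash, repeat."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)))
        for node_weights in weights_input_to_hidden
    ]
    return sigmoid(sum(w * h for w, h in zip(weights_hidden_to_output, hidden)))

print(forward([1.0, 0.5, -1.0]))  # a single number out; training adjusts the weights

Training consists of nudging those weights until the network's outputs match the desired ones, which is what the pre-training step does at an enormous scale.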
