From the course: Prompt Engineering: How to Talk to the AIs

The generative AI revolution

- Before we go too deeply into prompt engineering, let's talk a bit about generative AI more broadly and define some key concepts. AI has been around for quite some time, and it has been used with different degrees of success by companies and other institutions across many fields. For example, when I led the algorithms team at Netflix, we used AI to recommend movies and shows. In the past few years, you might have also heard how AI models conquered new feats, like beating professional chess or Go players. But why is AI, all of a sudden, on the cover of the New York Times? A few innovations have come into play, and while they're all slightly different, they have been grouped under the new label of generative AI. Generative AI not only has the ability to identify and classify inputs like the old AI did; it can also generate new content that did not exist before. To be fair, some of the old AI was also able to generate new content. However, generative AI does so with much better quality and, importantly, in response to natural language input. This last aspect is key to why prompt design and engineering is so important, and we will double-click on it soon.

But let's also define some other important generative AI concepts. As mentioned before, generative AI is a label for a new class of AI models and applications that can generate content from a natural language prompt. Broadly speaking, those models and applications include the following.

First is the concept of large language models: models that have been trained on a huge collection of text from the internet, books, and beyond. These models are trained to predict the next token, roughly equivalent to the next word (the first code sketch after this transcript illustrates that objective). After this training, however, the models show some emergent behavior: they are capable of following realistic conversations and reasoning about facts given the right constraints. Large language models include GPT-4 and ChatGPT from OpenAI, LLaMA from Meta, Sparrow from DeepMind, and Bard and LaMDA from Google.

Another is text-to-image models, which have been trained to translate text into an image; they include DALL-E 2 from OpenAI, Stable Diffusion, and Midjourney. But that is not all. There are more applications of generative AI, such as text to music, audio, or video, and even the so-called action transformers, which learn to translate text into actions such as clicking on a link or browsing the internet.

It's beyond the scope of this introductory course to go into the details of the different components that make up these models, but it's good for you to know that they're based on the so-called transformer architecture, which was introduced in the "Attention Is All You Need" paper by Google researchers. The attention mechanism is a key component of transformers (the second sketch below spells out its core equation). More recently, reinforcement learning from human feedback (RLHF) has also become a key part of the latest training stages. And for image and multimedia, the novel use of diffusion models has also become key.

Finally, an important novel aspect of all these models is their ability to learn on the fly, the so-called zero-shot learning capability. This means that general models can learn from information that is new to them without having to be retrained (the last sketch below shows one concrete example).
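To make the next-token objective described in the transcript concrete, here is a minimal sketch. It assumes the Hugging Face transformers library and the small GPT-2 model, which are my choices for illustration; the course does not prescribe any particular tooling:

```python
# Minimal sketch of next-token prediction with GPT-2
# (assumes the transformers and torch packages are installed).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Before we go too deeply into prompt engineering,"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, sequence_length, vocab_size)

# The scores at the last position rank every candidate next token.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for p, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id)!r}: {p.item():.3f}")
```

Running this prints the five tokens the model considers most likely to come next; everything a chat model says is built from repeated predictions like these.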
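The attention mechanism mentioned above has a compact definition in the "Attention Is All You Need" paper: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Below is a sketch of that equation in NumPy; the array sizes are illustrative assumptions, not anything fixed by the course:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V,
    as defined in "Attention Is All You Need"."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Row-wise softmax turns similarities into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output is a weighted mix of the values

# Illustrative sizes: 4 tokens, 8-dimensional queries/keys/values.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```

The key design idea is that every token's output mixes information from every other token, with the mixing weights computed from the data itself.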
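Finally, one concrete way to see zero-shot behavior is a zero-shot classification pipeline, where the candidate labels are supplied at inference time rather than during training. The pipeline and model below are my assumptions for illustration, not something the course specifies:

```python
from transformers import pipeline

# The zero-shot classification pipeline scores labels the model was never
# explicitly trained to predict: no retraining is needed to add new labels.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "The model wrote a sonnet about container orchestration.",
    candidate_labels=["poetry", "devops", "cooking"],
)
print(result["labels"])  # labels sorted from most to least likely
print(result["scores"])
```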
