From the course: Introduction to Artificial Intelligence

Labeled and unlabeled data

- When you think about machine learning, the key is to focus on the term learning. What does it mean for a machine to learn? What strategies can you use to learn something new? How can you take those strategies and apply them to machines?

Imagine you wanted to learn how to play chess. You could do this a couple of different ways. You could hire a chess tutor. They would introduce you to the different chess pieces, then show you how to move them around the board. You could practice by playing against your tutor, who would supervise your moves and help you when you made a mistake.

If you couldn't find a tutor, you could also go to public parks and watch people play. You couldn't ask them questions. You'd just quietly watch and learn. You'd have to figure out chess on your own, just by watching the games. If you did this long enough, you'd probably understand the game. You might not know the names of the chess pieces, but you'd understand the moves and strategies after hundreds of hours of observation.

These two strategies are very similar to how a machine learns. The system could do something called supervised learning. Here, a data scientist acts like a tutor for the machine. They show the machine the correct answers and then let the system train itself to get better at the game. The system could also do unsupervised learning. Here, you just have the machine make all the observations on its own. The system might not know all the different names and labels, but it'll figure out its own way to learn from the data.

As you can imagine, these two approaches have their own strengths and weaknesses. For supervised learning, the system needs to have a knowledgeable tutor. There must be someone who knows a lot about chess and can show the system how to play the game. With unsupervised learning, the system needs to have access to a lot of data. That's the only way to see the patterns.
The system might not be able to go to a public park and watch hundreds of people play. It also depends a little bit on who it watches. You need it to watch people who are good players.

As you can imagine, these techniques are used for much more than just playing chess. Companies use them to get valuable insights about their customers. With supervised learning, a company like Amazon might identify a thousand customers who spend a lot of time shopping on its website. The company can then label these customers as high spenders. Then it would have the machine learning system look through those customers' data to find the patterns that make them high spenders.

Now, for unsupervised learning, a machine learning system could be given access to all of Amazon's customer data. Here, the system might find its own patterns in the data. Maybe people who buy chessboards are much more likely to buy an expensive kitchen appliance. Then Amazon could use that pattern to advertise. If you use Amazon, you may have noticed that sometimes it advertises products that seem completely unrelated to what you're looking for, but that you're still interested in buying. Both techniques have their own strengths, and each one can give you incredibly useful insights.
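
The contrast between labeled and unlabeled data can be sketched in a few lines of Python. This is only a toy illustration under made-up assumptions, not how any real retailer's system works: the customer records, the two features (visits and spend), the labels, the baskets, and the helper names `train_centroids`, `predict`, and `most_common_pair` are all hypothetical. The supervised half learns from labeled examples with a simple nearest-centroid rule, which is one of many possible techniques. The unsupervised half gets no labels at all and just counts which products show up together.

```python
from collections import Counter
from itertools import combinations

# Hypothetical LABELED data: (visits per month, monthly spend) plus a label
# supplied by a person, who plays the role of the "tutor".
labeled = [
    ((20, 900.0), "high spender"),
    ((18, 750.0), "high spender"),
    ((3, 40.0), "regular"),
    ((5, 60.0), "regular"),
]


def train_centroids(examples):
    """Supervised step: average the feature vectors seen for each label."""
    sums, counts = {}, Counter()
    for features, label in examples:
        counts[label] += 1
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
    return {label: tuple(v / counts[label] for v in total)
            for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=distance)


# Hypothetical UNLABELED data: shopping baskets with no labels attached.
baskets = [
    {"chessboard", "stand mixer"},
    {"chessboard", "stand mixer", "chess clock"},
    {"phone case", "charger"},
    {"chessboard", "stand mixer", "phone case"},
]


def most_common_pair(all_baskets):
    """Unsupervised step: find the product pair bought together most often."""
    pair_counts = Counter()
    for basket in all_baskets:
        # Sort so each pair is always counted under the same key.
        pair_counts.update(combinations(sorted(basket), 2))
    return pair_counts.most_common(1)[0]


centroids = train_centroids(labeled)
print(predict(centroids, (15, 600.0)))  # high spender
print(most_common_pair(baskets))        # (('chessboard', 'stand mixer'), 3)
```

Notice that the first half can only work because someone supplied the "high spender" and "regular" labels up front, while the second half discovers the chessboard and stand mixer pattern without being told what to look for.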
