From the course: Learning Python

Fetching Internet data - Python Tutorial

Fetching Internet data

- [Instructor] One of the areas where Python really shines is in retrieving and working with data from the internet, such as JSON, XML, and HTML. In this chapter, we'll see how to work with all three of these data types. Let's start by opening up the inetdata_start.py file. In this first example, we're just going to retrieve data from a web server and print the results. To make a request to a web server, I need to import the urllib.request module, so let's start by doing that. I'm going to import urllib.request. This module provides the classes and code I need to make HTTP requests.

Next, I'm going to get rid of the placeholder pass statement in my main function and create a variable called weburl. To weburl, I'm going to assign the result of calling the urlopen function in urllib.request. The urlopen function simply takes a string containing the URL that I want to request. I'll give it a simple one and get some data from google.com, so the URL I'm opening here is just the address of Google's homepage. This gives me back a web response object.

For the moment, I'm just going to print out the result code, which is retrieved by calling the getcode function on the response that I've created. So I'm going to print out the result code, and that is going to be weburl.getcode. This will just be a regular old HTTP result code. For example, the result code will be 200 if everything's okay, or 404 if the page is not found. Let's go ahead and save this and run what we have so far. I'm going to choose run Python file in terminal. And sure enough, you can see that the result code is 200, so everything worked just fine. I'm able to connect to the website without any problems. And now that we've got the URL open, we can read some data and print it out.
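The steps so far can be sketched roughly as follows. This is a minimal sketch rather than the course's exact file: the helper name `fetch_status` is hypothetical, and the `except` branch is an addition that reflects how `urlopen` raises `urllib.error.HTTPError` for error statuses such as 404 instead of returning a response.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Open the URL and return the HTTP result code (hypothetical helper)."""
    try:
        # urlopen takes the URL as a string and returns a response object
        with urllib.request.urlopen(url) as weburl:
            return weburl.getcode()
    except urllib.error.HTTPError as err:
        # Error statuses such as 404 arrive as an exception;
        # the result code is available on the exception itself
        return err.code
```

Calling `print("result code:", fetch_status("http://www.google.com"))` with network access would be expected to print 200, as in the walkthrough. Recent Python documentation marks `getcode()` as deprecated in favor of the response's `status` attribute, so newer code may prefer that.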
To read the data, I need to call the read function on the weburl response that I've created. So what I'm going to do is make a new variable called data and assign it the result of weburl.read. This is very similar to reading files, which we saw how to do earlier in the course. I'm just reading the entire contents of this URL into a variable called data, and then I'm going to print that data out. If all goes well, this should be the HTML code for Google's homepage. So let's go ahead and save this. I'll close that terminal and start it up again by choosing run Python in terminal. And sure enough, if we scroll back up through all this output, right here near the top is the result code. The result code is 200, and then you can see that I get the HTML back for Google's homepage. So in just a few lines of code, we were able to open a connection to a URL, read the contents of that URL, and print out the results.
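Put together, the read step might look something like the sketch below. The function name `fetch_page` and the `encoding` parameter are assumptions made for illustration. One detail worth knowing is that `read()` returns bytes, so the sketch decodes them to text before returning. UTF-8 is only a guess at the page's encoding here.

```python
import urllib.request


def fetch_page(url, encoding="utf-8"):
    """Return (result code, body text) for a URL (hypothetical helper)."""
    with urllib.request.urlopen(url) as weburl:
        code = weburl.getcode()
        # read() pulls the entire response body into memory as bytes
        data = weburl.read()
    # The encoding is an assumption; undecodable bytes are replaced
    text = data.decode(encoding, errors="replace")
    return code, text
```

With network access, `code, html = fetch_page("http://www.google.com")` followed by `print(code)` and `print(html)` would mirror what the walkthrough prints: the result code first, then the page's HTML.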
