From the course: NLP with Python for Machine Learning Essential Training

Identifying features for transformation

- [Instructor] Now that we've generated a couple of new features, we're going to take a look at whether either of them might be a fit for a transformation. Again, this is a pretty large area with deep theoretical underpinnings, and I'm only going to hit the tip of the iceberg here, so I encourage you to dive in more on your own. First, we'll read in the raw version of the text, just like we have in the last few chapters. Then we're going to create our two new features and print out the first five rows. So this looks just like the data at the end of our last lesson. Now, in order to determine whether a transformation might be helpful, we can look at the distribution of our data using a histogram. In the last lesson, we looked at the normalized overlaid histograms, but we didn't look at the full histogram, so we're still not exactly sure what the full distribution looks like for these new features. We only know what it looks like when it's split by label. The first thing we'll do is look…
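As a rough illustration of the steps described so far, the sketch below builds two features and bins one of them into a full, unsplit histogram. Everything specific here is an assumption: the transcript does not name the two features, so `body_len` (message length without spaces) and `punct_pct` (percentage of punctuation characters) are hypothetical stand-ins, and the sample messages are made up. It uses only the standard library to stay self-contained; a notebook would more typically use pandas and a plotting library.

```python
import string

# Hypothetical sample messages; the course's actual dataset is not shown here.
messages = [
    "Free entry!!! Text WIN to 80082 now.",
    "Are we still on for lunch?",
    "ok",
    "Call me when you get home, please.",
    "URGENT! You have won a prize. Claim now!",
]

def body_len(text):
    """Length of the message, not counting spaces (assumed feature)."""
    return len(text) - text.count(" ")

def punct_pct(text):
    """Percentage of non-space characters that are punctuation (assumed feature)."""
    n = body_len(text)
    if n == 0:
        return 0.0
    count = sum(1 for ch in text if ch in string.punctuation)
    return round(100 * count / n, 1)

def histogram(values, bins, lo, hi):
    """Count values into `bins` equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Values at or above `hi` are clamped into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# Build both features for every message, ignoring any label.
lengths = [body_len(m) for m in messages]
puncts = [punct_pct(m) for m in messages]

# Full (not split by label) histogram of message length, drawn as text bars.
length_hist = histogram(lengths, bins=4, lo=0, hi=40)
for i, c in enumerate(length_hist):
    print(f"{i * 10:>3}-{(i + 1) * 10:<3} {'#' * c}")
```

A long tail or heavy skew in a histogram like this is the kind of shape that would suggest trying a transformation on that feature.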
