Course Preprocessing for Machine Learning in Python

In this course you’ll learn how to get your cleaned data ready for modeling.


This online course, Preprocessing for Machine Learning in Python, covers a key skill for aspiring data analysts.

This course covers the basics of how and when to perform data preprocessing, the essential step in any machine learning project in which you get your data ready for modeling. Preprocessing comes into play after you have imported and cleaned your data and before you fit your machine learning model. You’ll learn how to standardize your data so that it’s in the right form for your model, create new features to best leverage the information in your dataset, and select the best features to improve your model fit. Finally, you’ll practice preprocessing by getting a dataset on UFO sightings ready for modeling.

Enroll now in Preprocessing for Machine Learning in Python and learn from instructor Sarah Guido. The course includes 62 exercises and 20 videos, with an estimated completion time of 4 hours.

Chapter 1: Introduction to Data Preprocessing
In this chapter you’ll learn exactly what it means to preprocess data. You’ll take the first steps in any preprocessing journey, including exploring data types and dealing with missing data.
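As a rough illustration of what exploring data types and dealing with missing data can look like, here is a minimal sketch. The listing does not say which libraries the course uses, so this version relies only on the Python standard library; the records, column names, and helper functions are all hypothetical.

```python
# Hypothetical records; None marks a missing value and numbers arrive as strings.
rows = [
    {"city": "Austin", "duration_sec": "120", "shape": "disk"},
    {"city": None, "duration_sec": "45", "shape": "light"},
    {"city": "Denver", "duration_sec": None, "shape": "circle"},
    {"city": "Boise", "duration_sec": "300", "shape": None},
]


def count_missing(records):
    """Count missing (None) values per column."""
    counts = {}
    for record in records:
        for column, value in record.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


def drop_missing(records, required):
    """Keep only records where every required column has a value."""
    return [r for r in records if all(r[c] is not None for c in required)]


def cast_column(records, column, to_type):
    """Return copies of the records with one column converted to to_type."""
    return [{**r, column: to_type(r[column])} for r in records]


missing = count_missing(rows)
# Only city and duration_sec are treated as required here (an assumption).
clean = drop_missing(rows, required=["city", "duration_sec"])
# Convert the duration from text to an integer so it can be used numerically.
clean = cast_column(clean, "duration_sec", int)
```

Dropping rows is only one option; depending on the dataset, filling in a default value may preserve more data.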
Chapter 2: Standardizing Data
This chapter is all about standardizing data. Often a model will make some assumptions about the distribution or scale of your features. Standardization is a way to make your data fit these assumptions and improve the algorithm’s performance.
Chapter 3: Feature Engineering
In this chapter you’ll learn about feature engineering. You’ll explore different ways to create new, more useful features from the ones already in your dataset. You’ll see how to encode, aggregate, and extract information from both numerical and textual features.
Chapter 4: Selecting features for modeling
This chapter goes over a few different techniques for selecting the most important features from your dataset. You’ll learn how to drop redundant features, work with text vectors, and reduce the number of features in your dataset using principal component analysis (PCA).
Chapter 5: Putting it all together
Now that you’ve learned all about preprocessing, you’ll try these techniques out on a dataset that records information on UFO sightings.
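To make two of the steps named in the chapter list more concrete, the sketch below shows one common form of standardization (rescaling a numeric feature to mean 0 and standard deviation 1) and one common form of encoding (one-hot encoding a categorical feature). It is an assumption-laden illustration using only the Python standard library, not the course's own code; the function names and sample values are hypothetical.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and population standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 entry per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


# Hypothetical numeric feature (for example, sighting durations in minutes).
durations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardize(durations)

# Hypothetical categorical feature.
shapes = ["disk", "light", "disk"]
encoded = one_hot(shapes)
```

After scaling, the feature is centered on zero with unit spread, which is the kind of scale assumption many models make. The one-hot step turns each category into its own numeric indicator so that a model can consume it.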

Sarah Guido

Senior Data Scientist at InVision

Sarah is a Senior Data Scientist at InVision where she studies user collaboration through data. She is an accomplished conference speaker, conference track chair, and O’Reilly Media author, and is passionate about Python and machine learning. Sarah attended graduate school at the University of Michigan’s School of Information.
