Data pre-processing for machine learning in Python
Introduction
Introduction to the course (2:58)
Numerical and categorical variables (2:07)
The dataset
Required Python packages
Jupyter notebooks (9:08)
Data cleaning
Introduction to data cleaning (2:07)
Selecting numerical and categorical variables (4:34)
Cleaning the numerical features (10:29)
Cleaning the categorical features (4:32)
KNN blank filling (10:25)
ColumnTransformer and make_column_selector (13:38)
Exercises (10:45)
Encoding of the categorical features
Introduction to the encoding of categorical variables (1:11)
One-hot encoding (20:31)
Ordinal encoding (9:03)
Label encoding of the target variable (2:53)
Exercise (12:07)
Transformations of the numerical features
Introduction to transformations (2:05)
Power Transformation (9:02)
Binning (11:03)
Binarizing (2:25)
Applying an arbitrary transformation (7:10)
Exercise (11:20)
About power transformations
Pipelines
Define a transformation pipeline (8:51)
Pipelines and ColumnTransformer together (12:07)
Exercises (10:52)
Scaling
Introduction to scaling (2:46)
Normalization, Standardization, Robust scaling (12:18)
Exercise (7:16)
Principal Component Analysis
Introduction to PCA (3:20)
How to perform PCA (9:57)
Exercise (6:15)
Filter-based feature selection
Introduction to feature selection (5:41)
Numerical features, numerical target (10:24)
Numerical features, categorical target (6:35)
Categorical features, numerical target (8:58)
Categorical features, categorical target (7:51)
Feature importance according to a model (11:20)
A comment on mutual information
A comment on feature selection with categorical variables
Exercises (8:32)
A complete pipeline
An example of a complete pipeline (18:58)
Oversampling
Introduction to SMOTE (4:29)
How to perform SMOTE (10:48)
Exercise (5:29)
General guidelines
Practical suggestions
End of course
End of course
Numerical and categorical variables