- Lab
- A Cloud Guru
Cleanse Missing Data Using the pandas Python Package
In this lab, we will load a CSV file into a pandas DataFrame. Once loaded, we will count the number of missing values in the file. Next, we will drop any columns that are missing all values, and replace any remaining missing values.

Basic Python programming skills will be required for this lab. If you need a refresher, check out the following course: [Certified Associate in Python Programming Certification](https://acloud.guru/overview/8169e8e7-91a7-4d92-b278-4dd08c787dc6)
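As a rough sketch of the first steps described above (loading a CSV and counting missing values), the snippet below reads a small inline sample instead of the lab's `missing-data.csv`. The sample columns and the `load_and_count` helper are hypothetical and used only for illustration; the real file's layout is not shown here.

```python
import io

import pandas as pd

# Hypothetical stand-in for missing-data.csv; the real columns are assumptions.
SAMPLE_CSV = """id,temperature,humidity,notes
1,,40,
2,21.5,,
3,,42,
4,23.0,43,
"""


def load_and_count(source):
    """Read a CSV into a DataFrame and count its missing values."""
    df = pd.read_csv(source)
    per_column = df.isna().sum()   # missing values in each column
    total = int(per_column.sum())  # missing values across the whole frame
    return df, per_column, total


df, per_column, total = load_and_count(io.StringIO(SAMPLE_CSV))
print(per_column)
print("Total missing:", total)
```

In the lab environment, passing the path `"missing-data.csv"` to `load_and_count` in place of the `StringIO` buffer should behave the same way, since `pd.read_csv` accepts either a path or a file-like object.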
Table of Contents
- Challenge: Load the Data File
  Load the `missing-data.csv` file into a pandas DataFrame, and count the number of missing values.
- Challenge: Drop Any Columns That Are Missing All Values
  Make sure to drop only columns that are missing all values.
- Challenge: Replace Remaining Missing Values with the Last Valid Observed Value of That Column
  This will leave some missing values at the beginning, as there was no last valid observed value.
- Challenge: Write the Data to a New File
  Write the data to a new file named `cleaned_data.csv`.
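The last three challenges above (dropping all-missing columns, forward filling, and writing the result) might look roughly like the sketch below. The DataFrame contents and column names are made up for illustration, and the output goes to a temporary directory rather than the lab's working directory. Omitting the index with `index=False` is also an assumption, since the lab does not say whether the index should be written.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical data; the contents of missing-data.csv are not shown in the lab.
df = pd.DataFrame({
    "reading": [np.nan, 1.0, np.nan, 3.0],
    "pressure": [10.0, np.nan, np.nan, 13.0],
    "unused": [np.nan, np.nan, np.nan, np.nan],
})

# Drop only the columns in which every value is missing.
df = df.dropna(axis="columns", how="all")

# Forward fill: replace each gap with the last valid value above it
# in the same column. Gaps before the first valid value stay missing.
df = df.ffill()

# Write the cleaned data to cleaned_data.csv (inside a temp directory here).
out_path = Path(tempfile.mkdtemp()) / "cleaned_data.csv"
df.to_csv(out_path, index=False)

remaining = int(df.isna().sum().sum())
print(df)
print("Remaining missing values:", remaining)
```

Using `how="all"` matters here: the default, `how="any"`, would drop every column that contains at least one missing value. The leading gap in the `reading` column survives the forward fill, which matches the note in the third challenge.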