Data Cleaning and Quality Assurance in R

by Janani Ravi

Data preparation is a critical step in turning raw information into actionable insights. This course will teach you how to clean, structure, and validate data in R to solve real-world problems efficiently.

What you'll learn

Preparing data for analysis can be a daunting task, especially when dealing with missing values, outliers, and inconsistent formats that compromise the integrity of your insights.

In this course, Data Cleaning and Quality Assurance in R, you’ll gain the ability to handle inconsistent, real-world datasets and transform them into reliable and analyzable formats.

First, you’ll explore strategies to identify and address missing data, including summarizing patterns of missingness and imputing values using statistical and conditional techniques. Next, you’ll discover how to detect and manage outliers in both numerical and categorical data using visualizations, statistical methods, and targeted replacements. Finally, you’ll learn how to ensure data consistency by converting data types, standardizing units, and implementing validation checks to maintain data integrity. When you’re finished with this course, you’ll have the data cleaning and preparation skills needed to confidently preprocess datasets for analysis.
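
To give a feel for the kind of workflow described above, here is a minimal sketch in base R. It is not taken from the course materials: the toy data frame, the column names, the 1.5 x IQR outlier rule, median imputation, and the plausible height range are all assumptions chosen for illustration.

```r
# Hypothetical toy data: heights in cm that arrived as text,
# with one missing value and one likely data-entry error (1750).
df <- data.frame(
  id = 1:6,
  height = c("170", "165", NA, "180", "1750", "172"),
  stringsAsFactors = FALSE
)

# Convert the data type: text to numeric.
df$height <- as.numeric(df$height)

# Summarize missing values per column.
n_missing <- colSums(is.na(df))

# Flag outliers with the 1.5 * IQR rule (an assumed choice of method).
q <- unname(quantile(df$height, c(0.25, 0.75), na.rm = TRUE))
iqr <- q[2] - q[1]
is_outlier <- !is.na(df$height) &
  (df$height < q[1] - 1.5 * iqr | df$height > q[2] + 1.5 * iqr)

# Targeted replacement: treat flagged values as missing,
# then impute every gap with the median of the remaining values.
df$height[is_outlier] <- NA
df$height[is.na(df$height)] <- median(df$height, na.rm = TRUE)

# Validation checks: no gaps left, values inside an assumed plausible range.
stopifnot(!anyNA(df$height), all(df$height > 50 & df$height < 250))
```

Median imputation is used here only because it is simple and robust to the remaining extreme values; on real data the choice of imputation and outlier rule depends on the variable and the analysis that follows.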

About the author

Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework. After spending years working in tech in the Bay Area, New York, and Singapore at companies such as Microsoft, Google, and Flipkart, Janani decided to combine her love for technology with her passion for teaching. She is now the co-founder of Loonycorn, a content studio focused on providing ...