Big Data Analytics with PySpark
PySpark combines the versatility of Python with the parallel computing power of Spark. This course will get you started with big data analytics using this popular API.
What you'll learn
PySpark combines the powerful parallel computing platform of Spark with the very popular Python language to enable the analysis of large datasets.
In this course, Big Data Analytics with PySpark, you’ll learn to transform and analyze large datasets with this popular API.
First, you’ll explore ingesting and transforming datasets with PySpark.
Next, you’ll discover how to analyze and aggregate datasets and how to optimize the performance of PySpark operations.
Finally, you’ll learn how to visualize and export the results of your analysis.
When you’re finished with this course, you’ll have the skills and knowledge of PySpark needed to tackle your next big data analytics project.
About the author
Warner is a SQL Server Certified Master, MVP, and Principal Consultant at Pythian. He manages clients across many industries and leads a talented team that maintains and innovates on those clients' data solutions. When he's not working in Ottawa, Ontario, he can be found in his home country of Costa Rica.