Developing Spark Applications Using Scala & Cloudera
Apache Spark is one of the fastest and most efficient general-purpose engines for large-scale data processing. In this course, you'll learn how to develop Spark applications for your Big Data using Scala and a stable Hadoop distribution, Cloudera CDH.
What you'll learn
At the core of working with large-scale datasets is a thorough knowledge of Big Data platforms like Apache Spark and Hadoop. In this course, Developing Spark Applications Using Scala & Cloudera, you'll learn how to process data at scales you previously thought were out of your reach. First, you'll learn the technical details of how Spark works. Next, you'll explore the RDD API, the original core abstraction of Spark. Then, you'll discover how to become more proficient using Spark SQL and DataFrames. Finally, you'll learn to work with Spark's typed API: Datasets. When you're finished with this course, you'll have a foundational knowledge of Apache Spark with Scala and Cloudera that will help you as you move forward to develop large-scale data applications and work with Big Data efficiently.
Table of contents
- Getting an Environment & Data: CDH + StackOverflow 2m
- Prerequisites & Known Issues 2m
- Upgrading Cloudera Manager and CDH 6m
- Installing or Upgrading to Java 8 (JDK 1.8) 4m
- Getting Spark - There Are Several Options: 1.6 3m
- Getting Spark 2 Standalone 3m
- Installing Spark 2 on Cloudera 6m
- Data: StackOverflow & StackExchange Dumps + Demo Files 3m
- Preparing Your Big Data 4m
- Takeaway 2m
- Refreshing Your Knowledge: Scala Fundamentals for This Course 1m
- Scala's History and Overview 2m
- Building and Running Scala Applications 1m
- Creating Self-contained Applications, Including scalac & sbt 5m
- The Scala Shell: REPL (Read Evaluate Print Loop) 1m
- Scala, the Language 4m
- More on Types, Functions, and Operations 2m
- Expressions, Functions, and Methods 1m
- Classes, Case Classes, and Traits 1m
- Flow Control 1m
- Functional Programming 1m
- Enter spark2-shell: Spark in the Scala Shell 1m
- Takeaway 2m
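The Scala refresher above can be condensed into a short, hedged sketch. The names (`Post`, `allTags`, `describe`) are invented for illustration, not taken from the course:

```scala
// A minimal, self-contained sketch of the Scala fundamentals this module
// refreshes: case classes, immutable values, higher-order functions,
// and pattern matching.
object ScalaBasics {
  // Case classes give immutable data with equality and pattern matching for free
  case class Post(id: Int, tags: List[String])

  // A higher-order pipeline, the same shape as Spark's flatMap on RDDs
  def allTags(posts: List[Post]): List[String] =
    posts.flatMap(p => p.tags)

  // Pattern matching deconstructs a case class by shape
  def describe(p: Post): String = p match {
    case Post(id, Nil)  => s"post $id has no tags"
    case Post(id, tags) => s"post $id has ${tags.size} tags"
  }

  def main(args: Array[String]): Unit = {
    val posts = List(Post(1, List("scala", "spark")), Post(2, List("sql")))
    println(allTags(posts).mkString(", "))
    posts.foreach(p => println(describe(p)))
  }
}
```

These same building blocks (lambdas passed to `flatMap`/`map`, case classes as typed records) reappear throughout the Spark API, which is why the course front-loads them.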
- Understanding Spark: An Overview 3m
- Spark, Word Count, Operations, and Transformations 2m
- A Few Words on Fine-Grained Transformations and Scalability 2m
- Word Count in "Not Big Data" 2m
- How Word Count Works, Featuring Coarse-Grained Transformations 4m
- Parallelism by Partitioning Data 3m
- Pipelining: One of the Secrets of Spark's Performance 2m
- Narrow and Wide Transformations 4m
- Lazy Execution, Lineage, Directed Acyclic Graph (DAG), and Fault Tolerance 4m
- Time for the Big Picture: Spark Libraries 2m
- Takeaway 1m
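The word count idea this module walks through can be sketched as it might look in `spark2-shell` (where `sc`, the SparkContext, is already provided; the HDFS path is a placeholder, not from the course):

```scala
// Word count with the RDD API. Every step below is a transformation and is
// lazy: Spark only records the lineage (DAG) until an action runs.
val counts = sc.textFile("hdfs:///path/to/input.txt")   // placeholder path
  .flatMap(line => line.split("\\s+"))  // narrow transformation: no shuffle
  .map(word => (word, 1))               // coarse-grained: applied to every element
  .reduceByKey(_ + _)                   // wide transformation: shuffles by key

// An action triggers the whole DAG; the narrow steps are pipelined together
counts.take(5).foreach(println)
```

The narrow steps (`flatMap`, `map`) run back-to-back on each partition without moving data, which is the pipelining the module describes; only `reduceByKey` forces a shuffle.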
- Getting Technical: Spark Architecture 3m
- Storage in Spark and Supported Data Formats 3m
- Let's Talk APIs: Low Level and High Level Spark APIs 5m
- Performance Optimizations: Tungsten and Catalyst 3m
- SparkContext and SparkSession: Entry Points to Spark Apps 4m
- Spark Configuration + Client and Cluster Deployment Modes 6m
- Spark on YARN: The Cluster Manager 3m
- Spark with Cloudera Manager and YARN UI 4m
- Visualizing Your Spark App: Web UI and History Server 8m
- Logging in with Spark and Cloudera 2m
- Navigating the Spark and Cloudera Documentation 4m
- Takeaway 1m
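The two entry points this module covers can be sketched as follows (the app name and config values are invented examples, not the course's settings):

```scala
import org.apache.spark.sql.SparkSession

// SparkSession is the unified entry point since Spark 2.x;
// it wraps the older SparkContext.
val spark = SparkSession.builder()
  .appName("StackOverflowAnalysis")        // placeholder app name
  .master("yarn")                          // the cluster manager on CDH; "local[*]" on a laptop
  .config("spark.executor.memory", "2g")   // example setting; tune for your cluster
  .getOrCreate()

// The low-level entry point is still reachable, and the RDD API uses it
val sc = spark.sparkContext
```

In `spark2-shell` on Cloudera you don't build this yourself: the shell hands you `spark` and `sc` preconfigured, which is why the deployment-mode and configuration lessons matter more for self-contained applications.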
- Learning the Core of Spark: RDDs 2m
- SparkContext: The Entry Point to a Spark Application 4m
- RDD and PairRDD - Resilient Distributed Datasets 4m
- Creating RDDs with Parallelize 4m
- Returning Data to the Driver, i.e. collect(), take(), first()... 4m
- Partitions, Repartition, Coalesce, Saving as Text, and HUE 3m
- Creating RDDs from External Datasets 10m
- Saving Data as ObjectFile, NewAPIHadoopFile, SequenceFile, ... 6m
- Creating RDDs with Transformations 3m
- A Little Bit More on Lineage and Dependencies 1m
- Takeaway 2m
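As a hedged sketch of the RDD basics above, run in `spark2-shell` where `sc` already exists (the output path is a placeholder):

```scala
// parallelize turns a local collection into a distributed RDD
val rdd = sc.parallelize(1 to 100, numSlices = 4)
rdd.getNumPartitions                  // 4

// Transformations are lazy...
val doubled = rdd.map(_ * 2)

// ...and actions return data to the driver
doubled.take(3)                       // Array(2, 4, 6)
doubled.first()                       // 2

// Reduce the partition count without a shuffle, then save as text
doubled.coalesce(2).saveAsTextFile("hdfs:///path/to/output")  // placeholder path
```

`collect()` would bring the entire RDD to the driver, so for large datasets the course's advice to prefer `take(n)` while exploring is worth keeping in mind.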
- Going Deeper into Spark Core 1m
- Functional Programming: Anonymous Functions (Lambda) in Spark 2m
- A Quick Look at Map, FlatMap, Filter, and Sort 5m
- How Can I Tell It Is a Transformation? 1m
- Why Do We Need Actions? 1m
- Partition Operations: MapPartitions and PartitionBy 6m
- Sampling Your Data 2m
- Set Operations: Join, Union, Full Right, Left Outer, and Cartesian 5m
- Combining, Aggregating, Reducing, and Grouping on PairRDDs 9m
- ReduceByKey vs. GroupByKey: Which One Is Better? 1m
- Grouping Data into Buckets with Histogram 3m
- Caching and Data Persistence 2m
- Shared Variables: Accumulators and Broadcast 5m
- What's Needed for Developing Self-contained Spark Applications 2m
- Disadvantages of RDDs - So What's Better? 1m
- Takeaway 2m
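The PairRDD aggregation trade-off highlighted above (`reduceByKey` vs. `groupByKey`) and the shared variables can be sketched like this in `spark2-shell` (the data and names are invented):

```scala
val pairs = sc.parallelize(Seq(("scala", 1), ("sql", 2), ("scala", 3)))

// reduceByKey combines values map-side before the shuffle: usually the better choice
val sums = pairs.reduceByKey(_ + _)

// groupByKey ships every value across the network first, then aggregates:
// same result, more shuffle traffic
val sums2 = pairs.groupByKey().mapValues(_.sum)

// Shared variables:
// a broadcast is a read-only value cached once per executor...
val stopWords = sc.broadcast(Set("the", "a"))
// ...and an accumulator aggregates counters back to the driver
val seen = sc.longAccumulator("records seen")
```

Both `sums` and `sums2` produce the same pairs, which is exactly why the module frames the choice as a performance question rather than a correctness one.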
- Increasing Proficiency with Spark: DataFrames & Spark SQL 1m
- "Everyone" Uses SQL and How It All Began 3m
- Hello DataFrames and Spark SQL 3m
- SparkSession: The Entry Point to the Spark SQL / DataFrame API 2m
- Creating DataFrames 2m
- DataFrames to RDDs and Vice Versa 3m
- Loading DataFrames: Text and CSV 2m
- Schemas: Inferred and Programmatically Specified + Option 5m
- More Data Loading: Parquet and JSON 4m
- Rows, Columns, Expressions, and Operators 2m
- Working with Columns 2m
- More Columns, Expressions, Cloning, Renaming, Casting, & Dropping 4m
- User Defined Functions (UDFs) on Spark SQL 3m
- Takeaway 2m
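A hedged sketch of the DataFrame operations this module covers (`spark` is the SparkSession; the CSV path, column names, and UDF are invented for illustration):

```scala
import org.apache.spark.sql.functions._

// Load a CSV, letting Spark infer the schema
// (or specify one programmatically with a StructType)
val posts = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("hdfs:///path/to/posts.csv")      // placeholder path

// Column expressions: casting, renaming, dropping
val clean = posts
  .withColumn("score", col("score").cast("int"))
  .withColumnRenamed("creation_date", "created")
  .drop("extra_column")

// A user-defined function applied to a column
val shout = udf((s: String) => if (s == null) null else s.toUpperCase)
clean.select(shout(col("title")).as("title_upper")).show(5)

// DataFrame -> RDD when you need the low-level API back
val asRdd = clean.rdd
```

UDFs are opaque to the Catalyst optimizer, so the built-in `functions._` should be preferred when one exists; the course's optimization lessons (Tungsten and Catalyst) explain why.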
- Querying, Sorting, and Filtering DataFrames: The DSL 5m
- What to Do with Missing or Corrupt Data 4m
- Saving DataFrames 5m
- Spark SQL: Querying Using Temporary Views 4m
- Loading Files and Views into DataFrames Using Spark SQL 2m
- Saving to Persistent Tables + Spark 2 Known Issue 2m
- Hive Support and External Databases 5m
- Aggregating, Grouping, and Joining 5m
- The Catalog API 1m
- Takeaway 2m
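The Spark SQL workflow above can be sketched as follows, assuming a DataFrame named `posts` has already been loaded (view, table, and column names are invented):

```scala
// Register a DataFrame as a temporary view, then query it with SQL
posts.createOrReplaceTempView("posts")

spark.sql("""
  SELECT owner_id, COUNT(*) AS n, AVG(score) AS avg_score
  FROM posts
  WHERE score IS NOT NULL
  GROUP BY owner_id
  ORDER BY n DESC
""").show(10)

// Save to a persistent table, then inspect it through the Catalog API
posts.write.mode("overwrite").saveAsTable("posts_archive")
spark.catalog.listTables().show()
```

Temporary views live only for the session, while `saveAsTable` persists through the Hive metastore on CDH, which is the distinction the persistent-tables and Hive-support lessons build on.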