Handling Fast Data with Apache Spark SQL and Streaming
Apache Spark is a leader in enabling quick and efficient data processing. This course will teach you how to use Spark's SQL, Streaming, and newer Structured Streaming APIs to create applications that handle data as it arrives.
What you'll learn
Analyzing data used to be something you did once a night. Now you need to process data on the fly so you can provide up-to-the-minute insights. But how do you accomplish in real time what used to take hours, without a complicated code base? In this course, Handling Fast Data with Apache Spark SQL and Streaming, you'll learn to use the Apache Spark Streaming and SQL libraries to handle this new world of real-time, fast data processing. First, you'll dive into Spark SQL. Next, you'll explore how to catch potential fraud by analyzing streams with Spark Streaming. Finally, you'll discover the newer Structured Streaming API. By the end of this course, you'll have a deeper understanding of these APIs, along with a number of the streaming concepts that have driven their design.
Table of contents
- Introduction 2m
- The Streaming Landscape 5m
- Introducing Kafka 10m
- Understanding Spark Streaming's Mechanics 4m
- Streaming in Action 8m
- More of the Streaming API 4m
- The DStream 'RDD' API 4m
- About Stateful Streaming: Windows and Checkpoints 7m
- Utilizing State for Speedy Fraud Detection 12m
- An Improved Stateful Stream via mapWithState 7m
- The Streaming UI 2m
- Resources 2m
- Summary 1m
- Introduction 1m
- Increasing Stream Resiliency 7m
- Optimizing to Boost Performance: Streaming 5m
- Optimizing to Boost Performance: SQL 4m
- Introduction to Structured Streaming 5m
- A Deeper Dive into Structured Streaming 8m
- Structured Streaming: Watermarks and Output Models 8m
- Structured Streaming Demo 11m
- The Future: Spark 2.x 8m
- Resources 2m
- Summary 1m