Processing Streaming Data Using Apache Spark Structured Streaming
Structured Streaming is the scalable, fault-tolerant stream processing engine in Apache Spark 2, and it can be used to process high-velocity streams.
What you'll learn
Stream processing applications work with continuously updated data and react to changes in real time. In this course, Processing Streaming Data Using Apache Spark Structured Streaming, you'll focus on integrating your streaming application with the Apache Kafka reliable messaging service to work with real-world data such as Twitter streams.
First, you'll explore how Spark's architecture supports distributed processing at scale. Next, you'll install and work with the Apache Kafka reliable messaging service.
Finally, you'll perform a number of transformation operations on Twitter streams, including windowing and join operations.
When you're finished with this course, you'll have the skills and knowledge to work with high-volume, high-velocity data using Spark and to integrate with Apache Kafka to process streaming data.
Table of contents
- Version Check 0m
- Prerequisites and Course Outline 2m
- Drivers, Workers, Executors, and Tasks 5m
- Introducing Spark Standalone 3m
- High Availability Schemes 5m
- Demo: Install and Set up Spark on Your Local Machine 3m
- Demo: Start Master and Worker Processes 5m
- Demo: Config Files for Worker Nodes 5m
- Demo: Configuring Processing Using Command Line Arguments 2m
- Demo: The Spark Web UI for Monitoring Applications 6m
- Demo: High Availability Configuration with Zookeeper 6m
- Demo: Configuring the Spark Environment Using Config Files 2m
- Security for Spark Clusters 5m
- Backpressure 5m
- Stream-first Architecture 2m
- Introducing Apache Kafka 7m
- Demo: Installing and Setting up Apache Kafka 3m
- Demo: Publishers, Consumers, and Topics 4m
- Demo: Creating a Developer Account on Twitter 5m
- Demo: Connecting to Twitter Using Tweepy 6m
- Demo: Extracting and Counting Hashtags from a Twitter Stream 7m
- Demo: Reading Messages from Multiple Publishers 2m
- Demo: Reading from Multiple Topics 3m
- Demo: Reading from Multiple Topics Using a Regular Expression 4m
- Demo: Performing Sentiment Analysis on Input Tweets 3m
- Demo: Assigning Sentiment Status to Tweets 2m
- Demo: Writing to a Kafka Sink and Foreach Sink 6m