Processing Streaming Data Using Apache Flink
Apache Flink is built on a stream-first architecture, in which the stream is the source of truth.
What you'll learn
Flink is a stateful, fault-tolerant, large-scale system that works with both bounded and unbounded datasets using the same underlying stream-first architecture. In this course, Processing Streaming Data Using Apache Flink, you will integrate your Flink applications with real-time Twitter feeds to perform analysis on high-velocity streams.
First, you’ll see how you can set up a standalone Flink cluster using virtual machines on a cloud platform. Next, you will install and work with the Apache Kafka reliable messaging service.
Finally, you will perform a number of transformation operations on Twitter streams, including windowing and join operations.
When you are finished with this course, you will have the skills and knowledge to work with high-volume, high-velocity data using Flink and to integrate Flink with Apache Kafka to process streaming data.
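To give a feel for the windowing operations mentioned above, here is a minimal sketch of an event-time tumbling window written in plain Java, with no Flink dependency. It is not taken from the course and does not use Flink's API. The `TumblingWindowSketch` class, the `HashtagEvent` type, and the `countPerWindow` method are hypothetical names introduced only for illustration. The sketch assigns each event to a fixed-size, non-overlapping window based on its timestamp and counts hashtags per window.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TumblingWindowSketch {

    /** Hypothetical event type: a hashtag observed at an event time (epoch millis). */
    public static final class HashtagEvent {
        final String hashtag;
        final long eventTimeMillis;

        public HashtagEvent(String hashtag, long eventTimeMillis) {
            this.hashtag = hashtag;
            this.eventTimeMillis = eventTimeMillis;
        }
    }

    /**
     * Assigns each event to a tumbling window keyed by the window's start time
     * and counts how often each hashtag occurs in that window.
     * Assumption: windows are aligned to multiples of windowSizeMillis.
     */
    public static Map<Long, Map<String, Integer>> countPerWindow(
            List<HashtagEvent> events, long windowSizeMillis) {
        Map<Long, Map<String, Integer>> windows = new TreeMap<>();
        for (HashtagEvent e : events) {
            // Round the timestamp down to the start of its window.
            long windowStart =
                e.eventTimeMillis - Math.floorMod(e.eventTimeMillis, windowSizeMillis);
            windows.computeIfAbsent(windowStart, k -> new TreeMap<>())
                   .merge(e.hashtag, 1, Integer::sum);
        }
        return windows;
    }

    public static void main(String[] args) {
        List<HashtagEvent> events = List.of(
            new HashtagEvent("#flink", 1_000L),
            new HashtagEvent("#kafka", 4_000L),
            new HashtagEvent("#flink", 9_999L),
            new HashtagEvent("#flink", 10_000L)); // falls into the next 10-second window
        System.out.println(countPerWindow(events, 10_000L));
    }
}
```

In a real Flink application the framework handles window assignment, state, and late or out-of-order events. This sketch processes a finite list in memory and only illustrates how a timestamp maps to a window.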
Table of contents
- Version Check 0m
- Prerequisites and Course Outline 3m
- Deployment Modes in Apache Flink 6m
- Standalone Cluster 3m
- Demo: Provisioning VMs for the Flink Cluster 4m
- Demo: Creating Public Private Key Pairs for Passwordless SSH 5m
- Demo: Setting up Passwordless SSH 4m
- Demo: Install Flink on Cluster Nodes 3m
- Demo: Configure Cluster Settings 5m
- Demo: Configuring Firewall Rules 2m
- Demo: Starting the Job Manager and Task Manager Processes 2m
- Demo: Running a Sample Application on the Flink Cluster 2m
- Demo: Setting up a Maven Project 3m
- Demo: Building a Jar for the Flink Application 3m
- Demo: Running a Custom Application on the Flink Cluster 5m
- Introducing Apache Kafka 2m
- Publishers, Consumers, and Topics in Kafka 6m
- Demo: Installing and Running Apache Kafka 3m
- Demo: Kafka Producers, Consumers, and Topics 3m
- Demo: Integrating a Flink Application with Kafka 6m
- Demo: Reading from and Writing to Kafka in a Flink Application 6m
- Demo: Creating and Accessing Developer Keys for Twitter 5m
- Demo: Reading from the Twitter API and Publishing to a Kafka Topic 7m
- Demo: Extracting Hashtags from Twitter Messages 5m
- Demo: Reading from Multiple Kafka Topics 6m
- Demo: Connecting to the Twitter Source Directly from a Flink Application 6m
- Windowing and the Notion of Time 6m
- Demo: Extracting Event Time and Associating Processing Time 6m
- Demo: Event Time Tumbling and Sliding Windows 6m
- Demo: Event Time Global and Session Windows 3m
- Demo: Processing Time Tumbling Windows 3m
- Backpressure in Streaming 4m
- Demo: Monitoring Back Pressure 6m
- Demo: Using Rich Functions to Perform Stateful Operations 8m
- A Quick Introduction to Joins 4m
- Demo: Performing Join Operations on Streams 8m
- Introducing Elasticsearch 4m
- Demo: Installing Elasticsearch and Creating an Index 3m
- Demo: Implementing a Branching Pipeline to Write to Elasticsearch and Kafka 4m
- Demo: Writing Data out to an Elasticsearch Index 4m
- Introducing Metrics 3m
- Demo: Using Counter Metrics to Monitor Applications 8m
- Demo: Unit Testing 8m
- Demo: Testing Using MiniClusterWithClientResource 5m
- Summary and Further Study 2m