Applying Real-time Processing Using Apache Storm
Storm lets you work with large-scale streaming data using its distributed real-time processing architecture. This course discusses the components of Storm topologies and how to use Storm to apply machine learning in real time.
What you'll learn
Storm is meant to be used for distributed real-time processing, the way Hadoop is used for distributed batch processing. With Storm, you can process information such as trends and breaking news and react to it in real time. In this course, Applying Real-time Processing Using Apache Storm, you'll learn how to apply Storm for real-time processing. First, you'll discover how to set up a data processing pipeline using Storm topologies. Next, you'll explore parallelization by controlling data flows between components. Then, you'll cover how to perform complex data transforms using the Trident API. Finally, you'll learn how to apply machine learning models in real time. By the end of this course, you'll be able to build your own Storm applications for different real-time processing tasks.
Table of contents
- Understanding Parallelism in a Storm Cluster 4m
- Setting up a Remote Cluster 5m
- Running a Topology on a Remote Cluster 5m
- Controlling Data Flow with Stream Grouping 4m
- Building a Word Count Topology 2m
- Implementing the Topology Components 4m
- Contrasting Different Stream Grouping Strategies 4m
- Implementing a Custom Stream Grouping 4m