Building an Enterprise Grade Distributed Online Analytics Platform
In this course, you'll develop an understanding of analytics capabilities and learn how to build a full-blown, holistic, distributed analytics system using Kafka, Cassandra, Storm, and Elasticsearch.
What you'll learn
In this course, Building an Enterprise Grade Distributed Online Analytics Platform, you'll learn how to build a full-blown distributed analytics system using Kafka, Cassandra, Storm, and Elasticsearch. First, you'll begin by understanding what online analytics is and how it differs from offline analytics. You'll then discuss and analyze the parts of a modern online analytics system, including the data backbone, storage, processing, and insight generation. Next, you'll develop an understanding of each chosen technology, its features, and why it was chosen for a specific task. Finally, you'll explore how to integrate each technology into your solution in the way that benefits it most. Each technology you use will be examined closely, and you'll see how it provides scalability and fault tolerance and, most importantly, how it contributes to achieving the functionality you desire. By the end of this course, you'll be ready to enrich your enterprise with online analytics capabilities.
Table of contents
- Introduction 2m
- So What Is Big Data Analytics? 3m
- Differentiating Offline Analytics from Online Analytics 6m
- The Roles Portrayed in an Online Analytics System 2m
- The Data Backbone 2m
- Demo - A Naive Data Backbone 6m
- A Naive Data Backbone - Maybe Too Naive? 2m
- The Data Backbone - Recap 1m
- The Computation Role 4m
- Demo - The Computation Layer 4m
- Naive Implementation - The Missing Parts 2m
- The Computation Role - Recap 1m
- The Storage Role 3m
- Demo - The Storage Layer 2m
- Naive Implementation - The Familiar Missing Parts 1m
- The Insights Engine Role 2m
- Demo - The Insights Engine Role 3m
- Naive Implementation - Far from Perfect 1m
- A Word About Boundaries 1m
- Our Requirements from an Online Analytics Platform 2m
- Summary 2m
- Introduction 3m
- Apache Kafka from the Bird's-eye View 4m
- Diving Deeper into Apache Kafka - Design and Architecture 7m
- Work Distribution 3m
- Why Apache Kafka and a Word About Zookeeper 4m
- Demo - Downloading, Configuring, and Running Apache Kafka 7m
- Demo - Distributing Our Apache Kafka Cluster 4m
- Demo - Distributing Zookeeper 7m
- Demo - Utilizing Apache Kafka as a Data Backbone 3m
- Analytics on the Data Backbone 2m
- Summary 1m
- Introduction 2m
- Distributed Storage for Online Analytics 3m
- Introducing Apache Cassandra 5m
- Cassandra Data Organization Abstractions 5m
- Demo - Deploying a Multi-node Cassandra Cluster 5m
- Demo - Deploying a Multi-datacenter Cassandra Cluster 2m
- Demo - Basic CQL Operations 7m
- Demo - Integrating Our Storage Layer 7m
- Summary 2m
- Introduction 2m
- The Role of a Distributed Insight Engine in Online Analytics 4m
- Getting Familiar with Elasticsearch 2m
- Elasticsearch Deployment Abstractions 3m
- Elasticsearch Data Organization Abstractions 2m
- Choosing Elasticsearch Against the Alternatives 1m
- Demo - Deploying an Elasticsearch Cluster 8m
- Demo - Integrating Elasticsearch with Our Analytics System 5m
- Demo - Gathering Insights with Elasticsearch 6m
- Summary 2m