Hadoop for .NET Developers
Big Data is becoming part of every company, and Hadoop is the core technology for storing and accessing huge quantities of data. This course will teach you how to use Hadoop in the Microsoft world - running on Windows and using .NET to write queries.
What you'll learn
Big Data is an established discipline, and companies can improve their products and services by storing and analyzing large amounts of data. Hadoop is the key technology in Big Data, but it is too often seen as something only for Java and Linux developers. This course, Hadoop for .NET Developers, will teach you how to use Hadoop with the Microsoft stack. First, you'll learn how to bring Hadoop into a Microsoft environment. You'll also discover how to run the services on Windows and create fast, understandable MapReduce queries in .NET using C#. The course takes a proof-of-concept approach, demonstrating how to evaluate Hadoop on Windows with .NET and .NET Core. By the end of the course, you'll be able to start your own Hadoop journey with confidence.
Table of contents
- Introducing the Course 2m
- HDFS – the Hadoop Distributed File System 3m
- Storing and Accessing Data with HDFS 5m
- YARN – Yet Another Resource Negotiator 4m
- Querying Hadoop Data with YARN 4m
- The MapReduce Programming Style 4m
- Building a MapReduce Program in Java 4m
- How Hadoop Scales Out 4m
- How Hadoop Handles Failure 4m
- Module Summary 3m
- Running Hadoop on Windows 3m
- Hadoop in Docker 3m
- Running a Dockerized Hadoop Cluster 4m
- Hortonworks Data Platform 3m
- Installing and Running HDP 4m
- Syncfusion Big Data Platform 2m
- Installing and Running Syncfusion 4m
- Hadoop on Azure HDInsight 3m
- Creating and Using Hadoop on HDInsight 5m
- Module Summary 4m
- Writing MapReduce Jobs in .NET 2m
- The Hadoop Streaming API 3m
- .NET Framework Mappers and Reducers 5m
- Shipping Dependencies with Streaming 3m
- Hadoop Streaming with .NET Framework Apps 4m
- The Microsoft Hadoop SDK for .NET 2m
- Hadoop Streaming with the .NET SDK 6m
- .NET Core Mappers and Reducers 3m
- Hadoop Streaming with .NET Core Apps 4m
- Module Summary 5m
- Enhancing MapReduce Performance and Traceability 2m
- The Hadoop Distributed Cache 3m
- Using the Distributed Cache with .NET Streaming Jobs 4m
- Combiners in MapReduce Jobs 3m
- Running a .NET Combiner in Hadoop Streaming 2m
- Using Multiple Reducer Tasks in a Job 3m
- Configuring Streaming Jobs to use Multiple Reducers 3m
- Recording Progress with Hadoop Job Counters 2m
- Incrementing Counters and Logging from .NET 4m
- Module Summary 4m
- Introducing the Hadoop Ecosystem 2m
- Hive – the SQL Facade Over Big Data 2m
- Using Hive with HDFS and .NET Core 5m
- HBase – the Real-time Big Data NoSQL Database 3m
- Using HBase and Accessing HBase from Hive 4m
- Thrift – the Language-neutral Server Definition 2m
- Accessing HBase from .NET Using Thrift 4m
- Spark – In-memory Fast Querying for Hadoop 2m
- Using Spark with Python and Jupyter 4m
- Module and Course Summary 3m
Course FAQ
Hadoop is an open source platform for storing and processing huge quantities of data using a cluster of ordinary servers.
This Hadoop .NET course is designed for .NET and .NET Core developers who want to shift into big data.
The course starts with an overview of Hadoop and shows how the storage layer, HDFS, and the processing layer, YARN, work together to query terabytes of data.
This course will cover options for running Hadoop on Windows using Docker, and then with native Windows packages from Hortonworks and Syncfusion.
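As a rough illustration of the MapReduce style that Hadoop Streaming relies on, here is a minimal word-count sketch. It is written in Python rather than C# purely for brevity and is not taken from the course. The function names (`map_lines`, `reduce_lines`, `run_local`) are hypothetical, and the sort step only simulates in memory the shuffle that Hadoop performs between the map and reduce phases. In a real streaming job, the mapper and reducer would be separate executables that read lines from standard input and write tab-separated key/value lines to standard output.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' line for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_lines):
    """Reducer: input must be sorted by key; sum the counts for each key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{key}\t{total}"


def run_local(lines):
    """Local stand-in for a job: map, sort (simulated shuffle), then reduce."""
    return list(reduce_lines(sorted(map_lines(lines))))


if __name__ == "__main__":
    for result in run_local(["big data big", "Data"]):
        print(result)
```

Because the contract between the stages is only lines of text, the same mapper and reducer logic could be written as .NET console applications, which is the approach the streaming modules in the table of contents describe.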