The Hadoop for Data Analysts training course demonstrates how to manage, manipulate, and query large, complex data sets in real time, using SQL and familiar scripting languages on Hadoop.
The course begins with an introduction to Hadoop basics. Next, it explores how Apache Pig and Apache Hive enable data transformations and analyses via filters, joins, and user-defined functions. The course concludes by examining how to analyze and process data with Pig, and how to optimize Hive.
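As a rough illustration of the three operations named above (filter, join, and user-defined function), the sketch below applies them to a few in-memory records in plain Python. It is not Pig Latin or HiveQL, and it is not taken from the course material; the sample records and the names `normalize_name` and `filter_join_transform` are hypothetical and exist only for this example.

```python
# Hypothetical sample records; not taken from the course material.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 250.0},
    {"order_id": 2, "customer_id": 11, "amount": 40.0},
    {"order_id": 3, "customer_id": 10, "amount": 75.5},
]
customers = [
    {"customer_id": 10, "name": "alice"},
    {"customer_id": 11, "name": "bob"},
]


def normalize_name(name):
    """User-defined function: custom per-value logic, similar in spirit to a UDF."""
    return name.strip().title()


def filter_join_transform(orders, customers, min_amount):
    # Filter: keep only orders at or above the threshold.
    large_orders = [o for o in orders if o["amount"] >= min_amount]

    # Join: inner join on customer_id, using a lookup table keyed by id.
    by_id = {c["customer_id"]: c for c in customers}
    joined = [
        {**o, "name": by_id[o["customer_id"]]["name"]}
        for o in large_orders
        if o["customer_id"] in by_id
    ]

    # UDF: apply the custom function to the name column of each joined row.
    return [{**row, "name": normalize_name(row["name"])} for row in joined]


result = filter_join_transform(orders, customers, min_amount=50.0)
for row in result:
    print(row["order_id"], row["name"], row["amount"])
```

On Hadoop, Pig and Hive express the same steps declaratively and run them across a cluster; the in-memory version here only shows the shape of the logic.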
| Purpose | Learn how to use Hadoop to manage, manipulate, and query large, complex data sets in real time. |
| Audience | This class is targeted at the non-technical data analyst role. Previous experience with a scripting language such as Python is recommended. |
| Role | Software Developer |
| Skill Level | Intermediate |
| Style | Workshops |
| Duration | 3 Days |
| Related Technologies | Java, Hadoop, Apache |
Productivity Objectives
- Understand Hadoop fundamentals
- Know how to use Pig to analyze data
- Understand how to process complex data with Pig
- Troubleshoot Pig
- Know when to use Hive
- Know how to manage data with Hive
- Understand how to optimize Hive