Getting Started with Hive for Relational Database Developers
Traditional databases focus on transactional processing, whereas Hive is built for analytical processing over huge datasets. This course focuses on the similarities and differences between SQL and Hive.
What you'll learn
Transactional processing focuses on accessing and updating individual records. Analytical processing works on data in bulk and deals more with summaries across the dataset, trends, and insights. These differences in requirements, and in the kind of data each works on, lead to differences between Hive and traditional databases. This course, Getting Started with Hive for Relational Database Developers, teaches you about several gotchas that arise when using familiar SQL constructs in Hive. You'll learn about loading and parsing data from files, views, subqueries, and some cool built-in functionality such as table-generating functions. The course also demonstrates the constraints imposed by Hive architecture choices such as schema-on-read, denormalized storage in HDFS, and high-latency operations. Understanding these constraints helps guide your choices when storing and querying data. By the end of this course, you'll feel confident using Hive with your relational database background.
Table of contents
- Data and Metadata Stored in Hive 4m
- Exploring the Hive Warehouse Directory 5m
- Managed vs. External Table 2m
- Creating External Tables in Hive 7m
- The Alter Table Command 4m
- Working with Temporary Tables 3m
- Loading Data from Files and Existing Tables 6m
- Multi-table Insert and Deleting Data from Tables 3m
- An Overview of Partitioning and Bucketing 5m