Creating Your First Big Data Hadoop Cluster Using Cloudera CDH

by Xavier Morera

Data by itself has no meaning; it is what you do with it that counts. In this course, you'll take the fast track to Hadoop and Big Data with the Cloudera QuickStart VM, and then you'll learn how to set up a Hadoop cluster with Cloudera CDH.

What you'll learn

"Ask Bigger Questions" is Cloudera's vision. You may not be familiar with this phrase, but you're likely familiar with "Knowledge is Power". To get knowledge, you need to analyze and understand huge amounts of structured and unstructured data: Big Data. In this course, Creating Your First Big Data Hadoop Cluster Using Cloudera CDH, you'll get started on Big Data with Cloudera, taking your first steps with Hadoop using a pseudo cluster and then moving on to set up your own cluster using CDH, which stands for Cloudera's Distribution including Hadoop. First, you'll explore the case for Hadoop, Big Data, and Cloudera. Next, you'll learn about the fast track to Big Data with Cloudera's QuickStart VM, and you'll also learn how to create a virtualization environment with VirtualBox. Then, you'll discover how to create a clean Linux cluster with CentOS. Finally, you'll follow the steps to install and configure a cluster with the help of Cloudera Manager. By the end of this course, you'll have a Hadoop cluster, and you'll be ready to start your journey to Big Data.

Course FAQ

What are Hadoop clusters and what are they used for?

Hadoop clusters are collections of computers, known as nodes, that are networked together to perform parallel computations on big data sets. Hadoop clusters consist of a network of connected master and slave nodes that run on high-availability, low-cost commodity hardware.

What is Cloudera?

Cloudera is a software company that provides an enterprise data cloud accessible via a subscription. Cloudera is built on open source technology that uses analytics and machine learning to yield insights from data through a secure connection.

What software is needed for this course?

To complete this course, you will need the Cloudera QuickStart VM and Cloudera CDH software.

What are data clusters?

A data cluster is a subgroup of data whose members share similar characteristics and differ significantly from the other clusters in a database. Clusters are usually defined by the statistical technique of cluster analysis.

What will you learn in this Hadoop course?

In this course, you will learn about big data and how to create data clusters. You will also learn how to create a virtualization environment with VirtualBox. Finally, you'll discover how to create a clean Linux cluster with CentOS. By the end of this course, you will have a Hadoop cluster, and you'll be ready to embark on your big data journey.

About the author

Xavier Morera is driven by one passion: taking on the challenge of understanding complex topics and sharing that knowledge with others. He’s currently focused on the transformative fields of AI, machine learning, generative AI, search, and big data. As an entrepreneur, project manager, technical author, and trainer, Xavier brings a diverse set of skills and deep expertise to every project he takes on. He holds multiple certifications with Cloudera, Microsoft, and the Scrum Alliance and has been...
