IT Governance - Think It Tech

IT governance is a framework that ensures your organization’s IT infrastructure supports and enables the achievement of its corporate strategy and goals. Alan Calder’s IT Governance: A Pocket Guide gives the complete definition. An IT governance framework defines how IT governance is implemented, managed, and monitored within an organization. ISO/IEC 38500:2015 is the recognized standard for IT governance. It establishes a clear framework for the board’s oversight of information and communications technology and serves as a valuable resource for IT governance practitioners around the globe.

Curriculum

Understanding Big Data and Hadoop

Introduction to Big Data & Big Data Challenges
Limitations & Solutions of Big Data Architecture
Hadoop & its Features
Hadoop Ecosystem
Hadoop 2.x Core Components
Hadoop Storage: HDFS (Hadoop Distributed File System)
Hadoop Processing: MapReduce Framework
Different Hadoop Distributions

Hadoop Architecture and HDFS

Hadoop 2.x Cluster Architecture
Federation and High Availability Architecture
Typical Production Hadoop Cluster
Hadoop Cluster Modes
Common Hadoop Shell Commands
Hadoop 2.x Configuration Files
Single-Node Cluster & Multi-Node Cluster Setup
Basic Hadoop Administration

Hadoop MapReduce Framework

Traditional way vs MapReduce way
Why MapReduce
YARN Components
YARN Architecture
YARN MapReduce Application Execution Flow
YARN Workflow
Anatomy of MapReduce Program
Input Splits, Relation between Input Splits and HDFS Blocks
MapReduce: Combiner & Partitioner
Demo of Health Care Dataset
Demo of Weather Dataset

Advanced Hadoop MapReduce

Counters
Distributed Cache
MRUnit
Reduce Join
Custom Input Format
Sequence Input Format
XML File Parsing using MapReduce

Apache Pig

Introduction to Apache Pig
MapReduce vs Pig
Pig Components & Pig Execution
Pig Data Types & Data Models in Pig
Pig Latin Programs
Shell and Utility Commands
Pig UDF & Pig Streaming
Testing Pig Scripts with PigUnit
Aviation Use Case in Pig
Pig Demo of Healthcare Dataset

Apache Hive

Introduction to Apache Hive
Hive vs Pig
Hive Architecture and Components
Hive Metastore
Limitations of Hive
Comparison with Traditional Database
Hive Data Types and Data Models
Hive Partition
Hive Bucketing
Hive Tables (Managed Tables and External Tables)
Importing Data
Querying Data & Managing Outputs
Hive Script & Hive UDF
Retail Use Case in Hive
Hive Demo on Healthcare Dataset

Advanced Apache Hive and HBase

HiveQL: Joining Tables, Dynamic Partitioning
Custom MapReduce Scripts
Hive Indexes and Views
Hive Query Optimizers
Hive Thrift Server
Hive UDF
HBase vs RDBMS
HBase Components
HBase Architecture
HBase Run Modes
HBase Configuration
HBase Cluster Deployment

Advanced Apache HBase

HBase Data Model
HBase Shell
HBase Client API
Hive Data Loading Techniques
Apache ZooKeeper Introduction
ZooKeeper Data Model
ZooKeeper Service
HBase Bulk Loading
Getting and Inserting Data
HBase Filters

Processing Distributed Data with Apache Spark

What is Spark
Spark Ecosystem
Spark Components
What is Scala
Why Scala
SparkContext
Spark RDD

Oozie and Hadoop Project

Oozie
Oozie Components
Oozie Workflow
Scheduling Jobs with Oozie Scheduler
Demo of Oozie Workflow
Oozie Coordinator
Oozie Commands
Oozie Web Console
Oozie for MapReduce
Combining flow of MapReduce Jobs
Hive in Oozie
Hadoop Project Demo
Hadoop Talend Integration