IT Governance

IT governance is a framework that ensures your organisation’s IT infrastructure supports and enables the achievement of its corporate strategy and goals. Alan Calder’s IT Governance: A Pocket Guide gives a complete definition. An IT governance framework defines how IT governance is implemented, managed, and monitored within an organisation. ISO/IEC 38500:2015 is the recognised standard for IT governance. It establishes a clear framework for the board’s oversight of information and communications technology and serves as a valuable reference for IT governance practitioners around the globe.

Curriculum

Understanding Big Data and Hadoop

  • Introduction to Big Data & Big Data Challenges 
  • Limitations & Solutions of Big Data Architecture
  • Hadoop & its Features
  • Hadoop Ecosystem
  • Hadoop 2.x Core Components 
  • Hadoop Storage: HDFS (Hadoop Distributed File System)
  • Hadoop Processing: MapReduce Framework
  • Different Hadoop Distributions

Hadoop Architecture and HDFS

  • Hadoop 2.x Cluster Architecture 
  • Federation and High Availability Architecture 
  • Typical Production Hadoop Cluster
  • Hadoop Cluster Modes
  • Common Hadoop Shell Commands 
  • Hadoop 2.x Configuration Files
  • Single-Node Cluster & Multi-Node Cluster Setup
  • Basic Hadoop Administration

Hadoop MapReduce Framework

  • Traditional way vs MapReduce way
  • Why MapReduce 
  • YARN Components
  • YARN Architecture
  • YARN MapReduce Application Execution Flow
  • YARN Workflow
  • Anatomy of MapReduce Program 
  • Input Splits, Relation between Input Splits and HDFS Blocks
  • MapReduce: Combiner & Partitioner
  • Demo of Health Care Dataset
  • Demo of Weather Dataset

Advanced Hadoop MapReduce

  • Counters
  • Distributed Cache
  • MRUnit
  • Reduce Join 
  • Custom Input Format 
  • Sequence Input Format
  • XML File Parsing using MapReduce

Apache Pig

  • Introduction to Apache Pig 
  • MapReduce vs Pig
  • Pig Components & Pig Execution
  • Pig Data Types & Data Models in Pig
  • Pig Latin Programs 
  • Shell and Utility Commands
  • Pig UDF & Pig Streaming
  • Testing Pig Scripts with PigUnit
  • Aviation Use Case in Pig
  • Pig Demo of Healthcare Dataset

Apache Hive

  • Introduction to Apache Hive 
  • Hive vs Pig
  • Hive Architecture and Components 
  • Hive Metastore
  • Limitations of Hive
  • Comparison with Traditional Database
  • Hive Data Types and Data Models
  • Hive Partition
  • Hive Bucketing
  • Hive Tables (Managed Tables and External Tables)
  • Importing Data
  • Querying Data & Managing Outputs
  • Hive Script & Hive UDF
  • Retail Use Case in Hive
  • Hive Demo on Healthcare Dataset

Advanced Apache Hive and HBase

  • HiveQL: Joining Tables, Dynamic Partitioning
  • Custom MapReduce Scripts
  • Hive Indexes and Views
  • Hive Query Optimizers
  • Hive Thrift Server
  • Hive UDF 
  • HBase vs RDBMS
  • HBase Components
  • HBase Architecture 
  • HBase Run Modes
  • HBase Configuration
  • HBase Cluster Deployment

Advanced Apache HBase

  • HBase Data Model 
  • HBase Shell
  • HBase Client API
  • HBase Data Loading Techniques
  • Apache ZooKeeper Introduction
  • ZooKeeper Data Model
  • ZooKeeper Service
  • HBase Bulk Loading 
  • Getting and Inserting Data
  • HBase Filters

Processing Distributed Data with Apache Spark

  • What is Spark 
  • Spark Ecosystem
  • Spark Components 
  • What is Scala 
  • Why Scala
  • SparkContext
  • Spark RDD

Oozie and Hadoop Project

  • Oozie 
  • Oozie Components
  • Oozie Workflow
  • Scheduling Jobs with Oozie Scheduler
  • Demo of Oozie Workflow
  • Oozie Coordinator 
  • Oozie Commands
  • Oozie Web Console
  • Oozie for MapReduce
  • Combining flow of MapReduce Jobs
  • Hive in Oozie
  • Hadoop Project Demo
  • Hadoop Talend Integration