What is Hadoop?

Hadoop is a platform written in Java for processing large amounts of data. The Hadoop ecosystem includes many tools that make big data processing easier.

Let’s learn how to do that end to end.

Objective:

Over the past few years, Hadoop and Spark have seen enormous industry adoption, yet the market faces a shortage of people skilled in them. To help bridge that gap, we have designed this course around industry expectations, with real-world examples. It will help you understand a variety of big data application development options, build your own applications, and performance-tune them.

Course Overview:

  • Introduction to Hadoop and Big Data Projects.
  • Hadoop Architecture: an in-depth tour.
  • HBase.
  • Welcome to Spark.
  • Programming with RDD.
  • Spark SQL & DataFrames.
  • Spark Job Execution.
  • Cluster Architecture for Spark.
  • Introduction to Kafka.
  • Introduction to Spark Streaming.

Recorded Sessions:

  • Pig & Hive
  • MapReduce 1.0 & YARN
  • Sqoop & Flume
  • Oozie & ZooKeeper

Intro Session (Tamil)

Module 1: Welcome to Spark:

  • Welcome to the world of Spark.
  • Bye-bye Hadoop? (Hadoop vs. Spark).
  • Spark Components:
    • Spark Core
    • Spark SQL
    • GraphX
    • MLlib
  • Real-world Spark use cases.

Hands on:

  • Installing and configuring Spark on your machine.
  • Running a sample program in Spark.
  • Executing a Spark use case.

Module 2: Programming with RDD:

  • What is an RDD?
  • Why RDD?
  • How an RDD gets executed in a Spark application.
  • Transformations in RDD.
  • Actions in RDD.
  • RDD programming APIs.

Hands on:

  • Creating an RDD from a data file.
  • Applying transformations & actions in RDD.
  • Interactive queries using RDD.
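
To give a feel for the difference between transformations and actions covered in this module, here is a minimal sketch in plain Python. It is a toy model, not the Spark API: the class ToyRDD and its methods are hypothetical names invented for illustration, and everything runs on a single machine. The point it shows is that transformations (map, filter) only record what should be done, while an action (collect, count) actually runs the recorded steps.

```python
class ToyRDD:
    """A tiny single-machine stand-in for an RDD (hypothetical, not Spark)."""

    def __init__(self, source, steps=()):
        self._source = source  # the underlying data (a list here)
        self._steps = steps    # recorded transformations, not yet executed

    # --- Transformations: return a new ToyRDD, run nothing ---
    def map(self, fn):
        return ToyRDD(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._source, self._steps + (("filter", pred),))

    # --- Internal: replay the recorded steps over the source ---
    def _run(self):
        data = iter(self._source)
        for kind, fn in self._steps:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return data

    # --- Actions: trigger execution and return a result ---
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


calls = []  # records every invocation of square, to make laziness visible


def square(x):
    calls.append(x)
    return x * x


numbers = ToyRDD([1, 2, 3, 4, 5])

# Two transformations chained together; nothing is computed yet.
evens_squared = numbers.map(square).filter(lambda x: x % 2 == 0)
calls_before_action = len(calls)

# The action triggers the whole pipeline.
result = evens_squared.collect()

print("calls before action:", calls_before_action)
print("result:", result)
```

In this toy, square is not called at all until collect() runs. Real Spark RDDs follow the same idea at cluster scale, with partitioning, scheduling, and fault tolerance added on top, all of which this sketch leaves out.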

Module 3: Spark SQL/DataFrames:

  • Spark SQL/DataFrame uses.
  • DataFrame/SQL APIs.
  • Spark & Hive Integration.
  • Catalyst query optimization.

Hands on:

  • Create a DataFrame from a file.
  • Create a DataFrame from a table.
  • Caching and reusing DataFrames.
  • Query with the DataFrame API and SQL.

Module 4: Spark Execution & Optimization:

  • Jobs, stages & tasks.
  • Partitions and Shuffles.
  • Data locality.
  • Job Performance (tuning).

Hands on:

  • Visualizing DAG execution.
  • Measuring memory usage.
  • Understanding performance.

Module 5: Introduction to Kafka:

  • Introduction to Kafka.
  • Kafka architecture.
  • Producers and consumers in Kafka.
  • Working with Kafka.

Hands on:

  • Installing & configuring Kafka.
  • Producing and consuming messages.

Module 6: Spark Streaming:

  • Introduction to Spark Streaming.
  • DStream APIs and stateful streams.
  • Reliability and fault recovery.

Hands on:

  • Creating a DStream from a source.
  • Integration of Kafka and Spark Streaming.
  • Developing a Kafka-Spark application.
  • Viewing streaming jobs in the web UI.

Module 7: HBase:

  • HBase introduction.
  • HBase overview.
  • HBase Java/Scala implementation.

Module 8: MapReduce, Flume, Oozie, Sqoop, Hive:

All of these will be available as recorded sessions, with discussion time for queries.

This course is for:

  • Professionals who want to learn to develop Hadoop & Spark applications.
  • Professionals who want to pursue certification (Hortonworks: HDPCD, HDPCD:Spark) (Cloudera: CCA175, CCA159).
  • Anyone interested in learning the latest technology to advance their career.

INTRODUCTION VIDEO TO COURSE (DEMO)

Demo Class By Usman

APACHE SPARK INTRODUCTION

