What is the course curriculum?

The course breaks the entire Apache Spark ecosystem down into step-by-step lessons, so you can work through each concept and component in turn.

Big Data Fundamentals

• Introduction to Big Data – Concepts, Types & Applications
• Traditional Systems vs Hadoop vs Apache Spark

Introduction to Scala

• Scala Basics & the REPL
• Variables & Data Types in Scala
• Control Structures, Command Formats & Execution Operations

Advanced Scala

• Functions, Procedures & OOP Concepts
• Collections & Higher Order Functions
• Anonymous Functions & Higher Order Programming

Apache Spark

• Apache Spark – Introduction, Architecture & Ecosystem
• Applications & Business Derivatives
• MapReduce Comparison

Introduction to RDDs

• RDDs – Introduction, Architecture & Data Loading
• Partitioners & Performance Improvements
• Caching & Persistence

Advanced RDDs

• RDD Operations & Programming
• Job Execution Cycles via RDDs
• Types of RDDs in Apache Spark
• Shared Variables

Spark DataFrame API

• Introduction, Creating & Using DataFrames
• SQLContext
• Running Spark SQL via DataFrames
• Parquet Files
• Integrating Spark with Hive

Spark Streaming

• Introduction to Spark Streaming
• DStreams & Their Features
• Windows & Stateful Operations
• Apache Kafka Integration
• Socket & File Streaming

Spark MLlib

• Introduction to Machine Learning
• Spark MLlib API
• Supervised & Unsupervised Learning
• Machine Learning via Apache Spark

Spark GraphX

• GraphX – Introduction & Usage
• Graph Analysis, Visualization & Computations via Apache Spark

