Big Data & Microsoft Azure

DP-100, DP-203, & DP-900 Exam Prep

 

About the Program

Data are critical for the characterization, calibration, verification, validation, and assessment of models for predicting the long-term structural durability and performance of materials in extreme environments. Data also help organizations understand and improve business processes, reducing wasted money and time. Without adequate data to assess them, many models would have no purpose.

This online program includes an introductory course in Big Data and prepares students to take three Microsoft certification exams: DP-900 Microsoft Azure Data Fundamentals, DP-203 Data Engineering on Microsoft Azure, and DP-100 Designing & Implementing a Data Science Solution on Azure.

  • Big Data training requires a holistic approach and a change to regular working practices. A number of tools are available for working with Big Data, many of them open source and Linux-based. This program covers the fundamentals of Big Data, including its place in the historical IT context, the tools available for working with Big Data, the Big Data stack, and finally, an in-depth look at Apache Hadoop. Our Big Data online training program is designed for IT engineers, programmers, and DBAs who work with or are interested in Big Data, as well as business decision makers interested in implementing or managing Big Data systems.

  • DP-100 Designing & Implementing a Data Science Solution on Azure teaches students how to leverage their existing knowledge of Python and machine learning to manage data ingestion and preparation, model training and deployment, and machine learning solution monitoring in Microsoft Azure.

  • DP-203 Data Engineering on Microsoft Azure offers an opportunity to demonstrate expertise in integrating, transforming, and consolidating data from various structured and unstructured data systems into structures suitable for building analytics solutions that use Microsoft Azure data services.

  • DP-900 Microsoft Azure Data Fundamentals is intended for candidates who want to start working with data in the cloud, gain basic skills in cloud data services, and build foundational knowledge of cloud data services in Microsoft Azure.

This program is self-paced. Self-paced programs create a unique learning experience that allows students to learn independently and at a pace that best suits them.


Certification

This program fully prepares students to take the Microsoft DP-900, DP-203, & DP-100 certification exams.

The certification exams are not a requirement for graduation. Vendor certifications are at the student’s expense. Vouchers may be available depending on the student’s funding and financial aid.


Tuition: $3,997

Duration: 138 Hours (133 Hours + 5 Hours Virtual Practice Lab)

Includes e-books, virtual practice labs, and certification exam review questions.

Prerequisites: Basic computer skills and familiarity with the internet

Students have full online access to the program for 1 year.

To learn more about ETI’s tuition and financial aid options, click here.


Course Outline

Big Data

66 hours

    • Big Data Fundamentals

    • Big Data Interpretation

    • The Big Data Technology Wave

    • Big Data Opportunities and Challenges

    • Apache Hadoop

    • MapReduce Essentials

    • The Marketing Perspective

    • The Engineering Perspective

    • The Legal Perspective

    • The Sales Perspective

    • The Strategic Planning Perspective

    • The Corporate Leadership Perspective

    • Ecosystem for Hadoop

    • Installation of Hadoop

    • Data Repository with HDFS and HBase

    • Data Repository with Flume

    • Data Repository with Sqoop

    • Data Refinery with YARN and MapReduce

    • Data Factory with Hive

    • Data Factory with Pig

    • Data Factory with Oozie and Hue

    • Data Flow for the Hadoop Ecosystem

    • Designing Hadoop Clusters

    • Hadoop in the Cloud

    • Deploying Hadoop Clusters

    • Hadoop Cluster Availability

    • Securing Hadoop Clusters

    • Operating Hadoop Clusters

    • Stabilizing Hadoop Clusters

    • Capacity Management for Hadoop Clusters

    • Performance Tuning of Hadoop Clusters

    • Cloudera Manager and Hadoop Clusters

    • Apache Kafka Operations

    • Apache Kafka Development

    • Kafka Integration with Spark

    • Kafka Integration with Storm

    • Clustering with Kafka

    • Kafka Real-time Applications

    • Apache Storm Introduction – Architecture and Installation

    • Apache Storm Introduction – API and Topology

    • Hadoop Solution

    • Analyzing, Querying, and Extracting Big Data

    • Deployment and Configuration

    • Query and Data Management

    • Managing Big Data Operations

    • Quality and Security of Big Data Operations

    • Introduction to Hadoop

    • Introduction to Data Modeling in Hadoop

    • Introduction to Apache Spark

    • Apache Spark SQL

    • Structured Streaming

    • Spark Monitoring and Tuning

    • Spark Security

    • Hadoop Distributed File System

    • Hadoop Clusters

    • Apache Hadoop on Amazon EMR

    • Hadoop Ranger

    • Hadoop Maintenance and Distributions

    • Accessing Data with Spark: An Introduction to Spark

    • Accessing Data with Spark: Data Analysis Using the Spark DataFrame API

    • Accessing Data with Spark: Data Analysis using Spark SQL

Microsoft Azure Exam Prep

  • DP-900 Microsoft Azure Data Fundamentals: 18 hours + 5 hours virtual practice lab

    • Data Workloads

    • Data Analytics

    • Relational Data Workloads

    • Relational Data Management

    • Provisioning and Configuring Relational Data Services

    • Azure SQL Querying Techniques

    • Non-Relational Data Workloads

    • Azure Analytics Workloads

    • Modern Data Warehousing

    • Azure Data Ingestion & Processing

    • Azure Data Visualization

  • DP-203 Data Engineering on Microsoft Azure: 26 hours

    • Storage Accounts

    • Designing Data Storage Structures

    • Data Partitioning

    • Designing the Serving Layer

    • Physical Data Storage Structure

    • Logical Data Structures

    • The Serving Layer

    • Data Policies & Standards

    • Securing Data Access

    • Securing Data

    • Data Lake Storage

    • Data Flow Transformations

    • Data Factory

    • Databricks

    • Databricks Processing

    • Stream Analytics

    • Synapse Analytics

    • Data Storage Monitoring

    • Data Process Monitoring

    • Data Solution Optimization

  • DP-100 Designing & Implementing a Data Science Solution on Azure: 23 hours

    • Machine Learning

    • ML Services

    • ML Regression Models

    • ML Classification Models

    • ML Clustering Models

    • Project Jupyter & Notebooks

    • Azure ML Workspaces

    • Azure Data Platform Services

    • Azure Storage Accounts

    • Storage Strategy

    • Azure Data Factory

    • Non-relational Data Stores

    • ML Data Stores & Compute

    • ML Orchestration & Deployment

    • Model Features & Differential Privacy

    • ML Model Monitoring

    • Azure Data Storage Monitoring

    • Data Process Monitoring

    • Data Solution Optimization

    • High Availability & Disaster Recovery