
Advanced Databricks Training


Big data made easy with Azure Databricks

Big Data processing is being democratised. Tools such as Azure Databricks mean you no longer need to be a Java expert to be a Big Data Engineer. Databricks has made your life much easier! But while it is easier, there is still a lot to learn, and knowing where to start can be daunting.

Too often, training courses are academic, teaching theory rather than application. We have created an applied Azure Databricks course, built around demand and the real-world problems faced by our customers. It will teach you how to implement different scenarios in Databricks, but most importantly it will tell you why, when to implement them and when not to.

This course is designed to take a data professional from zero to hero in just 3 days. You will leave with all the skills you need to get started on your Big Data journey. You will learn by experimentation: this is a lab-heavy training session. If you are starting a new project and want to know whether Databricks is suitable for your problem, we also offer tailored training around your problem domain.

The course will be delivered by Terry McCann, Microsoft MVP. Terry is recognised for his ability to convert deep technical material into bite-sized, understandable chunks.

Cost

On-site delivery: £1,650 per delegate (minimum 8 delegates)
Tailored course: POA


Very good and understandable explanations of quite complex stuff. A very good kickoff to get started with Databricks. Great combination of theory and real life experiences.
— Previous course attendee

Agenda

(A full agenda is available upon request.) This course is evolving with the upcoming release of Spark 3.0.

Introduction

General introduction

  • Engineering vs Data Science

Intro to Big Data Processing

  • Introduction to Big Data Processing: why we do what we do
  • The skills required
  • Introduction to Spark
  • Introduction to Azure Databricks

Exploring Azure Databricks

  • Getting set up
  • Exploring Azure Databricks
 

The languages

  • The languages (Scala/Python/R/Java)
  • Introduction to Scala
  • Introduction to PySpark
  • PySpark deep dive (see the short sketch after this list)
  • Working with the additional Spark APIs
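To give a flavour of the PySpark sessions, here is a minimal sketch of reading a CSV file into a DataFrame and applying a simple transformation. The file path and column names are illustrative assumptions, not course material.

```python
# Minimal PySpark sketch: read a CSV into a DataFrame and apply a simple
# transformation. The path and column names are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# In a Databricks notebook a SparkSession called `spark` already exists;
# getOrCreate() simply returns it.
spark = SparkSession.builder.getOrCreate()

# Hypothetical sales file mounted into DBFS
df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/mnt/training/sales.csv"))

# A typical first transformation: filter, derive a column, aggregate
summary = (df
           .filter(F.col("quantity") > 0)
           .withColumn("revenue", F.col("quantity") * F.col("unit_price"))
           .groupBy("country")
           .agg(F.sum("revenue").alias("total_revenue")))

summary.show()
```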

Data Engineering

Managing Databricks

  • Managing Secrets (see the sketch after this list)
  • Orchestrating Pipelines
  • Troubleshooting Query Performance
  • Source Controlling Notebooks
  • Cluster Sizing
  • Installing packages on a single cluster or all clusters
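As a taste of the secrets topic, the sketch below reads a credential from a Databricks secret scope instead of hard-coding it in a notebook. The scope, key and storage account names are hypothetical.

```python
# dbutils and spark are provided automatically in a Databricks notebook.
# The scope, key and storage account names below are hypothetical.
storage_key = dbutils.secrets.get(scope="training-scope", key="storage-account-key")

# Use the secret to configure access to a Blob Storage account
spark.conf.set(
    "fs.azure.account.key.mystorageaccount.blob.core.windows.net",
    storage_key)

# Databricks redacts secret values in notebook output, so printing the key
# shows [REDACTED] rather than the value itself.
print(storage_key)
```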

Data Engineering

  • Cloud ETL Patterns (see the load, transform and store sketch after this list)
  • Design patterns
  • Loading Data
  • Schema Management
  • Transforming Data
  • Storing Data
  • Managing Lakes
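The sketch below shows the load, transform and store pattern at its simplest, assuming a raw drop zone and a curated zone mounted into DBFS. The paths and column names are made up for illustration.

```python
# spark is provided automatically in a Databricks notebook.
# Paths, zones and column names are illustrative assumptions.
from pyspark.sql import functions as F

# Load: raw CSV files landed in a mounted blob container
raw = (spark.read
       .option("header", "true")
       .csv("/mnt/raw/orders/*.csv"))

# Transform: enforce types, derive columns, drop rows we cannot use
clean = (raw
         .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
         .withColumn("amount", F.col("amount").cast("double"))
         .dropna(subset=["order_id", "order_date"]))

# Store: write to the curated zone of the lake, partitioned for later reads
(clean.write
 .mode("overwrite")
 .partitionBy("order_date")
 .parquet("/mnt/curated/orders"))
```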

Data Factory Data Flows

  • Creating Data Flows
  • Execution Comparison

Data Science

Data Science

  • Introduction to Data Science
  • Batch vs interactive machine learning
  • Python for machine learning
  • How to train a model
  • Enriching our existing data with batch machine learning (see the sketch after this list)
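To show what batch enrichment can look like, the sketch below trains a small scikit-learn model interactively and then applies it across a Spark DataFrame to score every row. The data, column names and model choice are all hypothetical.

```python
# Hypothetical example: train a small scikit-learn model, then use it to
# enrich a Spark DataFrame in batch. spark is provided by the notebook.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType

# Interactive step: train on a small pandas sample
sample = pd.DataFrame({
    "tenure_months": [1, 3, 24, 36, 2, 48],
    "churned":       [1, 1, 0, 0, 1, 0],
})
model = LogisticRegression().fit(sample[["tenure_months"]], sample["churned"])

# Batch step: broadcast the fitted model and score a Spark DataFrame
model_bc = spark.sparkContext.broadcast(model)

@F.udf(returnType=DoubleType())
def churn_probability(tenure_months):
    return float(model_bc.value.predict_proba([[tenure_months]])[0][1])

customers = spark.createDataFrame(
    [("c1", 5), ("c2", 40)], ["customer_id", "tenure_months"])

enriched = customers.withColumn(
    "churn_probability", churn_probability("tenure_months"))
enriched.show()
```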

Spark ML

  • What is SparkML
  • SparkML components
  • Creating a regression model in SparkML (see the pipeline sketch after this list)
  • Creating a classification model in SparkML
  • Tuning models at scale
  • Persisting models & retraining
  • Model deployment scenarios
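As an example of the SparkML components working together, here is a minimal regression pipeline. The training data and model path are made up for illustration.

```python
# Minimal SparkML sketch: assemble features and fit a linear regression as a
# Pipeline, then persist the fitted model. spark is provided by the notebook.
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

# Tiny, made-up training set
train = spark.createDataFrame(
    [(1.0, 2.0, 10.0), (2.0, 1.0, 12.0), (3.0, 5.0, 20.0), (4.0, 3.0, 22.0)],
    ["bedrooms", "distance_to_city", "price"])

assembler = VectorAssembler(
    inputCols=["bedrooms", "distance_to_city"], outputCol="features")
lr = LinearRegression(featuresCol="features", labelCol="price")

pipeline = Pipeline(stages=[assembler, lr])
model = pipeline.fit(train)

# Persist the fitted pipeline so it can be reloaded for scoring or retraining
model.write().overwrite().save("/mnt/models/house-price-lr")

model.transform(train).select("features", "price", "prediction").show()
```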

Databricks Delta

Databricks Delta Tables

  • Introduction to Delta: what it is and how it works
  • Data lake management
  • Problems with Hadoop-based lakes
  • Creating a Delta Table (see the sketch after this list)
  • The Transaction Log
  • Managing Schema change
  • Time travelling
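Below is a minimal sketch of the Delta workflow covered here: create a table, append to it, then use time travel to read an earlier version and inspect the transaction log. The path is an illustrative assumption.

```python
# spark is provided automatically in a Databricks notebook.
# The Delta path below is an illustrative assumption.
delta_path = "/mnt/delta/customers"

# Create the table by writing a DataFrame in Delta format (version 0)
spark.createDataFrame(
    [("c1", "UK"), ("c2", "DE")], ["customer_id", "country"]
).write.format("delta").mode("overwrite").save(delta_path)

# A second write becomes a new version in the transaction log (version 1)
spark.createDataFrame(
    [("c3", "FR")], ["customer_id", "country"]
).write.format("delta").mode("append").save(delta_path)

# Time travel: read the table as it was at version 0
original = (spark.read
            .format("delta")
            .option("versionAsOf", 0)
            .load(delta_path))
original.show()

# Inspect the table history recorded in the transaction log
spark.sql(f"DESCRIBE HISTORY delta.`{delta_path}`").show(truncate=False)
```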

Bring it all back together

  • How this all fits into a wider architecture
  • Projects we have worked on
  • Managing Databricks in production
  • Deploying with Azure DevOps

Labs

  • Getting set up (Building a new instance, getting connected, creating a cluster)
  • Creating all the required assets.
  • Running a notebook
  • An introduction to the key packages we will be working with.
  • Cleaning data
  • Transforming data
  • Creating a notebook to move data from Blob Storage and clean it up.
  • Scheduling a notebook to run with Azure Data Factory
  • Creating a streaming application (a minimal streaming sketch follows this list)
  • Creating a machine learning model
  • Deploying a machine learning model
  • Reading and enriching a stream
  • Databricks Delta
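In the spirit of the streaming labs, here is a minimal Structured Streaming sketch that reads JSON files as they arrive, enriches them and writes the stream out as a Delta table. The schema, paths and checkpoint location are all assumptions.

```python
# spark is provided automatically in a Databricks notebook.
# Schema, landing path, checkpoint and output paths are assumptions.
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

schema = StructType([
    StructField("device_id", StringType()),
    StructField("temperature", DoubleType()),
])

# Read a stream of JSON files as they land in a mounted directory
events = (spark.readStream
          .schema(schema)
          .json("/mnt/landing/events/"))

# Enrich the stream with a derived column
enriched = events.withColumn("is_hot", F.col("temperature") > 30)

# Write the enriched stream to a Delta table, tracking progress in a checkpoint
query = (enriched.writeStream
         .format("delta")
         .option("checkpointLocation", "/mnt/checkpoints/events")
         .outputMode("append")
         .start("/mnt/delta/events"))
```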

If you would like to enquire about training, please complete the form below and we will get back to you.