Course Outline
Introduction, Objectives, and Migration Strategy
- Course goals, participant profile alignment, and success criteria
- High-level migration approaches and risk considerations
- Setting up workspaces, repositories, and lab datasets
Day 1 — Migration Fundamentals and Architecture
- Lakehouse concepts, Delta Lake overview, and Databricks architecture
- SMP vs MPP differences and implications for migration
- Medallion (Bronze→Silver→Gold) design and Unity Catalog overview
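A minimal Delta SQL sketch of the Bronze→Silver flow named above; schema, table, and column names here are illustrative, not part of the course materials:

```sql
-- Bronze: raw landing table (illustrative names)
CREATE TABLE IF NOT EXISTS bronze.orders_raw (
  order_id  BIGINT,
  payload   STRING,
  ingest_ts TIMESTAMP
) USING DELTA;

-- Silver: cleaned, typed view of the same data
CREATE TABLE IF NOT EXISTS silver.orders
USING DELTA
AS SELECT
  order_id,
  CAST(get_json_object(payload, '$.amount') AS DECIMAL(10,2)) AS amount,
  ingest_ts
FROM bronze.orders_raw
WHERE order_id IS NOT NULL;
```

The Gold layer would follow the same pattern, aggregating Silver tables into consumption-ready marts.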
Day 1 Lab — Translating a Stored Procedure
- Hands-on migration of a sample stored procedure to a notebook
- Mapping temp tables and cursors to DataFrame transformations
- Validation and comparison with original output
Day 2 — Advanced Delta Lake & Incremental Loading
- ACID transactions, commit logs, versioning, and time travel
- Auto Loader, MERGE INTO patterns, upserts, and schema evolution
- OPTIMIZE, VACUUM, Z-ORDER, partitioning, and storage tuning
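The upsert and maintenance commands listed above can be sketched in Delta SQL as follows (table and column names are illustrative assumptions):

```sql
-- Upsert an incoming change batch into a Silver table
MERGE INTO silver.customers AS t
USING updates_batch AS s
  ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Compact small files and co-locate rows on a frequently filtered column
OPTIMIZE silver.customers ZORDER BY (customer_id);

-- Remove data files no longer referenced by the table's transaction log
-- (default retention window is 7 days)
VACUUM silver.customers;
```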
Day 2 Lab — Incremental Ingestion & Optimization
- Implementing Auto Loader ingestion and MERGE workflows
- Applying OPTIMIZE, Z-ORDER, and VACUUM; validating results
- Measuring read/write performance improvements
Day 3 — SQL in Databricks, Performance & Debugging
- Analytical SQL features: window functions, higher-order functions, JSON/array handling
- Reading the Spark UI, DAGs, shuffles, stages, tasks, and bottleneck diagnosis
- Query tuning patterns: broadcast joins, hints, caching, and spill reduction
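Two of the patterns above, a window function and a broadcast join hint, sketched in Spark SQL against an assumed orders schema:

```sql
-- Latest order per customer via ROW_NUMBER (illustrative schema)
SELECT *
FROM (
  SELECT o.*,
         ROW_NUMBER() OVER (PARTITION BY customer_id
                            ORDER BY order_ts DESC) AS rn
  FROM silver.orders o
)
WHERE rn = 1;

-- Broadcast a small dimension table to avoid a shuffle join
SELECT /*+ BROADCAST(d) */ f.*, d.region
FROM silver.orders f
JOIN dim_region d ON f.region_id = d.region_id;
```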
Day 3 Lab — SQL Refactoring & Performance Tuning
- Refactor a heavy SQL process into optimized Spark SQL
- Use Spark UI traces to identify and fix skew and shuffle issues
- Benchmark before/after and document tuning steps
Day 4 — Tactical PySpark: Replacing Procedural Logic
- Spark execution model: driver, executors, lazy evaluation, and partitioning strategies
- Transforming loops and cursors into vectorized DataFrame operations
- Modularization, UDFs/pandas UDFs, widgets, and reusable libraries
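The loop-to-set-based shift above can be illustrated in plain Python (sample data is invented for the sketch); the comment shows the PySpark form the course targets:

```python
# Cursor-style, row-by-row aggregation: the procedural pattern being migrated.
rows = [
    {"customer": "a", "amount": 10.0},
    {"customer": "b", "amount": 5.0},
    {"customer": "a", "amount": 2.5},
]

totals_loop = {}
for row in rows:  # one row at a time, like a T-SQL cursor
    key = row["customer"]
    totals_loop[key] = totals_loop.get(key, 0.0) + row["amount"]

# Set-based equivalent: one declarative aggregation over the whole dataset.
totals_set = {
    c: sum(r["amount"] for r in rows if r["customer"] == c)
    for c in {r["customer"] for r in rows}
}

# In PySpark the same aggregation is a single parallel statement:
# df.groupBy("customer").agg(F.sum("amount").alias("total"))
```

The set-based form states *what* to compute rather than *how* to iterate, which is what lets Spark parallelize it across executors.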
Day 4 Lab — Refactoring Procedural Scripts
- Refactor a procedural ETL script into modular PySpark notebooks
- Introduce parametrization, unit-style tests, and reusable functions
- Code review and best-practice checklist application
Day 5 — Orchestration, End-to-End Pipeline & Best Practices
- Databricks Workflows: job design, task dependencies, triggers, and error handling
- Designing incremental Medallion pipelines with quality rules and schema validation
- Integration with Git (GitHub/Azure DevOps), CI, and testing strategies for PySpark logic
Day 5 Lab — Build a Complete End-to-End Pipeline
- Assemble Bronze→Silver→Gold pipeline orchestrated with Workflows
- Implement logging, auditing, retries, and automated validations
- Run full pipeline, validate outputs, and prepare deployment notes
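A task graph like the one assembled in this lab might be expressed, in Databricks Jobs API 2.1 style, roughly as below; job name and notebook paths are placeholders:

```json
{
  "name": "medallion_pipeline",
  "tasks": [
    {"task_key": "bronze_ingest",
     "notebook_task": {"notebook_path": "/Repos/demo/bronze_ingest"}},
    {"task_key": "silver_transform",
     "depends_on": [{"task_key": "bronze_ingest"}],
     "notebook_task": {"notebook_path": "/Repos/demo/silver_transform"}},
    {"task_key": "gold_publish",
     "depends_on": [{"task_key": "silver_transform"}],
     "notebook_task": {"notebook_path": "/Repos/demo/gold_publish"}}
  ],
  "max_concurrent_runs": 1
}
```

Each `depends_on` edge is what Workflows uses to sequence tasks and to scope retries and failure handling.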
Operationalization, Governance, and Production Readiness
- Unity Catalog governance, lineage, and access controls best practices
- Cost, cluster sizing, autoscaling, and job concurrency patterns
- Deployment checklists, rollback strategies, and runbook creation
Final Review, Knowledge Transfer, and Next Steps
- Participant presentations of migration work and lessons learned
- Gap analysis, recommended follow-up activities, and training materials handoff
- References, further learning paths, and support options
Requirements
- An understanding of data engineering concepts
- Experience with SQL and stored procedures (Synapse / SQL Server)
- Familiarity with ETL orchestration concepts (ADF or similar)
Audience
- Technology managers with a data engineering background
- Data engineers transitioning procedural OLAP logic to Lakehouse patterns
- Platform engineers responsible for Databricks adoption
Custom Corporate Training
Training solutions designed exclusively for businesses.
- Customized Content: We adapt the syllabus and practical exercises to the real goals and needs of your project.
- Flexible Schedule: Dates and times adapted to your team's agenda.
- Format: Online (live), In-company (at your offices), or Hybrid.
Price per private group, online live training, starting from 8000 € + VAT*
Contact us for an exact quote and to hear our latest promotions
Testimonials
All the topics covered, even though many were moved through quickly, give us an idea of what we will need to explore in more depth. I also liked that we got some hands-on practice, although I still think the course deserves more of it.
Sandra Mariela Lopez Bernal - Kueski
Course - Databricks
Machine Translated