Found 359 repositories (showing 30)
Databricks Data Engineer Associate Certification Lab: End-to-end hands-on project covering Auto Loader, Medallion Architecture, SCD Type 2, Unity Catalog governance, and Databricks Jobs orchestration. Build a production-grade pipeline on Databricks Free Edition.
Azure-Samples
Azure SQL and Databricks samples and best practices for loading data quickly and efficiently
lunabrain-ai
Loads the Dolly V2 12B model (databricks/dolly-v2-12b) with the `transformers` library in 8-bit quantized mode.
Databricks Real-Time Fintech Monitoring Pipeline: Hands-on lab to build a streaming fraud detection system using Auto Loader, watermarked deduplication, stream-static joins, and windowed rules engines in Databricks. Covers dual-SLA architecture for real-time alerts and batch compliance reporting.
patrickverol
In this project I develop a data lakehouse on the Snowflake and Databricks platforms, performing data transformations with DBT and using Airbyte for data loading.
Paras-Gadhiya
In this project, we use Azure cloud services to perform data engineering operations (ingestion, transformation, analysis, load) on the Formula 1 racing dataset available from the Ergast developer API, which provides both CSV and JSON (single and split) files. We handle both incremental-load and full-load approaches for these files, manage the notebook workflow via Azure Data Factory pipelines and Azure Databricks itself, and schedule the pipelines with a tumbling window trigger so they execute in an orderly manner.
tushar-hatwar
Extracting, loading, and transforming data using Python, Azure Databricks, and Azure Blob Storage
Example project demonstrating incremental data loading using Delta Lake in Databricks, based on a use case from the DP-700 Microsoft Fabric Data Engineer certification exam.
ayush9892
Data engineering project on supply-chain ETL: a dynamic ADF pipeline ingests both full-load and incremental-load data from SQL Server, then transforms the datasets in Databricks following the medallion architecture.
Individual Project #3: Databricks ETL (Extract Transform Load) Pipeline
victor-antoniassi
Enterprise-grade ingestion blueprint for Postgres to Databricks powered by dlt. Features dual-mode operation (Full Load + CDC Load) and robust CI/CD via Databricks Asset Bundles.
Vivi-Figueiredo
ETL (Extract, Transform, Load) process in Databricks, with data extraction via API.
lbruand-db
Load the IGN BD TOPO database (French national topographic dataset) into Databricks Delta tables with geometry support.
0HugoHu
Individual Project #3: Databricks ETL (Extract Transform Load) Pipeline
tushar-hatwar
Extracting, loading, and transforming data using Python, Databricks, and Snowflake
ShreevaniRao
Azure projects: an end-to-end data engineering project with medallion architecture using Azure Data Factory and Azure Databricks, plus an Azure serverless/logical data warehouse using Azure Synapse Analytics to demo CETAS, data modeling, incremental loading, CDC, and SQL monitoring of the data processing, connected to Power BI.
cjj198909
A load balancer proxy for Claude Code that distributes requests across multiple Databricks Claude endpoints
tam159
Delta Lake common libraries to ingest, process, and load data into the Databricks lakehouse with Spark jobs
cleberzumba
Importing unstructured data in Databricks Spark with PySpark
JessePepple
This Azure Data Engineering project ingests Netflix datasets using ADF and Databricks, stores raw data in Azure Data Lake, transforms it in Databricks, and loads the cleaned data into Azure Databricks SQL Warehouse and Synapse Analytics.
iamabhaydawar
Event-driven data pipeline on Databricks for real-time e-commerce data processing with incremental loading, validation, enrichment, and Delta Lake operations
Yaswanthv5
An end-to-end e-commerce data pipeline for batch and streaming: extracting data from APIs and external data sources, transforming it with Databricks, and loading it into GCP BigQuery for KPIs and dashboards.
Suvajit-Bhattacharjee
An end-to-end data engineering pipeline built on Databricks, demonstrating the Medallion Architecture (Bronze, Silver, Gold layers). It processes raw e-commerce sales data using PySpark, Delta Lake, and Auto Loader for reliable, incremental ETL into a structured data lakehouse ready for analytics.
PranovSarath
An enterprise-level, scalable, metadata-driven ETL framework in Azure using Azure Data Factory V2, Azure Data Lake Gen2, Databricks, Azure SQL DB, and Synapse, which can perform incremental and full data loads from any given source for any number of entities. All configuration is maintained in a single Azure SQL DB acting as the control database.
snowplow-incubator
Snowplow Databricks Loader
gregorosaurus
ML Load forecasting notebooks using Databricks
No description available
raybags-dev
Application that creates and handles pipelines which load, clean, enrich, normalize, and upload cleaned datasets to Databricks