Found 359 repositories (showing 30)
Databricks Data Engineer Associate Certification Lab: End-to-end hands-on project covering Auto Loader, Medallion Architecture, SCD Type 2, Unity Catalog governance, and Databricks Jobs orchestration. Build a production-grade pipeline on Databricks Free Edition.
Azure-Samples
Azure SQL and Databricks samples and best practices for loading data quickly and efficiently
lunabrain-ai
Loads the Dolly V2 12B model (databricks/dolly-v2-12b) with the `transformers` library in 8-bit quantized mode.
Databricks Real-Time Fintech Monitoring Pipeline: Hands-on lab to build a streaming fraud detection system using Auto Loader, watermarked deduplication, stream-static joins, and windowed rules engines in Databricks. Covers dual-SLA architecture for real-time alerts and batch compliance reporting.
patrickverol
In this project I develop a data lakehouse on the Snowflake and Databricks platforms, performing data transformations with DBT and using Airbyte for data loading.
Paras-Gadhiya
In this project, we use Azure cloud services to perform data engineering operations (ingestion, transformation, analysis, load) on the Formula 1 racing dataset available from the Ergast developer API, which provides both CSV and JSON (single and split) files. We handle both incremental-load and full-load approaches for these files, manage the notebook workflow via Azure Data Factory pipelines and Azure Databricks itself, and schedule the pipelines with a tumbling window trigger so they execute in an orderly manner.
tushar-hatwar
Extracting, loading, and transforming data using Python, Azure Databricks, and Azure Blob Storage
Example project demonstrating incremental data loading using Delta Lake in Databricks, based on a use case from the DP-700 Microsoft Fabric Data Engineer certification exam.
ayush9892
Data engineering project on supply-chain ETL: a dynamic ADF pipeline ingests both full-load and incremental-load data from SQL Server, then transforms the datasets in Databricks following the medallion architecture.
Individual Project #3: Databricks ETL (Extract Transform Load) Pipeline
victor-antoniassi
Enterprise-grade ingestion blueprint for Postgres to Databricks powered by dlt. Features dual-mode operation (Full Load + CDC Load) and robust CI/CD via Databricks Asset Bundles.
Vivi-Figueiredo
ETL (Extract, Transform, Load) process in Databricks, with data extraction via API.
lbruand-db
Load the IGN BD TOPO database (French national topographic dataset) into Databricks Delta tables with geometry support.
0HugoHu
Individual Project #3: Databricks ETL (Extract Transform Load) Pipeline
tushar-hatwar
Extracting, loading, and transforming data using Python, Databricks, and Snowflake
ShreevaniRao
Azure projects: an end-to-end data engineering project with medallion architecture using Azure Data Factory and Azure Databricks, plus an Azure serverless/logical data warehouse using Azure Synapse Analytics to demo CETAS, data modeling, incremental loading, CDC, and SQL monitoring of the data processing, connected to Power BI.
cjj198909
A load balancer proxy for Claude Code that distributes requests across multiple Databricks Claude endpoints
tam159
Delta Lake common libraries to ingest, process, and load data into the Databricks lakehouse with Spark jobs
cleberzumba
Importing unstructured data in Databricks Spark with PySpark
JessePepple
This Azure Data Engineering project ingests Netflix datasets using ADF and Databricks, stores raw data in Azure Data Lake, transforms it in Databricks, and loads the cleaned data into Azure Databricks SQL Warehouse and Synapse Analytics.
iamabhaydawar
Event-driven data pipeline on Databricks for real-time e-commerce data processing with incremental loading, validation, enrichment, and Delta Lake operations
Yaswanthv5
An end-to-end e-commerce data pipeline for batch and streaming: extracting data from APIs and external data sources, transforming it with Databricks, and loading it into GCP BigQuery for KPIs and dashboards.
Suvajit-Bhattacharjee
An end-to-end data engineering pipeline built on Databricks, demonstrating the Medallion Architecture (Bronze, Silver, Gold layers). It processes raw e-commerce sales data using PySpark, Delta Lake, and Auto Loader for reliable, incremental ETL into a structured data lakehouse ready for analytics.
PranovSarath
An enterprise-level, scalable, metadata-driven ETL framework in Azure using Azure Data Factory V2, Azure Data Lake Gen2, Databricks, Azure SQL DB, and Synapse, which can perform incremental and full data loads from any given source for any number of entities. All configuration is maintained in a single Azure SQL DB acting as the control database.
snowplow-incubator
Snowplow Databricks Loader
gregorosaurus
ML Load forecasting notebooks using Databricks
No description available
raybags-dev
Application that creates and handles pipelines which load, clean, enrich, normalize, and upload cleaned datasets to Databricks