Found 147 repositories (showing 30)
longNguyen010203
A data engineering project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, dbt, Polars, and Docker. Data comes from Kaggle and the YouTube API.
aymane-maghouti
The project aims to automate the extraction of data from a YouTube channel, transform the data into a suitable format, and make it available for analysis through a Power BI dashboard. By following a structured ETL process, this project streamlines data retrieval, preparation, and visualization.
HashimGharip
This project provides a practical example of building production-ready data pipelines and working with real-world streaming data sources, offering valuable insights and techniques that can be directly applied in data engineering roles.
mahanta-mayur
This project aims to securely manage, streamline, and analyze structured and semi-structured YouTube video data based on video categories and trending metrics. In brief: data -> multiple S3 buckets on AWS -> data preprocessing ETL (Lambda function with S3 trigger) -> ETL join -> analysis bucket -> data analysis in AWS QuickSight.
prajwal-ns
No description available
snkpgithub
This project leverages AWS cloud services to ingest, process, and analyze YouTube trending video data. It implements a scalable ETL pipeline and data lake architecture, culminating in interactive dashboards for insights into video popularity trends across different regions.
hwf87
This is a data scraping project that sources data from the Houzz e-commerce platform, the CNN YouTube channel, and the TedTalk official website. The implementation uses the Apache Beam framework to build an ETL pipeline and write the results into an Elasticsearch database. The final step visualizes the crawler results using Kibana.
KirandeepMarala
In this project, we analyse T-Series YouTube channels by building complete ETL pipelines.
AmiraliKhalife
No description available
prathyyyyy
YouTube ETL pipeline project using PySpark and AWS
chinedumsunday
This project is an automated ETL (Extract, Transform, Load) pipeline that fetches trending video statistics from the YouTube Data API v3, processes the data, validates it, and stores it in a SQLite database.
shuklaank
No description available
thivsiv
No description available
Brian-McHugh
Create a North American database for YouTube top trending videos
charlie-n01r
ETL pipeline project using YouTube's API
Lashmanbala
No description available
carolyntw
This project focuses on automating the Extract, Transform, Load (ETL) process and data processing for YouTube analytics.
lokmanTech
End-to-end YouTube data analysis project covering ingestion, AWS data lake, ETL, scalability, and reporting with Python and AWS services.
sip-n-code
A project following DataWithBaraa's awesome DE project on YouTube: Building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.
docmhvr
Data engineering project using YouTube trending video statistics, with a data lake, data pipelines, and ETL jobs built on AWS services, plus a Tableau dashboard.
The aim of this project is to combine secure data management with insightful analysis of YouTube video categories and trends using an ETL pipeline.
dhe3ra-git
Production-style ETL pipeline built with Apache Airflow, PySpark, Docker, and YouTube API. This project demonstrates a real-world Data Engineering workflow using Bronze / Silver / Gold architecture.
sangeetha2402-ravichandran
End-to-end data analytics project on YouTube comments using Python, covering ETL, text cleaning, EDA, sentiment analysis, emoji analysis, word clouds, and engagement insights with visualizations.
lalitharavi98
An end-to-end data engineering project for analyzing YouTube trending videos using AWS services. This project leverages S3 for storage, Glue for ETL, Athena for querying, IAM Roles for security, QuickSight for visualization, and Lambda for automation.
Valen1oussama
This project showcases a robust data pipeline for extracting comments from YouTube videos using the YouTube Data API, processing the comments, and storing them in a MySQL database. What sets this project apart is its integration with Apache Airflow, which allows you to schedule and orchestrate the ETL (Extract, Transform, Load) process with ease.
vikramramanathan0908
This repository showcases an end-to-end data engineering project using AWS services (S3, Glue, Lambda, Athena, QuickSight) to process and analyze YouTube trending data. It covers automated ETL pipelines, data transformations, and insightful visualizations to uncover trends in video engagement and performance.
LucasAltazin
This project is a personal product & tech sandbox to turn my YouTube Music history into a real analytics product. I use it to practice end-to-end Product Owner work (user research, user stories, iterations, backlog) combined with data engineering and dashboarding (ETL, data model, BI).
ShreCodes2809
A cloud-based data pipeline for ingesting, processing, and analyzing large-scale YouTube data. This project leverages AWS services such as S3, Glue, Lambda, Athena, IAM, and QuickSight to build a scalable and serverless ETL system. It enables real-time data transformation, centralized storage, and interactive reporting through a BI dashboard.
VandanaBhumireddygari
This project focuses on securely managing, streamlining, and analyzing structured and semi-structured data from YouTube videos based on categories and trending metrics. The goal is to build a comprehensive ETL system to process and transform raw data into a usable format, store it in a centralized data lake, and scale the solutions.
PreethiThulasi
YouTube ETL Project