Found 24 repositories (showing 24)
GoogleCloudPlatform
No description available
connor-hitchcock
During this year-long university project course I worked in a team of eight to develop a web application aimed at preventing Kiwis from throwing away one third of their food. By providing food companies with an e-commerce platform to sell products close to expiry to cost-conscious individuals, our team hopes to address this issue.

The project ran within a Scrum and agile framework, in close communication with the product owner to develop the application he had envisioned. It consisted of six sprints: at the start of each sprint we planned which stories to take on, split them into tasks in substantial detail, and logged time and completion in Jira. To keep everyone on the same page we held two standups a week with our scrum master. We also used a range of workflow strategies to improve code quality and minimise risk, including code reviews before finishing each task, task branching to prevent merge conflicts, substantial automated unit and integration testing with JUnit, and automated acceptance testing with Cucumber. To keep team members accountable for their mistakes, we created a wiki covering strict code styles, decision-making policies, a definition of done, a yellow-card policy, a Git policy, user manuals, and our testing procedures.

Our technology stack followed a client-server pattern: Vue.js on the frontend, Spring Boot on the backend, RESTful APIs connecting the two, MariaDB for external data storage, and Gradle, SonarQube, npm, Git, and a CI/CD pipeline to improve code quality and enable seamless collaboration within the team. I took on a leadership role by helping teammates solve complex problems, completing admin tasks such as setting up the CI/CD pipeline and Cucumber, and acting as a bridge between our team and the product owner and scrum team.
This project is still ongoing and will be finished in October.
An advanced, open-source framework for retrieving, processing, and visualizing diverse cloud data. Built with Python, Docker, and integrated CI/CD workflows, this solution offers RESTful API integration, high-performance data analytics, and interactive visualization capabilities for scalable cloud data management.
Srilekha-1106
Implemented Azure Databricks for real-time data processing and governance using Unity Catalog, Spark Structured Streaming, Delta Lake features, Medallion Architecture, and end-to-end CI/CD pipelines. Focused on incremental loading, compute cluster management, maintaining data quality, and creating workflows.
PreethamVA
This MLOps project showcases an end-to-end pipeline for vehicle insurance data, covering data processing, model training, deployment, and CI/CD automation. It highlights real-world ML workflows using modern tools and best practices, making it ideal for recruiters and developers exploring production-ready ML systems.
daminasaws
A toolkit of pre-built automation scripts and tools for common tasks such as file processing, data manipulation, system administration, and CI/CD workflows, built with Python or Bash scripting.
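A minimal sketch of what two helpers in such a toolkit might look like, assuming a Python implementation; the function names (`batch_rename`, `filter_rows`) are illustrative, not taken from the repository.

```python
import csv
import io
from pathlib import Path


def batch_rename(directory: Path, old_ext: str, new_ext: str) -> list[Path]:
    """Rename every file matching *old_ext in `directory` to use new_ext.

    Extensions include the leading dot, e.g. ".txt" -> ".md".
    Returns the list of new paths.
    """
    renamed = []
    for path in directory.glob(f"*{old_ext}"):
        target = path.with_suffix(new_ext)
        path.rename(target)
        renamed.append(target)
    return renamed


def filter_rows(csv_text: str, column: str, value: str) -> list[dict]:
    """Return the rows of a CSV document whose `column` equals `value`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[column] == value]
```

Keeping each helper dependency-free (stdlib only) is what makes this kind of toolkit easy to drop into a CI/CD job.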
No description available
kumarm-foxtel
No description available
victorgmrqs
No description available
chungtseng96
No description available
GDBSD
Setting up a CI/CD pipeline for data-processing workflow
DavidEMDias
A complete data pipeline demonstrating cloud storage, data processing, data transformations, CI/CD workflows, and visualizations. Provides a practical foundation for building data engineering and analytics solutions.
An automated, serverless bioinformatics pipeline designed for secure genomic data processing. This project demonstrates the integration of **Financial-grade DevOps (CI/CD, Guardrails)** into **Biotech data workflows**.
thunchanokbow
CI/CD pipelines are becoming an increasingly important part of data engineering. GitHub Actions and GitLab CI let us define workflows that automate that process.
etl-kenobi
A modular, reusable Data Quality validation framework designed for enterprise-scale data pipelines. Built on Azure with PySpark and Delta Lake, this project demonstrates CI/CD integration, batch ingestion workflows, and extensible DQ checks for production-ready data processing.
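The extensible-checks idea described above can be sketched framework-agnostically. The following is a plain-Python stand-in for the project's PySpark/Delta Lake implementation, with illustrative rule names (`not_null`, `in_range`) not taken from the repository.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class CheckResult:
    name: str
    passed: bool
    failed_rows: int


def not_null(column: str) -> Callable[[dict], bool]:
    """Row-level rule: the column must be present and non-empty."""
    return lambda row: row.get(column) not in (None, "")


def in_range(column: str, lo: float, hi: float) -> Callable[[dict], bool]:
    """Row-level rule: the column must parse as a number within [lo, hi]."""
    return lambda row: lo <= float(row[column]) <= hi


def run_checks(rows: Iterable[dict],
               checks: dict[str, Callable[[dict], bool]]) -> list[CheckResult]:
    """Apply every named rule to every row; a check passes only if no row fails."""
    rows = list(rows)
    results = []
    for name, rule in checks.items():
        failures = sum(1 for row in rows if not rule(row))
        results.append(CheckResult(name, failures == 0, failures))
    return results
```

New checks extend the registry dict without touching `run_checks`, which is the property that makes this pattern reusable across pipelines.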
kamal-marouane
Automated cloud-based data pipeline using Apache Spark and Kafka for large-scale cluster analysis. Infrastructure provisioned with Terraform, leveraging Google Cloud data-processing services for batch jobs and real-time streaming. Includes CI/CD workflows and monitoring for optimized performance.
codeSmithDave
Full-stack plant inventory system built with Next.js/React + ASP.NET Core 9. Features large CSV processing, paginated APIs, EF Core/SQL Server integration, and CI/CD workflows. Designed for scalable data management (1M+ records).
S-Delowar
Building a production-ready ETL pipeline with automated workflows, cloud integration, and CI/CD deployment. By leveraging Airflow, Docker, and AWS services, the pipeline ensures scalability, automation, and reliability for handling large-scale data processing tasks.
AliGaffarToksoy
Enterprise-grade real-time event and log processing pipeline designed to handle high-throughput data streams. Built with Kafka for scalable ingestion, OpenSearch (ELK) for indexing and visualization, Terraform for infrastructure automation, and Jenkins for CI/CD, enabling reliable, automated and observable data workflows.
shivkhurana
Automated data-processing pipeline designed to detect and redact PII (Personally Identifiable Information) from server logs using NLP (spaCy) and regex. Containerized with Docker and integrated into a GitHub Actions CI/CD workflow for automated compliance testing.
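A sketch of the regex half of that redaction step (the spaCy NER half, which would catch names, is omitted here); the patterns below are illustrative and deliberately simple, not the project's actual rules.

```python
import re

# Each label maps to a compiled pattern; matches are replaced by "[LABEL]".
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(line: str) -> str:
    """Replace every PII match in a log line with a typed placeholder."""
    for label, pattern in PATTERNS.items():
        line = pattern.sub(f"[{label}]", line)
    return line
```

Running `redact` line by line over a log stream keeps the pipeline memory-bounded, which matters when containerized with a small footprint.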
hector-en
This project demonstrates deploying a classification model using Azure DevOps, focusing on predicting customer license status. It covers the CI/CD pipeline setup, Docker containerization, and integration with Azure services for real-time data processing, enhancing operational workflows in the licensing domain.
aniruddhapal
A content-based movie recommender system built with scikit-learn and deployed as a live Streamlit app on Render. This project demonstrates an end-to-end MLOps workflow, including data processing pipelines, artifact optimization for a 512MB RAM limit, and robust deployment with CI/CD.
tahsinac
An NLP pipeline covering data processing, model training, and evaluation for text summarization with HuggingFace Transformers. Predictions are served via FastAPI, with a CI/CD workflow in GitHub Actions handling containerization, image pushes to Amazon ECR, and continuous deployment to EC2 for API serving.
ryanheng99
A full-stack data engineering project that ingests real-time Bitcoin price data from CoinGecko, processes it, trains an ARIMA model for forecasting, and serves predictions via a FastAPI endpoint. The entire workflow is automated using CI/CD with GitHub Actions and containerized with Docker.
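The forecasting step in a pipeline like this can be illustrated with a pure-Python AR(1) fit, a deliberately simplified stand-in for the project's ARIMA model (which would normally come from a library such as statsmodels).

```python
def ar1_fit(series: list[float]) -> tuple[float, float]:
    """Fit x[t] = a + b * x[t-1] by ordinary least squares."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    b = cov / var
    a = my - b * mx
    return a, b


def forecast(series: list[float], steps: int) -> list[float]:
    """Iterate the fitted recurrence forward `steps` times from the last value."""
    a, b = ar1_fit(series)
    out, last = [], series[-1]
    for _ in range(steps):
        last = a + b * last
        out.append(last)
    return out
```

On a perfectly linear series the fit recovers a slope of exactly one unit per step, which makes the recurrence easy to sanity-check before swapping in a real ARIMA model behind the same `forecast` interface.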