Found 7,132 repositories (showing 30)
A Full Stack ML (Machine Learning) Roadmap involves learning the necessary skills and technologies to become proficient in all aspects of machine learning, including data collection and preprocessing, model development, deployment, and maintenance.
omnata-labs
A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros.
MLD3
FlexIble Data-Driven pipeLinE – a preprocessing pipeline that transforms structured EHR data into feature vectors to be used with ML algorithms. https://doi.org/10.1093/jamia/ocaa139
rachellea
End-to-end Python CT volume preprocessing pipeline to convert raw DICOMs into clean 3D numpy arrays for ML. From paper Draelos et al. "Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes."
carlosrod723
An MQL5 EA for MetaTrader 5 featuring fractal-based liquidity sweeps, Fibonacci zones, order blocks, partial exits, trailing stops, and an optional ML LSTM model. Python scripts handle data preprocessing, training, and real-time probability signals to confirm trades.
sujithvarshan28
Diabetes Risk Prediction System using Machine Learning and React. The project performs clinical risk assessment based on health and lifestyle inputs. Features include data preprocessing, ML classification, and a React UI with age ranges, tooltips, and risk-based outputs.
r2llab
Parallel data preprocessing for NLP and ML.
PrachiDhiman5
Machine Learning project that predicts lifestyle-related health risks using data analysis and predictive modeling. Built an end-to-end ML pipeline including preprocessing, feature engineering, EDA, and models (regression, classification, clustering) using Python, Pandas, NumPy, and Scikit-learn.
SKawsar
No description available
Developed an end-to-end ML system on Azure to predict loan defaults, leveraging advanced data preprocessing, feature engineering, and machine learning models to optimize accuracy. This project includes a comprehensive suite of tools and techniques for robust financial risk assessment, deployed to enhance decision-making for high-risk exposures.
Hassanmahmood4
End-to-end ML project for predicting La Liga match winners. Includes data preprocessing, model training, and a web UI.
utkarshshukla2912
A complete pipeline for reading, preprocessing, and classifying EEG data using various statistical ML models and deep learning models.
gyrdym
Implementation of popular data preprocessing algorithms for machine learning.
Quantmetry
An easy way to define data preprocessing pipelines (similar to sklearn-pandas, but for Spark ML).
antonin-lfv
A complete No-Code Machine Learning platform built with Streamlit. Upload datasets, visualize data, and train models (Regression, SVM, K-Means, PCA) directly in the browser.
deaneeth
A production-grade MLOps pipeline for predicting telecom customer churn, featuring automated data preprocessing, ML model training, experiment tracking with MLflow, distributed training using PySpark, real-time inference via Kafka streaming, Airflow DAG orchestration, and Dockerized REST API deployment.
MuhammedSinanHQ
An end-to-end predictive maintenance project built on the NASA CMAPSS turbofan dataset. Includes data preprocessing, feature engineering, model training, evaluation, and a production-ready FastAPI inference service. Designed to demonstrate practical ML and MLOps skills through a real, working workflow.
NhanPhamThanh-IT
🍎 Fruit & vegetable image classifier using TensorFlow CNN. 18 categories, Streamlit web app, Jupyter training notebooks, modular Python code. Learn computer vision & deep learning. Automated preprocessing, metrics visualization, easy deployment. Educational ML project with clean architecture.
OpenTabular
pretab is a flexible and extensible preprocessing library for tabular data, built on top of scikit-learn. It provides advanced transformations, spline and neural feature expansions, and seamless integration with embeddings – all designed for modern tabular ML workflows.
abhashpanwar
Used car price prediction using machine learning. Includes data cleaning, data preprocessing, 8 different ML models, and some insights from the data.
HacktivSpace
A solution for deepfake detection across multiple modalities, including images, audio, and video, using ML models like CNNs, Transformers, SVMs, Bayesian networks, and Vision Transformers. This repository includes data preprocessing, model training, evaluation scripts, and Docker support for deployment.
madapathi-guruprasad
ML project on IMDb data using KNN, SVM, Logistic Regression, Linear Regression & XGBoost. Includes preprocessing, SMOTE, and EDA with Seaborn to classify certifications and predict ratings based on genre, duration, year, and votes.
Lekshmi2003-glitch
This repository explores the impact of data preprocessing on regression modeling using the Carseats dataset. It compares models built with raw data versus preprocessed data, highlighting improvements through handling missing values, outliers, scaling, and encoding categorical features.
AI-UNIT-IT-KKU
A structured, review-ready repository covering the full “ML A-Z [2025]” curriculum: preprocessing, models, evaluation, and key takeaways.
amandeep-singh28
A research-driven ML workflow demonstrating data preprocessing, feature engineering, 13+ model experiments, hyperparameter tuning, boosting variations, and performance metrics analysis for real-world churn prediction.
rvandewater
🥧 Easily define reproducible preprocessing steps for ML on Polars and Pandas DataFrames.
megsdata
Preprocessing, feature extraction, and ML to derive a BP signal from a PPG waveform.
Matts52
This dbt package provides a number of machine learning preprocessing techniques implemented inline in SQL.
rhnfzl
Text preprocessing and PII anonymisation for NLP/ML. ONNX NER ensemble, language detection, stopword removal. Built for statistical ML and language models.
Dash10107
A well-organized collection of Jupyter notebooks covering the full machine learning journey—from data preprocessing and classic algorithms to deep learning, NLP, and reinforcement learning. Ideal for learners and professionals to explore, experiment, and master ML with real code.