Search Results

Found 20,656 repositories (showing 30)

Data-Science-Projects

alexattia

💛71

Data science projects for learning: Kaggle challenges, object recognition, parsing, etc.

1.1k

456

Jupyter Notebook

Updated 10 hours ago

challenge, hackerrank, kaggle, +2

kaggle_criteo_ctr_challenge-

chengstone

🧡52

This is a Kaggle challenge project: the Display Advertising Challenge, a click-through-rate prediction competition launched by CriteoLabs in 2014.

361

125

MIT

Jupyter Notebook

Updated 2 months ago

Skin-Cancer-Classification-using-Deep-Learning

Tirth27

💛71

Classify skin cancer from skin lesion images using image classification. The dataset for the project is obtained from the Kaggle SIIM-ISIC-Melanoma-Classification competition.

168

MIT

Jupyter Notebook

Updated 2 days ago

albumentations, computer-vision, convolutional-neural-networks, +10

pytorch-template

lyakaap

❤️40

My PyTorch project template (for Kaggle and research)

150

MIT

Python

Updated 7 months ago

machineLearning

animesh1012

🧡66

Kaggle-sourced machine learning projects in Python Jupyter notebooks, spanning computer vision to predictive modeling.

126

132

Jupyter Notebook

Updated 5 days ago

Kaggle-House-Prices-Advanced-Regression-Techniques

Shitao-zz

❤️46

Udacity capstone project: Kaggle competition on house prices prediction using advanced regression techniques

114

Jupyter Notebook

Updated 1 month ago

House-Price-Prediction-Analysis

rohanrajput04

❤️46

This is a Kaggle project for house price prediction.

111

Jupyter Notebook

Updated 2 weeks ago

Pneumonia-Diagnosis-using-XRays-96-percent-Recall

deadskull7

❤️41

Best score on Kaggle so far, even better than the Kaggle team member who had done best until now. The project diagnoses pneumonia from X-ray images of a person's lungs using a custom-built convolutional neural network and transfer learning via InceptionV3. The images were larger than 1000 pixels per dimension, and the dataset was tagged as large, taking up more than 1 GB. My work includes a custom-built neural network that was repeatedly tuned to find good hyperparameters and that used a variety of Keras utilities, such as callbacks for the learning rate and for checkpointing. The image data could have been augmented for even better modelling, but the Kaggle kernel was short of RAM. Special care was taken over other metrics such as precision, recall, and F1 score, computed from the confusion matrix. The other part is a brief introduction to transfer learning via InceptionV3: after loading the InceptionV3 weights, the network was tuned entirely rather than partially, giving the maximum accuracy achieved on Kaggle to date. This achieved even higher precision than before.

105

MIT

Jupyter Notebook

Updated 7 months ago

confusion-matrix, f1-score, healthcare, +7

allstate_capstone

dnkirill

❤️36

Allstate Kaggle Competition ML Capstone Project

Jupyter Notebook

Updated 5 months ago

capstone, cudnn, data-science, +11

Twitter-sentiment-analysis-using-Python-Machine-Learning-Project-8

roshancyriacmathew

❤️45

This project walks you through creating a Twitter sentiment analysis model using Python. Twitter sentiment analysis is performed to identify people's sentiments towards various topics. For this project, we will analyse the sentiment of people towards Pfizer vaccines. We will use data available on Kaggle to create this machine learning model. The collected tweets will be analysed using machine learning to identify the different sentiments present in them: positive, negative, and neutral. We will also use different classifiers to see which one gives the best model accuracy.

Jupyter Notebook

Updated 1 month ago

machine-learning, machine-learning-projects, machinelearning-python, +6

Fake-News-Detector

AmirhosseinHonardoust

🧡60

A complete NLP and Machine Learning project to detect fake and real news using TF-IDF and Logistic Regression. Includes full training pipeline, evaluation charts, and an interactive Streamlit web app for real-time credibility analysis. Dataset adapted from Kaggle’s Fake and Real News Dataset.

MIT

Python

Updated 1 week ago

ai-project, data-science, data-visualization, +12

Housing

RubixML

💛70

An example project that predicts house prices for a Kaggle competition using a Gradient Boosted Machine.

MIT

PHP

Updated 1 day ago

data-science, ensemble, gradient-boost, +17

Walmart-Sales-Data-Analysis--SQL-Project

MohammedShehbazDamkar

🧡65

This SQL-based Walmart data analysis project aims to identify top-performing branches and products and to optimize sales strategies, using Kaggle's Walmart Sales Forecasting Competition dataset.

Updated 1 day ago

data-analysis, eda, sql

-open-source-datasets-for-data-science

shaungt1

🧡55

Best free, open-source datasets for data science and machine learning projects. Top government data including census, economic, financial, agricultural, image datasets, labeled and unlabeled, autonomous car datasets, and much more.
Data.gov
NOAA - https://www.ncdc.noaa.gov/cdo-web/ atmospheric, ocean
Bureau of Labor Statistics - https://www.bls.gov/data/ employment, inflation
US Census Data - https://www.census.gov/data.html demographics, income, geo, time series
Bureau of Economic Analysis - http://www.bea.gov/data/gdp/gross-dom... GDP, corporate profits, savings rates
Federal Reserve - https://fred.stlouisfed.org/ currency, interest rates, payroll
Quandl - https://www.quandl.com/ financial and economic
Data.gov.uk
UK Dataservice - https://www.ukdataservice.ac.uk Census data and much more
WorldBank - https://datacatalog.worldbank.org census, demographics, geographic, health, income, GDP
IMF - https://www.imf.org/en/Data economic, currency, finance, commodities, time series
OpenData.go.ke Kenya govt data on agriculture, education, water, health, finance, …
https://data.world/
Open Data for Africa - http://dataportal.opendataforafrica.org/ agriculture, energy, environment, industry, …
Kaggle - https://www.kaggle.com/datasets A huge variety of different datasets
Amazon Reviews - https://snap.stanford.edu/data/web-Am... 35M product reviews from 6.6M users
GroupLens - https://grouplens.org/datasets/moviel... 20M movie ratings
Yelp Reviews - https://www.yelp.com/dataset 6.7M reviews, pictures, businesses
IMDB Reviews - http://ai.stanford.edu/~amaas/data/se... 25k Movie reviews
Twitter Sentiment 140 - http://help.sentiment140.com/for-stud... 160k Tweets
Airbnb - http://insideairbnb.com/get-the-data.... A TON of data by geo
UCI ML Datasets - http://mlr.cs.umass.edu/ml/ iris, wine, abalone, heart disease, poker hands, ….
Enron Email dataset - http://www.cs.cmu.edu/~enron/ 500k emails from 150 people From 2001 energy scandal. See the movie: The Smartest Guys in the Room.
Spambase - https://archive.ics.uci.edu/ml/datase... Emails
Jeopardy Questions - https://www.reddit.com/r/datasets/com... 200k Questions and answers in json
Gutenberg Ebooks - http://www.gutenberg.org/wiki/Gutenbe... Large collection of books

Jupyter Notebook

Updated 2 weeks ago

Semantic-segmentation-with-PyTorch-Satellite-Imagery

JenAlchimowicz

🧡65

Project developed for a Kaggle Competition organised by CentraleSupelec Deep Learning course. Final result: 1st place

Jupyter Notebook

Updated 4 days ago

Real-Time-Number-Plate-Recognition

harshitkd

🧡50

This project is used to detect the license plate of the vehicle in real time, trained using Car Detection Licence Plate dataset available on Kaggle. Used yolov4 because it performs much better than traditional cv techniques and then used EasyOCR to extract text from the number plate. Please see readme for details.

MIT

Jupyter Notebook

Updated 2 months ago

easyocr, object-detection, opencv, +2

CodeReview-AI-Agent

smirk-dev

💛70

Multi-agent AI system for automated code review using Google's ADK - Kaggle Agents Intensive Capstone Project 2025

MIT

Python

Updated 3 days ago

hate-speech-detection-using-machine-learning

roshancyriacmathew

❤️45

This is a python project that is used to identify hate speech in tweets. The dataset used to train the model is available on Kaggle and consists of labelled tweets where 1 indicates hate speech tweets and 0 indicates non-hate speech tweets.

Jupyter Notebook

Updated 1 month ago

hatespeech, hatespeech-detection, machine-learning, +7

PROJECT-PORTFOLIO--Superstore-Sales-SQL-Data-Analysis

Michel-Nguegang

🧡55

In this personal Superstore Sales SQL Data Analysis project, an exploratory data analysis was performed on the Superstore Sales Data available on Kaggle. The main aim of the project is to uncover trends and patterns in the store's sales and profits from 2014 to 2017.

Updated 1 week ago

data-cleaning, data-visualization, database, +5

FLASK-End-to-end-Zomato-Restaurant-Price-Prediction-and-Deployment

MrBriit

🧡65

ABSTRACT

Main objective: The main agenda of this project is to:
• Perform extensive Exploratory Data Analysis (EDA) on the Zomato dataset.
• Build an appropriate machine learning model that will help various Zomato restaurants predict their respective ratings based on certain features.
• Deploy the machine learning model via Flask so that it can be used to make live predictions of restaurant ratings.

A step-by-step guide is attached to this document, as well as a video explanation of each concept.

Zomato is one of the best online food delivery apps, giving users ratings and reviews of restaurants all over India. These ratings and reviews are considered among the most important deciding factors in determining how good a restaurant is. We will therefore use the real-time dataset with the various features a user would look at regarding a restaurant. We will be considering Bengaluru (Bangalore) city in this analysis.

Content

The basic idea of analyzing the Zomato dataset is to get a fair idea about the factors affecting the establishment of different types of restaurants at different places in Bengaluru, and the aggregate rating of each restaurant. Bengaluru has more than 12,000 restaurants, serving dishes from all over the world. With new restaurants opening each day, the industry hasn't been saturated yet and demand is increasing day by day. In spite of increasing demand, however, it has become difficult for new restaurants to compete with established restaurants, most of them serving the same food. Bengaluru is an IT capital of India, and most people here depend mainly on restaurant food as they don't have time to cook for themselves. With such overwhelming demand for restaurants, it has become important to study the demography of a location. What kind of food is more popular in a locality? Does the entire locality love vegetarian food? If yes, is that locality populated by a particular group of people, e.g. Jains, Marwaris, or Gujaratis, who are mostly vegetarian? This kind of analysis can be done using the data, by studying factors such as:
• Location of the restaurant
• Approximate price of food
• Whether the restaurant is theme-based or not
• Which locality of the city serves a given cuisine with the maximum number of restaurants
• The needs of people who are striving to get the best cuisine of the neighborhood
• Whether a particular neighborhood is famous for its own kind of food

"Just so that you have a good meal the next time you step out."

The data is accurate to that available on the Zomato website until 15 March 2019. The data was scraped from Zomato in two phases. After going through the structure of the website, I found that for each neighborhood there are 6-7 categories of restaurants, viz. Buffet, Cafes, Delivery, Desserts, Dine-out, Drinks & nightlife, Pubs and bars.

Phase I: In Phase I of extraction, only the URL, name, and address of each restaurant, which were visible on the front page, were extracted. The URLs for each of the restaurants on Zomato were recorded in a CSV file so that the data could later be extracted individually for each restaurant. This made the extraction process easier and reduced the extra load on my machine. The data for each neighborhood and each category can be found here.

Phase II: In Phase II, the recorded data for each restaurant and each category was read, and data for each restaurant was scraped individually. 15 variables were scraped in this phase. For each neighborhood and each category, their onlineorder, booktable, rate, votes, phone, location, resttype, dishliked, cuisines, approxcost(for two people), reviewslist, and menu_item were extracted. See section 5 for more details about the variables.

Acknowledgements

The data was scraped entirely for educational purposes. Note that I don't claim any copyright for the data. All copyrights for the data are owned by Zomato Media Pvt. Ltd.

Source: Kaggle

Jupyter Notebook

Updated 1 day ago

kaggle-predict-future-sales

jukiewiczm

❤️35

Kaggle's Predict Future Sales competition project (TOP 15 solution as of March 2020)

HTML

Updated 3 years ago

embeddings, gensim, kaggle-competition, +6

Kaggle-Projects

sayaliwalke30

🧡65

This repo contains 4 different projects. Built various machine learning models for Kaggle competitions. Also carried out Exploratory Data Analysis, Data Cleaning, Data Visualization, Data Munging, Feature Selection etc

Jupyter Notebook

Updated 6 days ago

bankloanprediction, creditcardfrauddetection, data-analysis, +11

Sale-Data-Analysis-PowerBI

deepamkalekar

❤️45

Updated 1 month ago

dashboard, data-cleaning, data-visualization, +4

Induction-Motor-Faults-Detection-with-Stacking-Ensemble-Method-and-Deep-Learning

mo26-web

❤️45

This is an induction motor fault detection project implemented with TensorFlow. We use a stacking ensemble method (with Random Forest, Support Vector Machine, Deep Neural Network, and Logistic Regression) and the Machinery Fault Dataset available on Kaggle.

Jupyter Notebook

Updated 2 months ago

anomaly-detection, deep-neural-networks, ensemble-classifier, +13

Time-Series-Forcasting-Seq2Seq

Olliang

❤️35

A time series forecasting project from Kaggle that uses a Seq2Seq + LSTM technique to forecast headcounts. A detailed explanation of how the special neural network structure works is provided.

Jupyter Notebook

Updated 5 months ago

competition, explanation, seq2seq, +1

mlprojects

polakowo

❤️30

Some of my ML projects and Kaggle competitions

Jupyter Notebook

Updated 1 month ago

fastai, machine-learning, pytorch, +2

EmotionClassification_FER2013

ratnajitmukherjee

❤️40

Emotion classification has always been a very challenging task in computer vision. Using the SSD object detection algorithm to extract the face in an image, and the FER 2013 dataset released by Kaggle, this project couples a deep-learning-based face detector with an emotion classification DNN to classify the six/seven basic human emotions.

BSD-3-Clause

Python

Updated 5 months ago

convolutional-neural-networks, emotion-analysis, emotion-detection, +5

Kaggle

autoliuweijie

❤️35

This repository contains some Kaggle competition projects.

Jupyter Notebook

Updated 5 years ago

distracted-drivers-tf

fomorians

❤️35

Starter project for the Kaggle State Farm Distracted Driver Detection Competition

Python

Updated 1 year ago

Youtube-Recommend-Master-ETL-Pipeline

longNguyen010203

🧡60

A data engineering project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, and Docker. Data comes from Kaggle and youtube-api.

MIT

Jupyter Notebook

Updated 4 weeks ago

cleaning-data, dagster, data-engineering, +17
