Found 20,656 repositories(showing 30)
alexattia
DataScience projects for learning : Kaggle challenges, Object Recognition, Parsing, etc.
chengstone
This is a kaggle challenge project called Display Advertising Challenge by CriteoLabs at 2014.这是2014年由CriteoLabs在kaggle上发起的广告点击率预估挑战项目。
Classify Skin cancer from the skin lesion images using Image classification. The dataset for the project is obtained from the Kaggle SIIM-ISIC-Melanoma-Classification competition.
lyakaap
My PyTorch project template (for Kaggle and research)
animesh1012
Kaggle-sourced machine learning projects in Python Jupyter notebooks, spanning computer vision to predictive modeling.
Udacity capstone project: Kaggle competition on house prices prediction using advanced regression techniques
rohanrajput04
This is Kaggle project for the house price prediction
BEST SCORE ON KAGGLE SO FAR , EVEN BETTER THAN THE KAGGLE TEAM MEMBER WHO DID BEST SO FAR. The project is about diagnosing pneumonia from XRay images of lungs of a person using self laid convolutional neural network and tranfer learning via inceptionV3. The images were of size greater than 1000 pixels per dimension and the total dataset was tagged large and had a space of 1GB+ . My work includes self laid neural network which was repeatedly tuned for one of the best hyperparameters and used variety of utility function of keras like callbacks for learning rate and checkpointing. Could have augmented the image data for even better modelling but was short of RAM on kaggle kernel. Other metrics like precision , recall and f1 score using confusion matrix were taken off special care. The other part included a brief introduction of transfer learning via InceptionV3 and was tuned entirely rather than partially after loading the inceptionv3 weights for the maximum achieved accuracy on kaggle till date. This achieved even a higher precision than before.
dnkirill
Allstate Kaggle Competition ML Capstone Project
roshancyriacmathew
This project walks you on how to create a twitter sentiment analysis model using python. Twitter sentiment analysis is performed to identify the sentiments of the people towards various topics. For this project, we will be analysing the sentiment of people towards Pfizer vaccines. We will be using the data available on Kaggle to create this machine learning model. The collected tweets from Twitter will be analysed using machine learning to identify the different sentiments present in the tweets. The different sentiments identified in this project include positive sentiment, negative sentiment and neutral sentiment. We will also be using different classifiers to see which classifier gives the best model accuracy.
AmirhosseinHonardoust
A complete NLP and Machine Learning project to detect fake and real news using TF-IDF and Logistic Regression. Includes full training pipeline, evaluation charts, and an interactive Streamlit web app for real-time credibility analysis. Dataset adapted from Kaggle’s Fake and Real News Dataset.
RubixML
An example project that predicts house prices for a Kaggle competition using a Gradient Boosted Machine.
MohammedShehbazDamkar
This SQL-based Walmart data analysis project aims to identify top-performing branches and products, optimize sales strategies using Kaggle's Walmart Sales Forecasting Competition dataset.
Best free, open-source datasets for data science and machine learning projects. Top government data including census, economic, financial, agricultural, image datasets, labeled and unlabeled, autonomous car datasets, and much more. Data.gov NOAA - https://www.ncdc.noaa.gov/cdo-web/ atmospheric, ocean Bureau of Labor Statistics - https://www.bls.gov/data/ employment, inflation US Census Data - https://www.census.gov/data.html demographics, income, geo, time series Bureau of Economic Analysis - http://www.bea.gov/data/gdp/gross-dom... GDP, corporate profits, savings rates Federal Reserve - https://fred.stlouisfed.org/ curency, interest rates, payroll Quandl - https://www.quandl.com/ financial and economic Data.gov.uk UK Dataservice - https://www.ukdataservice.ac.uk Census data and much more WorldBank - https://datacatalog.worldbank.org census, demographics, geographic, health, income, GDP IMF - https://www.imf.org/en/Data economic, currency, finance, commodities, time series OpenData.go.ke Kenya govt data on agriculture, education, water, health, finance, … https://data.world/ Open Data for Africa - http://dataportal.opendataforafrica.org/ agriculture, energy, environment, industry, … Kaggle - https://www.kaggle.com/datasets A huge variety of different datasets Amazon Reviews - https://snap.stanford.edu/data/web-Am... 35M product reviews from 6.6M users GroupLens - https://grouplens.org/datasets/moviel... 20M movie ratings Yelp Reviews - https://www.yelp.com/dataset 6.7M reviews, pictures, businesses IMDB Reviews - http://ai.stanford.edu/~amaas/data/se... 25k Movie reviews Twitter Sentiment 140 - http://help.sentiment140.com/for-stud... 160k Tweets Airbnb - http://insideairbnb.com/get-the-data.... A TON of data by geo UCI ML Datasets - http://mlr.cs.umass.edu/ml/ iris, wine, abalone, heart disease, poker hands, …. Enron Email dataset - http://www.cs.cmu.edu/~enron/ 500k emails from 150 people From 2001 energy scandal. See the movie: The Smartest Guys in the Room. Spambase - https://archive.ics.uci.edu/ml/datase... Emails Jeopardy Questions - https://www.reddit.com/r/datasets/com... 200k Questions and answers in json Gutenberg Ebooks - http://www.gutenberg.org/wiki/Gutenbe... Large collection of books
JenAlchimowicz
Project developed for a Kaggle Competition organised by CentraleSupelec Deep Learning course. Final result: 1st place
harshitkd
This project is used to detect the license plate of the vehicle in real time, trained using Car Detection Licence Plate dataset available on Kaggle. Used yolov4 because it performs much better than traditional cv techniques and then used EasyOCR to extract text from the number plate. Please see readme for details.
smirk-dev
Multi-agent AI system for automated code review using Google's ADK - Kaggle Agents Intensive Capstone Project 2025
roshancyriacmathew
This is a python project that is used to identify hate speech in tweets. The dataset used to train the model is available on Kaggle and consists of labelled tweets where 1 indicates hate speech tweets and 0 indicates non-hate speech tweets.
Michel-Nguegang
In this personal Superstore Sales SQL Data Analysis project, an exploratory data analysis was performed on the Superstore Sales Data available on Kaggle. The main aim of the project is to uncover insights into the store's sales and profits trends and patterns from 2014 to 2017.
# **ABSTRACT** Main Objective: The main agenda of this project is: Perform extensive Exploratory Data Analysis(EDA) on the Zomato Dataset. Build an appropriate Machine Learning Model that will help various Zomato Restaurants to predict their respective Ratings based on certain features DEPLOY the Machine learning model via Flask that can be used to make live predictions of restaurants ratings A step by step guide is attached to this documnet as well as a video explanation of each concpet. Zomato is one of the best online food delivery apps which gives the users the ratings and the reviews on restaurants all over india.These ratings and the Reviews are considered as one of the most important deciding factors which determine how good a restaurant is. We will therefore use the real time Data set with variuos features a user would look into regarding a restaurant. We will be considering Banglore City in this analysis. Content The basic idea of analyzing the Zomato dataset is to get a fair idea about the factors affecting the establishment of different types of restaurant at different places in Bengaluru, aggregate rating of each restaurant, Bengaluru being one such city has more than 12,000 restaurants with restaurants serving dishes from all over the world. With each day new restaurants opening the industry has’nt been saturated yet and the demand is increasing day by day. Inspite of increasing demand it however has become difficult for new restaurants to compete with established restaurants. Most of them serving the same food. Bengaluru being an IT capital of India. Most of the people here are dependent mainly on the restaurant food as they don’t have time to cook for themselves. With such an overwhelming demand of restaurants it has therefore become important to study the demography of a location. What kind of a food is more popular in a locality. Do the entire locality loves vegetarian food. If yes then is that locality populated by a particular sect of people for eg. Jain, Marwaris, Gujaratis who are mostly vegetarian. These kind of analysis can be done using the data, by studying the factors such as • Location of the restaurant • Approx Price of food • Theme based restaurant or not • Which locality of that city serves that cuisines with maximum number of restaurants • The needs of people who are striving to get the best cuisine of the neighborhood • Is a particular neighborhood famous for its own kind of food. “Just so that you have a good meal the next time you step out” The data is accurate to that available on the zomato website until 15 March 2019. The data was scraped from Zomato in two phase. After going through the structure of the website I found that for each neighborhood there are 6-7 category of restaurants viz. Buffet, Cafes, Delivery, Desserts, Dine-out, Drinks & nightlife, Pubs and bars. Phase I, In Phase I of extraction only the URL, name and address of the restaurant were extracted which were visible on the front page. The URl's for each of the restaurants on the zomato were recorded in the csv file so that later the data can be extracted individually for each restaurant. This made the extraction process easier and reduced the extra load on my machine. The data for each neighborhood and each category can be found here Phase II, In Phase II the recorded data for each restaurant and each category was read and data for each restaurant was scraped individually. 15 variables were scraped in this phase. For each of the neighborhood and for each category their onlineorder, booktable, rate, votes, phone, location, resttype, dishliked, cuisines, approxcost(for two people), reviewslist, menu_item was extracted. See section 5 for more details about the variables. Acknowledgements The data scraped was entirely for educational purposes only. Note that I don’t claim any copyright for the data. All copyrights for the data is owned by Zomato Media Pvt. Ltd.. Source: Kaggle
jukiewiczm
Kaggle's Predict Future Sales competition project (TOP 15 solution as of March 2020)
sayaliwalke30
This repo contains 4 different projects. Built various machine learning models for Kaggle competitions. Also carried out Exploratory Data Analysis, Data Cleaning, Data Visualization, Data Munging, Feature Selection etc
deepamkalekar
Sales Data analysis project using Power BI and SQL | Kaggle Dataset | Data Cleaning | Data Visualisation | Power Query | Model | DAX | SQL Data Analysis | SSMS
This is a induction motor faults detection project implemented with Tensorflow. We use Stacking Ensembles method (with Random Forest, Support Vector Machine, Deep Neural Network and Logistic Regression) and Machinery Fault Dataset dataset available on kaggle.
A time series forecasting project from Kaggle that uses Seq2Seq + LSTM technique to forecast the headcounts. Detailed explanation on how the special neural network structure works is provided.
polakowo
Some of my ML projects and Kaggle competitions
ratnajitmukherjee
Emotion classification has always been a very challenging task in Computer Vision. Using the SSD object detection algorithm to extract the face in an image and using the FER 2013 released by Kaggle, this project couples a deep learning based face detector and an emotion classification DNN to classify the six/seven basic human emotions.
autoliuweijie
This repository contains some competition projects of kaggle
fomorians
Starter project for the Kaggle State Farm Distracted Driver Detection Competition
longNguyen010203
A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, Docker. Data from kaggle and youtube-api