Found 29,281 repositories (showing 30)
Viveckh
A Machine Learning Project implemented from scratch which involves web scraping, data engineering, exploratory data analysis and machine learning to predict housing prices in New York Tri-State Area.
khanhnamle1994
An exploratory data analysis and data visualization project using data from Spotify Web API
Patotricks15
Repository for my data science projects (Web scraping and automation + exploratory data analysis + machine learning + recommendation system)
Hrishikesh332
The project "Exploratory Data Analysis and Prediction of Lung Cancer" aims to perform exploratory data analysis, extract insights from the dataset that support concrete conclusions, and apply a machine learning model to predict lung cancer diagnosis.
EDA Project using Python & Pandas Framework
mrankitgupta
An exploratory data analysis (EDA) and data visualization project using data from Spotify using Python.
A data science and analytics project whose main aim is to perform descriptive and exploratory data analysis, then apply predictive modelling to determine which of the best and most experienced employees are leaving prematurely, and why.
katiehuangx
Exploratory Data Analysis on Bellabeat fitness tracker app using Python. Capstone project from Google Data Analytics Professional Certification.
khanhnamle1994
An exploratory data analysis and data visualization project for World Cup 2018
Michel-Nguegang
In this personal Superstore Sales SQL Data Analysis project, an exploratory data analysis was performed on the Superstore Sales Data available on Kaggle. The main aim of the project is to uncover insights into the store's sales and profits trends and patterns from 2014 to 2017.
# **ABSTRACT** Main objective: (1) perform extensive exploratory data analysis (EDA) on the Zomato dataset; (2) build a machine learning model that helps Zomato restaurants predict their ratings from selected features; (3) deploy the model via Flask so that it can make live predictions of restaurant ratings. A step-by-step guide is attached to this document, along with a video explanation of each concept. Zomato is one of the best online food delivery apps, giving users ratings and reviews of restaurants all over India. These ratings and reviews are among the most important factors in deciding how good a restaurant is. We will therefore use the real-time dataset with the various features a user would consider when looking at a restaurant. This analysis considers Bengaluru (Bangalore). Content: The basic idea of analyzing the Zomato dataset is to get a fair idea of the factors affecting the establishment of different types of restaurants in different parts of Bengaluru, and the aggregate rating of each restaurant. Bengaluru has more than 12,000 restaurants, serving dishes from all over the world. With new restaurants opening every day, the industry has not yet been saturated and demand is increasing day by day. Despite the increasing demand, however, it has become difficult for new restaurants to compete with established ones, most of them serving the same food. Bengaluru is an IT capital of India, and most people here depend mainly on restaurant food because they don’t have time to cook for themselves. With such overwhelming demand for restaurants, it has become important to study the demography of a location. What kind of food is more popular in a locality? Does the entire locality love vegetarian food?
If so, is that locality populated by a particular community, e.g. Jains, Marwaris, or Gujaratis, who are mostly vegetarian? This kind of analysis can be done using the data, by studying factors such as • Location of the restaurant • Approximate price of food • Whether the restaurant is theme-based • Which locality of the city serves a given cuisine with the maximum number of restaurants • The needs of people who are striving to get the best cuisine of the neighborhood • Whether a particular neighborhood is famous for its own kind of food. “Just so that you have a good meal the next time you step out.” The data is accurate to that available on the Zomato website until 15 March 2019. The data was scraped from Zomato in two phases. After going through the structure of the website, I found that for each neighborhood there are 6-7 categories of restaurants, viz. Buffet, Cafes, Delivery, Desserts, Dine-out, Drinks & nightlife, Pubs and bars. Phase I: only the URL, name, and address of each restaurant, which were visible on the front page, were extracted. The URLs for each of the restaurants on Zomato were recorded in a CSV file so that the data could later be extracted individually for each restaurant. This made the extraction process easier and reduced the extra load on my machine. The data for each neighborhood and each category can be found here. Phase II: the recorded data for each restaurant and each category was read, and the data for each restaurant was scraped individually. 15 variables were scraped in this phase. For each neighborhood and each category, onlineorder, booktable, rate, votes, phone, location, resttype, dishliked, cuisines, approxcost(for two people), reviewslist, and menu_item were extracted. See section 5 for more details about the variables. Acknowledgements: The data was scraped entirely for educational purposes. Note that I don’t claim any copyright for the data.
All copyrights for the data are owned by Zomato Media Pvt. Ltd. Source: Kaggle
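The two-phase extraction described in the Zomato entry above can be sketched as follows. This is a minimal illustration, not the author's scraper: the function names, the CSV column names, and the `fake_fetch` stand-in (which replaces a real page request and returns made-up placeholder values) are all assumptions.

```python
import csv
import io


def phase_one(listings, out_file):
    """Phase I: record only the URL, name and address of each restaurant."""
    writer = csv.DictWriter(out_file, fieldnames=["url", "name", "address"])
    writer.writeheader()
    for row in listings:
        # Drop everything except the three front-page fields.
        writer.writerow({key: row[key] for key in ("url", "name", "address")})


def phase_two(in_file, fetch_details):
    """Phase II: read the recorded URLs and scrape each restaurant individually."""
    records = []
    for row in csv.DictReader(in_file):
        details = fetch_details(row["url"])  # hypothetical per-page scraper
        records.append({**row, **details})
    return records


def fake_fetch(url):
    # Stand-in for a real HTTP request; the values are made-up placeholders.
    return {"rate": "4.1/5", "votes": "120"}


# In-memory buffer used here in place of the CSV file on disk.
buffer = io.StringIO()
phase_one([{"url": "u1", "name": "A", "address": "X", "extra": 1}], buffer)
buffer.seek(0)
restaurants = phase_two(buffer, fake_fetch)
```

Splitting the work this way means the cheap listing pass runs once, and the slower per-restaurant pass can be resumed from the recorded CSV.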
Madhuarvind
A complete exploratory data analysis (EDA) and forecasting project focused on retail sales data. The project identifies key sales patterns, seasonal trends, and builds predictive models to forecast future demand at the item-store level.
VenkyAdi
Exploratory Data Analysis (EDA) Projects: a collection of EDA projects exploring various datasets to uncover patterns, gain insights, and visualize trends across different industries. Projects include analyses of Amazon Prime content, banking fraud detection, logistics performance, hotel booking trends, and more.
sayaliwalke30
This repo contains 4 different projects. Built various machine learning models for Kaggle competitions. Also carried out Exploratory Data Analysis, Data Cleaning, Data Visualization, Data Munging, Feature Selection etc
This project is an automated application that performs exploratory data analysis on a given dataset and generates insights, built with Python and Streamlit. Every step is executed through custom functions. The automated EDA covers basic information about the data, tabulation analysis, distribution analysis, and correlation analysis, and extends to advanced statistical analysis with some basic feature engineering. The project is deployed with Streamlit on the Heroku cloud platform.
avinashvignesh00003
Incorporating exploratory data analysis (EDA) techniques in a data science project focused on terrorism. EDA involves visualizing, summarizing, and interpreting terrorism data to reveal patterns, relationships, and potential insights. This critical phase informs subsequent analyses and aids in developing a comprehensive understanding.
nafiul-araf
This credit risk modeling project for a modest-scale finance company encompasses the entire process, from data collection and exploratory data analysis (EDA) to deployment.
medhajha810
No description available
This project involves an exploratory data analysis (EDA) of Amazon's best selling books dataset.
KhushiBhadange
In this repository, explore insightful solutions through exploratory data analysis focusing on mental health problems. Gain valuable insights into understanding and addressing key challenges in this critical domain.
Tahsin-Mayeesha
Exploratory data analysis on various datasets from FiveThirtyEight and Udacity coursework
anwarcsebd
Stock Market Analysis and Prediction is a project covering exploratory data analysis (EDA), data visualization, and predictive analysis using real-time financial data provided by The Investors Exchange (IEX).
Uday029
This project involves a comprehensive Exploratory Data Analysis (EDA) on a hospital dataset to uncover insights related to patient demographics, hospital stay durations, diagnoses, treatment outcomes, and other healthcare-related variables. The goal is to identify patterns, anomalies, and trends that could assist in improving healthcare
For this project, I used publicly available Electronic Health Records (EHR) datasets. The MIT Lab for Computational Physiology developed the MIMIC-IIIv1.4 dataset based on 46,520 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center in Boston between 2001 and 2012. The MIMIC-IIIv1.4 dataset is freely available to researchers across the world. A formal request should be made directly to www.mimic.physionet.org to gain access to the data. A course on human research, ‘Data or Specimens Only Research’, is required before requesting data access. I have completed it; the certificate is here: www.citiprogram.org/verify/?kb6607b78-5821-4de5-8cad-daf929f7fbbf-33486907. The central question of our project is: ‘Can we improve the prediction performance of widely used severity scores using a more flexible model?’ We built a more flexible and better-performing model using the exact 17 variables used to develop the SAPS II severity prediction algorithm: 13 physiological variables, three underlying (chronic) disease variables, and one admission variable. The physiological variables include demographic (age) and vital (Glasgow Coma Scale, systolic blood pressure, oxygenation, renal, white blood cell count, serum bicarbonate level, blood sodium level, blood potassium level, and blood bilirubin level) variables. The three underlying disease variables are Acquired Immunodeficiency Syndrome (AIDS), metastatic cancer, and hematologic malignancy. Finally, whether admission was scheduled surgical or unscheduled surgical was included in the model.
The dataset has 26 relational tables, including patient hospital admissions, callout information recorded when a patient was ready for discharge, caregiver information, electronically charted events (vital signs and any additional information relevant to patient care), patient demographic data, the list of services the patient was admitted or transferred under, ICU stay types, diagnosis types, laboratory measurements, microbiology tests and sensitivity, prescription data, and billing information. Although I have full access to the MIMIC-IIIv1.4 datasets, I cannot share any part of the data publicly. If you are interested in learning more about the data, there is a MIMIC III Demo dataset based on 100 patients: https://mimic.physionet.org/gettingstarted/demo/. If you are interested in requesting access to the data: https://mimic.physionet.org/gettingstarted/access/. Linked repositories: Exploratory-Data-Analysis-Clinical-Deterioration, Data-Wrangling-MIMICIII-Database, Clinical-Deterioration-Prediction-Model--Inferential-Statistics, Clinical-Deterioration-Prediction-Model--Ensemble-Algorithms-, Clinical-Deterioration-Prediction-Model--Logistic-Regression, Clinical-Deterioration-Prediction-Model---KNN
gamzeakkurt
This project uses machine learning models (Linear Regression and LSTM) to analyze and forecast stock market prices. It retrieves stock data from Yahoo Finance, performs exploratory data analysis (EDA), processes and engineers features, and predicts future prices. The project includes model evaluation metrics
venkat-0706
This exploratory data analysis (EDA) project focuses on examining sugarcane production data. Through this analysis, we seek to gain valuable insights into factors influencing sugarcane production, develop predictive models for future yields, and ultimately support efforts to optimize production efficiency and sustainability.
Kanjo-Elkamira-Ndi
A complete, end-to-end data science pipeline applied to a survey dataset investigating mobile money scam prevalence, victim demographics, and loss patterns in Cameroon. The project covers Exploratory Data Analysis, Data Preprocessing, Feature Engineering, Predictive Modelling, and Evaluation, culminating in a fully formatted Word report.
shridhar1504
Develop a data science project using historical sales data to build a regression model that accurately predicts future sales. Preprocess the dataset, conduct exploratory analysis, select relevant features, and employ regression algorithms for model development. Evaluate model performance, optimize hyperparameters, and provide actionable insights.
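The fit-then-evaluate loop that this description outlines might look roughly like the sketch below. It is only an illustration under stated assumptions: the data is made up, a single feature (month index) stands in for the selected features, and a hand-written least-squares line stands in for whatever regression algorithm the repository actually uses.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up illustrative data: month index and units sold.
months = [1, 2, 3, 4, 5, 6]
sales = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]

# Hold out the most recent months so evaluation mimics forecasting.
train_x, test_x = months[:4], months[4:]
train_y, test_y = sales[:4], sales[4:]

slope, intercept = fit_line(train_x, train_y)
forecast = [slope * x + intercept for x in test_x]
mae = mean_absolute_error(test_y, forecast)
```

Holding out the latest periods, rather than a random sample, keeps the evaluation closer to how a sales forecast is used in practice.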
nikolasscoolis
Data project which performs exploratory analysis and synthesizes data in order to uncover critical insights that will improve Elist Electronics's commercial success.
KumudRanjan4295
This project leverages machine learning to predict liver disease using clinical data from the Indian Liver Patient Dataset. It combines exploratory data analysis (EDA), classification, and regression modeling to extract meaningful healthcare insights.