Found 129 repositories (showing 30)
sharmaroshan
Learning statistics is one of the most important steps toward getting into data science and machine learning. Statistics helps us understand data better and explains how the data behaves with respect to certain factors. Its elements include probability, distributions, descriptive analysis, inferential analysis, comparative analysis, the chi-square test, the t-test, the z-test, A/B testing, etc.
For this project, I used publicly available Electronic Health Records (EHR) datasets. The MIT Lab for Computational Physiology developed the MIMIC-III v1.4 dataset from 46,520 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center in Boston between 2001 and 2012. The MIMIC-III v1.4 dataset is freely available to researchers across the world. To gain access to the data, a formal request should be made directly to www.mimic.physionet.org. A course on human research, ‘Data or Specimens Only Research’, must be completed before requesting access. I have completed it; the certificate is at www.citiprogram.org/verify/?kb6607b78-5821-4de5-8cad-daf929f7fbbf-33486907. We built a more flexible and better-performing model using the same 17 variables used in the SAPS II severity prediction model. The central question of our project is: ‘Can we improve the prediction performance of widely used severity scores using a more flexible model?’ I used the exact 17 variables used to develop the SAPS II severity prediction algorithm. These are 13 physiological variables, three underlying (chronic) disease variables, and one admission variable. The physiological variables include demographic (age) and vital (Glasgow Coma Scale, systolic blood pressure, oxygenation, renal, white blood cell count, serum bicarbonate level, blood sodium level, blood potassium level, and blood bilirubin level) variables. The three underlying disease variables are Acquired Immunodeficiency Syndrome (AIDS), metastatic cancer, and hematologic malignancy. Finally, whether the admission was scheduled surgical or unscheduled surgical was included in the model.
The dataset has 26 relational tables, including patients’ hospital admissions, callout information when a patient was ready for discharge, caregiver information, electronic charted events (vital signs and any additional information relevant to patient care), patient demographic data, the list of services the patient was admitted or transferred under, ICU stay types, diagnosis types, laboratory measurements, microbiology tests and sensitivity, prescription data, and billing information. Although I have full access to the MIMIC-III v1.4 datasets, I cannot share any part of the data publicly. If you are interested in learning more about the data, there is a MIMIC-III demo dataset based on 100 patients: https://mimic.physionet.org/gettingstarted/demo/. If you are interested in requesting access to the data, see https://mimic.physionet.org/gettingstarted/access/. Linked repositories: Exploratory-Data-Analysis-Clinical-Deterioration, Data-Wrangling-MIMICIII-Database, Clinical-Deterioration-Prediction-Model--Inferential-Statistics, Clinical-Deterioration-Prediction-Model--Ensemble-Algorithms-, Clinical-Deterioration-Prediction-Model--Logistic-Regression, Clinical-Deterioration-Prediction-Model---KNN
ayman-gassi
No description available
parhamzm
A 6-day hands-on bootcamp for beginners to learn Python, exploratory data analysis (EDA), data visualization, inferential statistics, and machine learning using real-world datasets.
Sanket-Sv
Hands-on journey into Statistics for Data Analysis using Python. Covers descriptive & inferential statistics, probability, hypothesis testing, regression, and data visualization. Designed as part of my Data Analytics portfolio to demonstrate problem-solving, actionable insights, and data-driven business decision-making.
Mgobeaalcoba
Explore the world of inferential statistics using Python. Learn hypothesis testing, confidence intervals, and statistical analysis techniques for data-driven decision-making and insights.
nalinimacharla9
In this assignment, we chose the “Portuguese Bank Marketing” dataset, retrieved from the UCI Machine Learning Repository. We perform exploratory data analysis, inferential statistics, and regularization techniques on the bank dataset, and build machine learning models such as logistic regression (to understand the relationships between the variables), support vector machines, and decision tree classifiers, with cross-validation, to predict whether clients of the bank will subscribe to a term deposit. This is a supervised ML model for a classification problem: will the client subscribe to the term deposit or not (a simple yes or no)? We also analyze the one-sample t-test, the two-sample t-test, the paired t-test, the test of equal or given proportions, and the F-test. A one-sample t-test checks whether an unknown population mean differs from a specified value. The two-sample t-test, also known as the independent-samples t-test, checks whether the unknown population means of two groups are equal. The paired t-test, also called the dependent-samples t-test, checks whether the mean change between two sets of measurements is zero; each subject is measured twice, resulting in pairs of observations. The test of equal or given proportions checks whether the proportion observed in a sample is consistent with the true proportion in the entire population. The last test, the F-test, indicates whether the linear model gives an improved fit.
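The three t-tests described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not code from the repository: the function names are hypothetical, the two-sample version uses Welch's form (no equal-variance assumption) rather than the pooled independent-samples statistic, and only the t statistics are computed (no p-values).

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t statistic for H0: the population mean equals mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error


def welch_t(group_a, group_b):
    """Two-sample t statistic for H0: both group means are equal.

    Welch's form: the variances of the two groups are not pooled.
    """
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


def paired_t(first, second):
    """Paired t statistic: a one-sample test on the per-pair differences."""
    differences = [y - x for x, y in zip(first, second)]
    return one_sample_t(differences, 0.0)


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    print(one_sample_t([1, 2, 3, 4, 5], mu0=3))
    print(welch_t([1, 2, 3], [2, 3, 4]))
    print(paired_t([1, 2, 3], [2, 4, 3]))
```

The paired test reuses the one-sample function because, as the description says, it asks whether the mean change between two sets of measurements is zero.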
DanielPNewman
Code for my blue-enriched light/attention experiment at Monash University. Includes paradigm files (MATLAB/Psychtoolbox), data processing/analysis files (MATLAB/EEGLAB), and inferential statistics/data analysis files.
joshmgarciaa
2021 UH Psi Chi Data Analysis Workshops, introducing basic descriptive and inferential statistics in jamovi / R
Gampasani
This repository contains an in-depth analysis of the UCI Heart Disease Dataset. The analysis involves data preprocessing, exploratory data analysis (EDA), descriptive statistics, and inferential statistics to uncover relationships between various medical attributes and heart disease.
daflafel
Data analysis examples where I used R to conduct quantitative analysis and research methods in political science, international relations, and conflict analysis. I used concepts and tools of descriptive and inferential statistics to complete these exercises.
adetayoadekunle
Analysis of NHANES data using Python in Google Colab to perform inferential statistics, explore health metrics, and examine demographic relationships. Includes code and insights in a Google Colab notebook.
neo-potron
Applied inferential statistics projects in Python, covering hypothesis testing, confidence intervals, and real-world data analysis. The repository includes synthetic data experiments and a biostatistics case study on primary hypercholesterolemia, with clean, modular, and well-documented code.
Paula0923
In this project, I analyse various influences on maternal employment. The data was gathered from the OECD Family Database and Eurostat and covers the EU-27 countries. The analysis includes EDA, inferential statistics, and regression models.
neeraj123-kk
Statistical Analysis on Indian Diabetic Patients. Objectives: perform various inferential statistical tests to check whether BMI has a significant effect on diabetes, and examine other factors such as age, blood pressure, and skin thickness. Key skills: inferential statistics, Python, data visualization.
RanaivosonHerimanitra
A curated list of resources for statisticians, social scientists, biologists, and students involved in data analysis and interpretation tasks such as inferential statistics and bioinformatics, all with straightforward applications in the R and/or Python programming languages.
This analysis explores a large cyber security dataset using descriptive and inferential statistics. In addition, logistic regression is applied to the data to create a binary classifier as well as a multiclass classifier.
This project implements key inferential statistics concepts using Python, including hypothesis testing, confidence intervals, chi-square test, paired t-test, ANOVA, and probability (Bayes theorem). Each problem is solved with code and clear interpretation, showcasing practical data analysis skills.
stephenajwhatley
Two reproducible analyses of ELISA and Learning Index (LI) data. Three standalone R scripts are provided: two ELISA-based BDNF analyses and one LI Score (behavioural) analysis. Outputs include technical outlier removal, 4PL fit, concentration interpolation, biological outlier detection, summary and inferential statistics, and boxplot generation.
ioannislivieris
No description available
No description available
Bhagyashri2511
Data Types and Probability, Descriptive Statistics, Inferential Statistics, Data Distribution Analysis, Probability
Kanayochi
Data analysis projects using Excel, descriptive and inferential statistics
mdalisouayah
SAT & ACT Analysis: data importing and cleaning, exploratory data analysis, data visualization, descriptive and inferential statistics, research
Neman-River
A collection of statistics practice notebooks covering exploratory data analysis, inferential statistics, and anomaly detection.
mateoniksic
data manipulation, descriptive and inferential statistics, static analysis (CODE) [ PYTHON - PANDAS / NUMPY / SCIPY ]
ZacTanner
Exploratory Data Analysis, Inferential Statistics, and Data Visualizations for Residential Real Estate Listings across Canada
debasishdp552-oss
Explore raw datasets, perform exploratory data analysis, and apply inferential statistics to uncover insights and trends.
energyfirefox
Notes for my course "Intro to Data Analysis and Inferential Statistics with R" (Prometheus MOOC)
olliemerriden
Carrying out exploratory data analysis and descriptive and inferential statistics to extract information and make inferences to critique data.