Found 146 repositories (showing 30)
Analyzed CMSU student data, shingle moisture levels, and salary by education and occupation using inferential statistics. Applied probability, t-tests, and ANOVA to uncover insights on demographics, product quality, and salary trends. Key findings support data-driven decisions.
carolinaemanuele
A descriptive and inferential statistical analysis of a Kaggle dataset collected by an IoT smoke detection device. Machine learning techniques were also used to help build this smart device and increase its accuracy.
ndcastillo
Contains inferential statistical practices for machine learning models and analyses. Uses Python to develop statistical thinking for working with a limited sample of data and generating predictions about it. Topics: applying confidence intervals to estimate unknown values, using bootstrapping to simulate repeated data acquisition, developing hypotheses about models, and sampling populations to facilitate analysis.
bhneelima
Comprehension: The pharmaceutical company Sun Pharma is manufacturing a new batch of painkiller drugs, which are due for testing. Around 80,000 new products are created and need to be tested for their time of effect (measured as the time taken for the drug to completely cure the pain) and for quality assurance (whether the drug was able to do a satisfactory job or not).
Question 1: The quality assurance checks on the previous batches of drugs found that a drug is 4 times more likely to produce a satisfactory result than not. Given a small sample of 10 drugs, find the theoretical probability that at most 3 drugs are not able to do a satisfactory job. a) Propose the type of probability distribution that would accurately portray the above scenario, and list the three conditions that this distribution follows. b) Calculate the required probability.
Question 2: For the effectiveness test, a sample of 100 drugs was taken. The mean time of effect was 207 seconds, with a standard deviation of 65 seconds. Using this information, estimate the range in which the population mean might lie at a 95% confidence level. a) Discuss the main methodology you will use to approach this problem. State all the properties of the required method. Limit your answer to 150 words. b) Find the required range.
Question 3: a) The painkiller drug needs to have a time of effect of at most 200 seconds to be considered as having done a satisfactory job. Given the same sample data (size, mean, and standard deviation) as in the previous question, test the claim that the newer batch produces a satisfactory result and passes the quality assurance test. Use 2 hypothesis testing methods to make your decision. Take the significance level as 5%. Clearly specify the hypotheses, the calculated test statistics, and the final decision that should be made for each method. b) Two types of errors can occur during hypothesis testing, namely Type-I and Type-II errors, whose probabilities are denoted by α and β respectively. For the current sample conditions (sample size, mean, and standard deviation), the values of α and β come out to be 0.05 and 0.45 respectively. Now, a different sampling procedure (with a different sample size, mean, and standard deviation) is proposed so that when the same hypothesis test is conducted, the values of α and β are controlled at 0.15 each. Explain under what conditions either method would be preferred over the other, i.e. give an example of a situation where conducting a hypothesis test with α and β at 0.05 and 0.45 respectively would be preferred over having them both at 0.15. Similarly, give an example for the reverse scenario: a situation where conducting the hypothesis test with both α and β fixed at 0.15 would be preferred over having them at 0.05 and 0.45 respectively. Also, provide suitable reasons for your choice (assume that only the values of α and β mentioned above are provided to you and no other information is available).
Question 4: Once the batch has passed all the quality tests and is ready to be launched in the market, the marketing team needs to plan an effective online ad campaign to attract new customers. Two taglines were proposed for the campaign, and the team is currently divided on which option to use. Explain why and how A/B testing can be used to decide which option is more effective. Give a stepwise procedure for the test that needs to be conducted.
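The arithmetic behind Questions 1 to 3 above can be sketched in a few lines of Python. This is only an illustrative sketch, not the repository's own solution. It assumes a binomial model with P(unsatisfactory) = 1/5 for Question 1, a normal (z) approximation for the sample of 100 in Questions 2 and 3, and a one-sided test with H0: μ ≤ 200 against H1: μ > 200. All variable names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

# Question 1 (assumed binomial model): "4 times more likely satisfactory
# than not" implies P(unsatisfactory) = 1/5 for each of the 10 drugs.
n_drugs, p_fail = 10, 0.2
p_at_most_3 = sum(
    comb(n_drugs, k) * p_fail**k * (1 - p_fail) ** (n_drugs - k)
    for k in range(4)  # k = 0, 1, 2, 3 unsatisfactory drugs
)

# Question 2 (assumed z-interval, since n = 100 is large):
# sample mean +/- z * standard error at 95% confidence.
n, sample_mean, sample_sd = 100, 207.0, 65.0
std_err = sample_sd / sqrt(n)
z_two_sided = NormalDist().inv_cdf(0.975)
ci = (sample_mean - z_two_sided * std_err, sample_mean + z_two_sided * std_err)

# Question 3a (assumed hypotheses): H0: mu <= 200 vs H1: mu > 200, alpha = 0.05.
z_stat = (sample_mean - 200.0) / std_err
z_crit = NormalDist().inv_cdf(0.95)        # critical-value method
p_value = 1.0 - NormalDist().cdf(z_stat)   # p-value method
reject_h0 = z_stat > z_crit

print(f"Q1: P(at most 3 unsatisfactory) = {p_at_most_3:.4f}")
print(f"Q2: 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"Q3: z = {z_stat:.4f}, z_crit = {z_crit:.4f}, "
      f"p = {p_value:.4f}, reject H0 = {reject_h0}")
```

Under these assumptions, the critical-value and p-value methods agree: the test statistic stays below the critical value, so the null hypothesis is not rejected at the 5% level.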
Sengarofficial
Statistical inference is the process of using data analysis to infer properties of an underlying distribution of probability. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates.
Mgobeaalcoba
Explore the world of inferential statistics using Python. Learn hypothesis testing, confidence intervals, and statistical analysis techniques for data-driven decision-making and insights.
srahman16
This repo has scripts which can be handy for geospatial/data professionals to implement artificial intelligence (ANNs, LSTMs, RNNs, CNNs), machine learning (RF, SVM), and applied inferential statistical methods for spatial and big data analysis.
This specialization is designed to teach learners beginning and intermediate concepts of statistical analysis using the Python programming language. Learners will learn where data come from, what types of data can be collected, study data design, data management, and how to effectively carry out data exploration and visualization. They will be able to utilize data for estimation and assessing theories, construct confidence intervals, interpret inferential results, and apply more advanced statistical modeling procedures. Finally, they will learn the importance of and be able to connect research questions to the statistical and data analysis methods taught to them.
momin-butt
This course will introduce the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. The course will introduce data manipulation and cleaning techniques using the popular Python pandas data science library and introduce the abstraction of the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses. This course should be taken before any of the other Applied Data Science with Python courses: Applied Plotting, Charting & Data Representation in Python, Applied Machine Learning in Python, Applied Text Mining in Python, Applied Social Network Analysis in Python.
alfien5
Inferential and Bayesian statistical analysis of gym data.
HarmeetSinghB
A comprehensive analysis of Amazon consumer behaviour using descriptive and inferential statistical methods to enhance customer experience, increase sales, and improve customer retention.
santoshbnalawade
Mini project demonstrating descriptive and inferential statistical analysis using Python. Includes data preprocessing, visualization, and hypothesis testing, implemented in a Google Colab Notebook.
tanvirrezac
Statistical and regression analysis project applying inferential methods to real data as part of the Master of Data Science & Analytics (MDSA) program.
neeraj123-kk
Statistical Analysis on Indian Diabetic Patients. Objectives: performing various inferential statistical tests to check whether BMI has a significant effect on diabetes, and examining other factors such as Age, Blood Pressure, and Skin Thickness. Key Skills: Inferential Statistics, Python, Data Visualization
R-based statistical and machine learning analysis of stroke risk factors in female patients. Combines inferential testing and classification models to examine how demographic and health variables relate to stroke occurrence in a healthcare analytics context.
Performed statistical analysis and hypothesis testing on four factors that drive home sale prices. Conducted t-tests in Excel, using inferential statistics to find the mean differences between groups and to accept or reject the null hypotheses.
In this repository, to test the previous research findings, I conducted 4 different hypothesis tests in R. For each hypothesis test, I checked the assumptions, defined the null and alternative hypotheses, plotted hypothesis test graphs, and performed inferential statistical analysis.
chinmai-budati
A Python-based EDA and inferential analysis of 50,000 global oncology records to uncover clinical risk factors, early-detection gaps, and healthcare disparities. Features statistical modeling and data visualizations to evaluate how genetics, lifestyle, and environmental exposure impact cancer severity and economic burden.
agadavictoria16-lang
A comprehensive analysis of school attendance in Nigeria from 2019 to 2022, investigating how COVID-19, internet availability, and funding influenced attendance rates. The project applies descriptive and inferential statistical methods and presents findings with interactive Excel dashboards to support data-driven decision-making.
mohamedawnallah
This specialization is designed to teach learners beginning and intermediate concepts of statistical analysis using the Python programming language. Learners will learn where data come from, what types of data can be collected, study data design, data management, and how to effectively carry out data exploration and visualization. They will be able to utilize data for estimation and assessing theories, construct confidence intervals, interpret inferential results, and apply more advanced statistical modeling procedures. Finally, they will learn the importance of and be able to connect research questions to the statistical and data analysis methods taught to them.
This course introduces learners to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. The course introduces data manipulation and cleaning techniques using the popular Python pandas data science library and introduces the abstraction of the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as group-by, merge, and pivot tables effectively. By the end, learners can take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.
This week, you will continue working on your capstone project. Please remember that by the end of this week, you will need to submit the following: A full report consisting of all of the following components (15 marks): Introduction, where you discuss the business problem and who would be interested in this project. Data, where you describe the data that will be used to solve the problem and the source of the data. Methodology, the main component of the report, where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, and which machine learning methods were used and why. Results, where you discuss the results. Discussion, where you discuss any observations you noted and any recommendations you can make based on the results. Conclusion, where you conclude the report.
Sean-Toroghi
Statistical and inferential analysis
salvaggio-mary
Descriptive and inferential statistical analysis in Python
tessyXavier
EDA and Inferential Statistical analysis on OTT Platform
SakshiSVBorkar
Focuses on advanced statistical methods and inferential analysis techniques.
ninja-shankar
Exploratory Data Analysis and Inferential Statistical Analysis of Yulu's data
Hypothesis testing on datasets
hlfernandez
Descriptive and Inferential Statistical Analysis for one Sample (R and Shiny)
No description available