Found 618 repositories (showing 30)
Aastha2104
Introduction Parkinson’s Disease is the second most prevalent neurodegenerative disorder after Alzheimer’s, affecting more than 10 million people worldwide. Parkinson’s is characterized primarily by the deterioration of motor and cognitive ability. There is no single test that can be administered for diagnosis. Instead, doctors must perform a careful clinical analysis of the patient’s medical history. Unfortunately, this method of diagnosis is highly inaccurate. A study from the National Institute of Neurological Disorders finds that early diagnosis (having symptoms for 5 years or less) is only 53% accurate. This is not much better than random guessing, yet an early diagnosis is critical to effective treatment. Because of these difficulties, I investigate a machine learning approach to accurately diagnose Parkinson’s, using a dataset of various speech features (a non-invasive yet characteristic tool) from the University of Oxford. Why speech features? Speech is very predictive and characteristic of Parkinson’s disease; almost every Parkinson’s patient experiences severe vocal degradation (inability to produce sustained phonations, tremor, hoarseness), so it makes sense to use voice to diagnose the disease. Voice analysis has the added benefit of being non-invasive, inexpensive, and very easy to extract clinically. Background Parkinson's Disease Parkinson’s is a progressive neurodegenerative condition resulting from the death of the dopamine-containing cells of the substantia nigra (which plays an important role in movement). Symptoms include “frozen” facial features, bradykinesia (slowness of movement), akinesia (impairment of voluntary movement), tremor, and voice impairment. Typically, by the time the disease is diagnosed, 60% of nigrostriatal neurons have degenerated, and 80% of striatal dopamine has been depleted. 
Performance Metrics TP = true positive, FP = false positive, TN = true negative, FN = false negative. Accuracy: (TP+TN)/(P+N), where P and N are the total numbers of actual positive and actual negative instances. Matthews Correlation Coefficient: 1=perfect, 0=random, -1=completely inaccurate. Algorithms Employed Logistic Regression (LR): Uses the sigmoid logistic equation with weights (coefficient values) and biases (constants) to model the probability of a certain class for binary classification. An output of 1 represents one class, and an output of 0 represents the other. Training the model learns the optimal weights and biases. Linear Discriminant Analysis (LDA): Assumes that the features are Gaussian within each class and that the classes share the same variance. LDA estimates the mean and variance for each class from the training data, and then uses properties of statistics (Bayes' theorem, Gaussian distribution, etc.) to compute the probability of a particular instance belonging to a given class. The class with the largest probability is the prediction. k Nearest Neighbors (KNN): Makes predictions about the validation set using the entire training set. KNN makes a prediction about a new instance by searching through the entire training set to find the k “closest” instances. “Closeness” is determined using a proximity measurement (Euclidean) across all features. The class that the majority of the k closest instances belong to is the class that the model predicts for the new instance. Decision Tree (DT): Represented by a binary tree, where each internal node represents an input variable and a split point, and each leaf node contains an output used to make a prediction. Neural Network (NN): Loosely models the way the human brain makes decisions. Each neuron takes in one or more inputs, and then uses an activation function to process the input with weights and biases to produce an output. Neurons can be arranged into layers, and multiple layers can form a network to model complex decisions. Training the network involves using the training instances to optimize the weights and biases. 
Naive Bayes (NB): Simplifies the calculation of probabilities by assuming that all features are independent of one another (a strong but effective assumption). Employs Bayes Theorem to calculate the probabilities that the instance to be predicted is in each class, then finds the class with the highest probability. Gradient Boost (GB): Generally used when seeking a model with very high predictive performance. Used to reduce bias and variance (“error”) by combining multiple “weak learners” (not very good models) to create a “strong learner” (high performance model). Involves 3 elements: a loss function (error function) to be optimized, a weak learner (decision tree) to make predictions, and an additive model to add trees to minimize the loss function. Gradient descent is used to minimize error after adding each tree (one by one). Engineering Goal Produce a machine learning model to diagnose Parkinson’s disease given various features of a patient’s speech with at least 90% accuracy and/or a Matthews Correlation Coefficient of at least 0.9. Compare various algorithms and parameters to determine the best model for predicting Parkinson’s. 
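The two metrics named in the engineering goal can be computed directly from the four confusion-matrix counts. The following is a minimal sketch, not the project's own code: the function names and the `threshold` parameter are hypothetical, and the MCC formula is the standard one, (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of instances classified correctly: (TP + TN) / (P + N)."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def meets_goal(tp, tn, fp, fn, threshold=0.9):
    """Hypothetical check of the stated goal: accuracy and/or MCC of at least 0.9."""
    return (accuracy(tp, tn, fp, fn) >= threshold
            or matthews_corrcoef(tp, tn, fp, fn) >= threshold)
```

MCC is worth reporting alongside accuracy here because the dataset is imbalanced (147 positive versus 48 negative instances): a model that always predicts Parkinson's would score about 75% accuracy but an MCC of 0.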
Dataset Description Source: the University of Oxford; 195 instances (147 subjects with Parkinson’s, 48 without Parkinson’s); 22 features (elements that are possibly characteristic of Parkinson’s, such as frequency, pitch, amplitude / period of the sound wave); 1 label (1 for Parkinson’s, 0 for no Parkinson’s). Project Pipeline [figure: pipeline] Summary of Procedure 1) Split the Oxford Parkinson’s Dataset into two parts: one for training, one for validation (to evaluate how well the model performs). 2) Train each of the following algorithms with the training set: Logistic Regression, Linear Discriminant Analysis, k Nearest Neighbors, Decision Tree, Neural Network, Naive Bayes, Gradient Boost. 3) Evaluate results using the validation set. 4) Repeat for the following training set to validation set splits: 80% training / 20% validation, 75% / 25%, and 70% / 30%. 5) Repeat for a rescaled version of the dataset (scale all the numbers in the dataset to a range from 0 to 1: this helps to reduce the effect of outliers). 6) Conduct 5 trials and average the results. Data [figures: a_o, a_r, m_o, m_r] Data Analysis In general, the models tended to perform the best (both in terms of accuracy and Matthews Correlation Coefficient) on the rescaled dataset with a 75-25 train-test split. The two highest performing algorithms, k Nearest Neighbors and the Neural Network, both achieved an accuracy of 98%. The NN achieved an MCC of 0.96, while KNN achieved an MCC of 0.94. These figures outperform most results in the existing literature and significantly outperform current methods of diagnosis. Conclusion and Significance These robust results suggest that a machine learning approach can indeed be implemented to significantly improve diagnosis methods of Parkinson’s disease. Given the necessity of early diagnosis for effective treatment, my machine learning models provide a very promising alternative to the current, rather ineffective method of diagnosis. 
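The rescaling step and the k Nearest Neighbors classifier described in this write-up can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the author's implementation: the function names and the four-row toy dataset are hypothetical, and the scaling bounds are fitted on the training rows only, which is a design choice made here rather than something the write-up specifies.

```python
import math
from collections import Counter


def fit_min_max(rows):
    """Learn per-feature (min, max) bounds from the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows, bounds):
    """Rescale each feature to the 0-1 range using the given bounds.

    Rows that were not used to fit the bounds can fall slightly outside 0-1.
    """
    return [
        [(value - lo) / (hi - lo) if hi > lo else 0.0
         for value, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training rows closest to query (Euclidean)."""
    nearest = sorted(
        (math.dist(row, query), label) for row, label in zip(train_x, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: one feature on a scale of hundreds, one of thousandths.
train_x = [[120.0, 0.002], [125.0, 0.003], [200.0, 0.010], [210.0, 0.012]]
train_y = [0, 0, 1, 1]
query = [130.0, 0.011]

raw_prediction = knn_predict(train_x, train_y, query, k=3)

bounds = fit_min_max(train_x)
scaled_prediction = knn_predict(
    apply_min_max(train_x, bounds), train_y, apply_min_max([query], bounds)[0], k=3
)
```

On this toy data the unscaled distance is dominated by the large-valued first feature, so the two predictions differ; after rescaling, both features contribute comparably to the Euclidean distance.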
Current methods of early diagnosis are only 53% accurate, while my machine learning model produces 98% accuracy. This 45-percentage-point improvement is critical because an accurate, early diagnosis is needed to effectively treat the disease. Typically, by the time the disease is diagnosed, 60% of nigrostriatal neurons have degenerated, and 80% of striatal dopamine has been depleted. With an earlier diagnosis, much of this degradation could have been slowed or treated. My results are very significant because Parkinson’s affects over 10 million people worldwide who could benefit greatly from an early, accurate diagnosis. Not only is my machine learning approach more accurate, it is also more scalable, less expensive, and therefore more accessible to people who might not have access to established medical facilities and professionals. The diagnosis is also much simpler, requiring only a 10-15 second voice recording and producing an immediate diagnosis. Future Research Given more time and resources, I would investigate the following: Create a mobile application which would allow the user to record his/her voice, extract the necessary vocal features, and feed them into my machine learning model to diagnose Parkinson’s. Use larger datasets in conjunction with the University of Oxford dataset. Tune and improve my models even further to achieve even better results. Investigate different structures and types of neural networks. Construct a novel algorithm specifically suited for the prediction of Parkinson’s. Generalize my findings and algorithms for all types of dementia disorders, such as Alzheimer’s. References Bind, Shubham. "A Survey of Machine Learning Based Approaches for Parkinson Disease Prediction." International Journal of Computer Science and Information Technologies 6 (2015): n. pag. International Journal of Computer Science and Information Technologies. 2015. Web. 8 Mar. 2017. Brooks, Megan. "Diagnosing Parkinson's Disease Still Challenging." 
Medscape Medical News. National Institute of Neurological Disorders, 31 July 2014. Web. 20 Mar. 2017. Little MA, McSharry PE, Roberts SJ, Costello DAE, Moroz IM. "Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection." BioMedical Engineering OnLine 2007, 6:23 (26 June 2007). Hashmi, Sumaiya F. "A Machine Learning Approach to Diagnosis of Parkinson’s Disease." Claremont Colleges Scholarship. Claremont College, 2013. Web. 10 Mar. 2017. Karplus, Abraham. "Machine Learning Algorithms for Cancer Diagnosis." Machine Learning Algorithms for Cancer Diagnosis (n.d.): n. pag. Mar. 2012. Web. 20 Mar. 2017. Little, Max. "Parkinsons Data Set." UCI Machine Learning Repository. University of Oxford, 26 June 2008. Web. 20 Feb. 2017. Ozcift, Akin, and Arif Gulten. "Classifier Ensemble Construction with Rotation Forest to Improve Medical Diagnosis Performance of Machine Learning Algorithms." Computer Methods and Programs in Biomedicine 104.3 (2011): 443-51. Semantic Scholar. 2011. Web. 15 Mar. 2017. "Parkinson’s Disease Dementia." UCI MIND. N.p., 19 Oct. 2015. Web. 17 Feb. 2017. Salvatore, C., A. Cerasa, I. Castiglioni, F. Gallivanone, A. Augimeri, M. Lopez, G. Arabia, M. Morelli, M.C. Gilardi, and A. Quattrone. "Machine Learning on Brain MRI Data for Differential Diagnosis of Parkinson's Disease and Progressive Supranuclear Palsy." Journal of Neuroscience Methods 222 (2014): 230-37. 2014. Web. 18 Mar. 2017. Shahbakhi, Mohammad, Danial Taheri Far, and Ehsan Tahami. "Speech Analysis for Diagnosis of Parkinson’s Disease Using Genetic Algorithm and Support Vector Machine." Journal of Biomedical Science and Engineering 07.04 (2014): 147-56. Scientific Research. July 2014. Web. 2 Mar. 2017. "Speech and Communication." Speech and Communication. Parkinson's Disease Foundation, n.d. Web. 22 Mar. 2017. Sriram, Tarigoppula V. S., M. Venkateswara Rao, G. V. Satya Narayana, and D. S. V. G. K. Kaladhar. 
"Diagnosis of Parkinson Disease Using Machine Learning and Data Mining Systems from Voice Dataset." SpringerLink. Springer, Cham, 01 Jan. 1970. Web. 17 Mar. 2017.
Divyesh-1306
Lung cancer prediction applies machine learning to classify cases as normal, benign, or malignant using patient data like demographics, symptoms, or imaging. Models like Logistic Regression or CNNs are trained on labeled data and evaluated with metrics like accuracy and AUC-ROC, enabling early detection and timely treatment.
mistersharmaa
Breast cancer has the second highest mortality rate in women, next to lung cancer. As per clinical statistics, 1 in every 8 women is diagnosed with breast cancer in her lifetime. However, periodic clinical check-ups and self-tests help in early detection and thereby significantly increase the chances of survival. Invasive detection techniques cause rupture of the tumor, accelerating the spread of cancer to adjoining areas. Hence, there is a need for a more robust, fast, accurate, and efficient non-invasive cancer detection system. Early detection can give patients more treatment options. In order to detect signs of cancer, breast tissue from biopsies is stained to enhance the nuclei and cytoplasm for microscopic examination. Then, pathologists evaluate the extent of any abnormal structural variation to determine whether there are tumors. Architectural Distortion (AD) is a very subtle contraction of the breast tissue and may represent the earliest sign of cancer. Since it is very likely to go unnoticed by radiologists, several approaches have been proposed over the years, but none using deep learning techniques. AI will become a transformational force in healthcare, and computer vision models will soon be able to achieve higher accuracy as researchers gain access to more medical imaging datasets. The application of machine learning models for prediction and prognosis of disease development has become an integral part of cancer studies aimed at improving the subsequent therapy and management of patients. The application of machine learning models for accurate prediction of survival time in breast cancer on the basis of clinical data is the main objective. We have developed a computer vision model to detect breast cancer in histopathological images. Two classes are used in this project: Benign and Malignant.
0xpranjal
Breast cancer detection using four machine learning models (Logistic Regression, KNN, SVM, and Decision Tree), each optimized for better accuracy.
Aayushi-2808
# Cervical_cancer_detection_using_ML # Introduction According to the World Health Organisation (WHO), when detected at an early stage, cervical cancer is one of the most curable cancers. Hence, the main motive behind this project is to detect the cancer in its early stages so that it can be treated and managed effectively. # Flow of project is as explained below: This project is divided into 5 parts: 1. Data Cleaning 2. Exploratory Data Analysis 3. Baseline model: Logistic Regression 4. Ensemble Models: Bagging with Decision Trees, Random Forest and Boosting 5. Model Comparison and results # Refer below for References: Link to basic information regarding cervical cancer : https://www.cdc.gov/cancer/cervical/basic_info/index.htm The dataset for tackling the problem is supplied by the UCI repository for Machine Learning. Link to Dataset : https://archive.ics.uci.edu/ml/datasets/Cervical+cancer+%28Risk+Factors%29 The dataset contains a list of risk factors that lead up to the Biopsy examination. The generation of the predictor variable is taken care of in part 2 (Exploratory Data Analysis) of this report. We will try to predict the 'biopsy' variable from the dataset using Logistic Regression, Random Forest, Bagging with Decision Trees and Boosting with XGBoost Classifier. # Results: Based on our base model and the ensemble models we used, we observed: 1. After the entire process of training, hyperparameter tuning and tackling class imbalance was complete, we obtained the results as depicted through the graphics. 2. Bagging and Random Forest give the highest accuracy and precision, 97.09% and 80% respectively. 3. Plotting the confusion matrix showed us that Random Forest using upsampling and class weights gives 2 false positives and 3 false negatives, with an AUC of 0.87.
# Why is random forest the best model? 1. Comparing all of our models, RF, along with Bagging, has the highest f1_score and accuracy: 76.2% and 97.09% respectively. 2. It also produces the same number of false negatives as all the other models, with a recall of 72.73%. 3. We still consider RF better because of its added advantage that its decision trees are decorrelated compared with bagging, leading to lower variance and a greater ability to generalize. # Conclusion: On observing the feature importance of the best model, i.e. random forest, we can see that the most important features are Schiller, Hinselmann, HPV, Citology, etc. This also makes sense because Schiller and Hinselmann are actually the tests used to detect cervical cancer. # Problems Faced: A major problem encountered while training the model was that there was too little data to train on. By collaborating with hospitals across India, we could gather enough data points to train a model with a higher recall, thus making the model better. # Scope of Improvement As next steps, I would want to do exactly that: deploy the model and refine it. We may also modify the number of predictor variables, as it may well turn out that there are other predictors which are not present in our current dataset. This can only be found by practical implementation of our predictions.
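The figures reported for the Random Forest model are consistent with a single confusion matrix, which the sketch below checks. Only FP = 2 and FN = 3 are stated in the text; TP = 8 and TN = 159 are inferred here from the reported 72.73% recall and 97.09% accuracy, so they are assumptions rather than values taken from the repository. The function name is hypothetical.

```python
def classification_report(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# FP and FN come from the text above; TP and TN are inferred (assumption).
report = classification_report(tp=8, fp=2, fn=3, tn=159)
for name, value in report.items():
    print(f"{name}: {value * 100:.2f}%")
```

With so few positive cases, each additional false negative moves recall by roughly nine percentage points while barely changing accuracy, which is why the write-up's focus on recall and F1 is more informative than accuracy alone.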
Develop a machine learning (ML) model for lung cancer detection using U-Net and DenseNet architectures. The target is an accuracy of at least 99.96% in lung nodule detection and classification; the achieved validation accuracy is 99.9%.
nano-bot01
Breast cancer detection using machine learning with deployment of model
This project develops a machine learning model to predict cancer risk levels (High, Medium, Low) based on demographic, behavioral, and health data. It addresses class imbalance using techniques like SMOTE and optimizes model performance with hyperparameter tuning, providing crucial insights for early detection and intervention.
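The core idea of SMOTE, as mentioned above, is to create synthetic minority-class rows by interpolating between a minority row and one of its nearest minority neighbours. The following is a minimal standard-library sketch of that idea only; the function name, parameters, and toy rows are hypothetical, and a real project would more likely rely on a library implementation such as the `SMOTE` class in imbalanced-learn.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic rows from a list of minority-class rows.

    Each synthetic row lies on the line segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    Requires at least two minority rows.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row, excluding itself.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(row, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class rows with two features each.
minority_rows = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [2.5, 2.5]]
new_rows = smote_sketch(minority_rows, n_new=6)
```

Oversampling of this kind is normally applied to the training split only, so that the validation data still reflects the original class balance.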
Abstract: This paper presents a machine learning (ML) method for detection and visual analysis of invasive ductal carcinoma (IDC) locations in whole slide images (WSI) of breast cancer. Machine learning is an artificial intelligence approach that uses computational and statistical methods to learn directly from a dataset, modeling the relationships in the data. Its way of interpreting features through layers of representation loosely resembles how the human brain works.
Praveenanand333
Empowering early cancer detection through advanced machine learning models. Our project focuses on predicting oral, cervical, and brain tumors using a blend of image and risk factor data. Join us in the journey to enhance healthcare outcomes through cutting-edge technology
This course dives into the basics of machine learning using an approachable and well-known programming language, Python. In this course, we will be reviewing two main components: First, you will learn about the purpose of machine learning and where it applies to the real world. Second, you will get a general overview of machine learning topics such as supervised vs unsupervised learning, model evaluation, and machine learning algorithms. In this course, you practice with real-life examples of machine learning and see how it affects society in ways you may not have guessed! By just putting in a few hours a week for the next few weeks, this is what you’ll get. 1) New skills to add to your resume, such as regression, classification, clustering, scikit-learn and SciPy 2) New projects that you can add to your portfolio, including cancer detection, predicting economic trends, predicting customer churn, recommendation engines, and many more. 3) And a certificate in machine learning to prove your competency, which you can share anywhere you like online or offline, such as LinkedIn profiles and social media. If you choose to take this course and earn the Coursera course certificate, you will also earn an IBM digital badge upon successful completion of the course.
lindselliott
Deltahacks 2019 - Early Skin Cancer Detection using Machine Learning Image Processing Models
1AyaNabil1
Explore our open-source repository focused on healthcare machine learning. We've developed predictive models for cardiovascular disease, diabetes, breast cancer, and more. Our projects employ diverse machine learning algorithms and data science techniques, enhancing early detection, diagnosis, and patient outcomes.
Breast cancer detection using machine learning classification is a project where you build a model to identify whether a given set of medical features indicates the presence of breast cancer. This project involves using a labeled dataset of medical records, where each record is classified as either indicating breast cancer or not.
Breast Cancer Detection Using Machine Learning Classifier Goal of this ML project: I extracted features from the cells of breast cancer patients and of healthy people, then created an ML model to classify tumors as malignant or benign. To complete this ML project, I used a supervised machine learning classification algorithm. Author: Mannai Mohamed Mortadha
rajas2716
Machine Learning model for breast cancer detection
This repository contains the model utilized for the study "Applying Explainable Machine Learning Models for Detection of Breast Cancer Lymph Node Metastasis in Patients Eligible for Neoadjuvant Treatment"
javadAlikhani-ML
Breast Cancer Detection using Deep Learning This project aims to build and evaluate machine learning models for predicting breast cancer diagnosis based on the Breast Cancer Wisconsin dataset. The dataset contains 569 samples with 30 features describing cell nuclei characteristics, and the task is to classify tumors as malignant or benign.
J-TECH-bot
A Machine Learning + Deep Learning powered web application for breast cancer detection based on medical data. This project uses trained models to classify whether a tumor is Malignant (cancerous) or Benign (non-cancerous).
coder-apr-5
Machine Learning Breast Cancer Classification involves developing predictive models to classify breast cancer as benign or malignant based on clinical data, such as tumor size and cell features. It uses algorithms like logistic regression, SVM, or neural networks, aiding early detection and improving patient outcomes.
MohammadMardi
Our project focuses on using machine learning classification algorithms to develop a breast cancer detection system. We gathered a diverse dataset and applied preprocessing techniques, feature selection, and various classification algorithms to train and evaluate our models.
The goal of the project is to utilize advanced image processing and machine learning models to accurately detect chromatin within cell images, ultimately enhancing cancer research and diagnosis. The project leverages techniques such as 3D Convolutional Neural Networks (CNNs), object detection models, synthetic data generation, and error prediction
This college project explores machine learning to aid in early ovarian cancer detection. We applied K-Nearest Neighbors (KNN), Random Forest (RF), and Decision Tree (DT) models, plus an RF-DT ensemble for enhanced accuracy in future prediction on unseen data.
SanketGore10
Breast Cancer Detection Machine Learning Model
Amit288
Breast Cancer Detection using Machine Learning Model
98.25% accurate Breast Cancer detection - This is an ensemble machine learning model utilising PyTorch and TensorFlow neural networks. A scikit-learn voting classifier was used to implement soft voting.
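Soft voting, as mentioned above, averages the class probabilities predicted by each model and picks the class with the highest average, rather than counting each model's hard vote. The sketch below illustrates the idea in plain Python; the function name and the probability values are hypothetical, and the repository itself is described as using scikit-learn's voting classifier (`VotingClassifier` with `voting="soft"`) rather than this code.

```python
def soft_vote(probabilities, weights=None):
    """Combine per-model class probabilities by (weighted) averaging.

    probabilities: one list of class probabilities per model, for one sample.
    Returns (index of the winning class, averaged probabilities).
    """
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: averaged[c])
    return best, averaged


# Hypothetical outputs of three models for one sample: [P(benign), P(malignant)].
model_outputs = [[0.45, 0.55], [0.90, 0.10], [0.40, 0.60]]
predicted_class, mean_probs = soft_vote(model_outputs)
```

In this example two of the three models lean towards class 1, so a hard majority vote would pick class 1, but the one confident model pulls the averaged probability towards class 0. That sensitivity to confidence is the main difference between soft and hard voting.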
Habiboys
API to implement a machine learning model for cancer detection on Google Cloud
darien-schettler
EDA and baseline machine learning model prediction for histopathologic cancer detection challenge via Kaggle
alihassanml
This project implements a Breast Cancer Detection system using Principal Component Analysis (PCA) for dimensionality reduction and a machine learning model for classification.
Skin cancer is one of the most common and fatal cancers in the world. It arises in the epidermis, the topmost layer of the skin, and can occur anywhere on the body. Early detection makes it possible to catch the cancer, but skin cancer detection is a time-consuming and critical process. In clinical applications, machine learning analysis of skin cancer has failed to give correct images for a model. In our paper we followed three pre-processing steps: a) illumination correction, which removes shadows from the image; b) segmentation, which finds the border of the skin lesion; and c) feature extraction using the ABCD framework. Our thesis attempts to implement a Convolutional Neural Network for classification. With this approach, we obtained the best result with Inception v3 trained on skin lesions, reaching an accuracy of 82.4%. The primary focus of this thesis is to differentiate between cancerous and non-cancerous images. Our further goal is to reduce reliance on biopsy, one of the painful procedures in cancer detection, in which tissue is removed from the body and sent for multiple laboratory tests.