Description of the Project:
+ The "Breast Cancer Dataset" is used in this project. It has df.shape=(569, 31), which means 569 rows and 31 columns.
+ The dataset used in this project is available at https://www.kaggle.com/uciml/breast-cancer-wisconsin-data
+ I import the main Python packages needed to complete the project: sklearn, pandas, numpy, seaborn and matplotlib.
+ The machine learning models used are Logistic Regression, Decision Tree, Random Forest, XGBoost, AdaBoost and Gradient Boosting classifier.
+ The performance of the machine learning models is tested on the basis of accuracy score, confusion matrix, classification report, F1 score and ROC AUC score.
+ I tuned hyperparameters to improve the performance of the XGBoost model.
+ Good visualization matters alongside the accuracy score in model building, so the performance of the models is visualized in this project.

Problem statement:
XGBoost stands for eXtreme Gradient Boosting, and it is known as the winning model in several Kaggle competitions. Much of the machine learning literature found online describes it as accurate, efficient and practical. It is a decision-tree-based ensemble ML algorithm built on the gradient boosting framework. XGBoost also provides a convenient way to run cross-validation, which is applied during the training phase to check whether the model is overfitting. If a model gives good accuracy on the training dataset but performs poorly on unseen test data, it is overfitting: a model with low bias and high variance.

I have to calculate the model's training and testing errors with different learning rates. The learning rate is usually chosen between 0 and 1, and I will start the test with a learning rate of 0.01. To make the results easy to read, I will also plot the training and testing errors and accuracies in a graph. Finally, I will tune the hyperparameters that help the model predict the testing dataset, i.e. x_test.
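
The learning-rate comparison described in the problem statement can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It makes several assumptions: scikit-learn's bundled copy of the Wisconsin diagnostic data (569 rows, 30 features) replaces the Kaggle CSV, scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the sketch needs only one library, and the helper name `error_by_learning_rate`, the 80/20 split, the list of rates and `n_estimators=100` are all illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: the copy of the dataset shipped with scikit-learn is used
# here instead of the Kaggle CSV referenced in the README.
X, y = load_breast_cancer(return_X_y=True)

# Assumption: an 80/20 stratified split; the README does not state the split.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)


def error_by_learning_rate(rates):
    """Return {rate: (train_error, test_error)} for each learning rate.

    Hypothetical helper: GradientBoostingClassifier is a stand-in for
    XGBoost, and n_estimators=100 is an arbitrary illustrative value.
    """
    results = {}
    for rate in rates:
        model = GradientBoostingClassifier(
            learning_rate=rate, n_estimators=100, random_state=42
        )
        model.fit(x_train, y_train)
        # Error is defined here as 1 - accuracy on each split.
        train_error = 1.0 - accuracy_score(y_train, model.predict(x_train))
        test_error = 1.0 - accuracy_score(y_test, model.predict(x_test))
        results[rate] = (train_error, test_error)
    return results


# Start at 0.01, as in the problem statement, and move up towards 1.
results = error_by_learning_rate([0.01, 0.1, 0.3, 0.5, 1.0])
for rate, (train_error, test_error) in results.items():
    print(f"learning_rate={rate:<5} train_error={train_error:.3f} "
          f"test_error={test_error:.3f}")
```

A training error that keeps falling while the testing error stays flat or rises is the overfitting pattern described above. The `results` dictionary can be passed to matplotlib to draw the two error curves against the learning rate.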