Credit card companies make money by collecting interest on loans. When a customer is unable to pay, the loan defaults. Defaults cause credit card companies to lose money, either by writing the balance off completely or by selling the balance to a collection agency for pennies on the dollar. This project will investigate whether customer account information can be used to build machine learning models that predict whether a customer will default on their next month's payment. This is a supervised binary classification problem: the data set has two labels (1 = yes, will default next month; 0 = no, will not default next month). I will use Python to evaluate the performance of six machine learning models and determine how accurately the data set can predict whether a customer will default on their account in the next month. The six models are Logistic Regression, KNN classifier, Bagging classifier, AdaBoost classifier, XGBoost classifier, and Random Forest classifier. The hyperparameters will be optimized, and model performance will be evaluated using a confusion matrix, a classification report (precision, recall, and F1), and the area under the receiver operating characteristic curve (AUC-ROC).
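To make the evaluation metrics concrete, here is a minimal from-scratch sketch of how the confusion matrix counts, precision, recall, F1, and AUC-ROC relate to one another. It is not the project's code: the function names (`confusion_counts`, `precision_recall_f1`, `roc_auc`), the 0.5 decision threshold, and the toy labels and scores are all made up for illustration. In practice, scikit-learn's `confusion_matrix`, `classification_report`, and `roc_auc_score` would typically be used instead.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN, treating label 1 (will default) as the positive class."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    return counts


def precision_recall_f1(counts: Dict[str, int]) -> Tuple[float, float, float]:
    """Derive precision, recall, and F1 for the positive class from the counts."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC-ROC as the chance that a random defaulter is scored above a random
    non-defaulter, with ties counted as half a win."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 6 non-defaulters (0) and 4 defaulters (1),
# with made-up predicted default probabilities.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.35, 0.5, 0.8, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

counts = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(counts)
auc = roc_auc(y_true, y_score)

print(counts)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} auc={auc:.3f}")
```

Because defaults are usually the minority class, precision and recall on label 1 tend to be more informative than raw accuracy, and AUC-ROC has the advantage of not depending on the chosen threshold.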