Rare Event Classification (Ensemble Modelling incorporating Random Under-sampling)

The data consists of 10,500 credit applications, each classified as good or bad credit. Only 500 of them are bad credit applications. Since this is less than 5% of the data, classifying applicants as bad credit is a rare event problem, also known as anomaly detection in many applications.

Approach: The best under-sampling ratio is found by trying ratios from 50:50 to 85:15. An ensemble model is then built on the optimum ratio selected. This is done by creating an ensemble of trees using the optimum ratio, fitting a model to each, making classification probability predictions with each, and then averaging those to get the predicted classification probabilities. From these we can calculate the loss totaled over all the trees.

The base model is a decision tree with a minimum leaf size of 5 and a minimum split size of 5. The optimum depth for this model is determined by optimizing the F1-score using 10-fold cross-validation.
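The approach described above can be sketched roughly as follows with scikit-learn. This is a minimal illustration under several assumptions, not the repository's actual code: the data is synthetic (generated with `make_classification`), each ratio is read as the majority:minority share after under-sampling, the ratio is chosen by F1 on a held-out split, the depth is tuned on a single under-sample, and the helper names (`undersample`, `best_depth`, `fit_ensemble`, `ensemble_proba`), the depth grid, and the number of trees are all hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_tree(depth, seed):
    # Base model from the description: minimum leaf size 5, minimum split size 5.
    return DecisionTreeClassifier(max_depth=depth, min_samples_leaf=5,
                                  min_samples_split=5, random_state=seed)


def undersample(X, y, majority_share, rng):
    """Keep every rare (label 1) row; draw majority rows without replacement
    so that the majority class makes up `majority_share` of the sample."""
    rare_idx = np.flatnonzero(y == 1)
    common_idx = np.flatnonzero(y == 0)
    n_common = int(round(len(rare_idx) * majority_share / (1.0 - majority_share)))
    n_common = min(n_common, len(common_idx))
    chosen = rng.choice(common_idx, size=n_common, replace=False)
    idx = np.concatenate([rare_idx, chosen])
    return X[idx], y[idx]


def best_depth(X, y, depths=range(2, 7), seed=0):
    """Return the depth with the highest mean F1 over 10-fold cross-validation."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    scores = {d: cross_val_score(make_tree(d, seed), X, y, cv=cv, scoring="f1").mean()
              for d in depths}
    return max(scores, key=scores.get)


def fit_ensemble(X, y, majority_share, depth, n_trees=10, seed=0):
    """Fit one tree per independent random under-sample."""
    rng = np.random.default_rng(seed)
    trees = []
    for i in range(n_trees):
        Xs, ys = undersample(X, y, majority_share, rng)
        trees.append(make_tree(depth, seed + i).fit(Xs, ys))
    return trees


def ensemble_proba(trees, X):
    """Average the rare-class probabilities predicted by all trees."""
    return np.mean([t.predict_proba(X)[:, 1] for t in trees], axis=0)


# Synthetic stand-in for the credit data: about 10,000 good (0) and 500 bad (1).
X, y = make_classification(n_samples=10500, n_features=10, n_informative=5,
                           weights=[10000 / 10500, 500 / 10500],
                           flip_y=0.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Try majority shares from 50:50 up to 85:15 (assumed to mean majority:minority).
f1_by_share = {}
for share in (0.50, 0.60, 0.70, 0.85):
    Xs, ys = undersample(X_train, y_train, share, np.random.default_rng(0))
    depth = best_depth(Xs, ys)  # depth tuned on one under-sample (assumption)
    trees = fit_ensemble(X_train, y_train, share, depth)
    pred = (ensemble_proba(trees, X_test) >= 0.5).astype(int)
    f1_by_share[share] = f1_score(y_test, pred)

best_share = max(f1_by_share, key=f1_by_share.get)
print("F1 by majority share:", f1_by_share)
print("Selected majority share:", best_share)
```

The scores printed here come from the synthetic data only and say nothing about the real credit data. The held-out split is one way to compare ratios; the description does not state which criterion the original analysis used.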