Found 98 repositories (showing 30)
1Amrit-Singh
Credit risk analysis and default prediction to support data-driven lending decisions.
EmmanuelLwele
Interview Coding Challenge: Data Science

Step 1 of the Data Scientist interview process. Follow the instructions below to complete this portion of the interview. Please note, although we do not set a time limit for this challenge, we recommend completing it as soon as possible, as we evaluate candidates on a first-come, first-served basis... If you have any questions, please feel free to email support@TheZig.io. We will do our best to clarify any issues you come across.

Prerequisites:
A Text Editor - We recommend Visual Studio Code for the client-side code; it's lightweight, powerful, and free! (https://code.visualstudio.com/)
SQL Server Management Studio (https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-2017)
R - Software environment for statistical computing and graphics. You can download R at the mirrors listed here (https://cran.r-project.org/mirrors.html)
Azure - Microsoft's cloud computing platform. You can create an account without a credit card by using the Azure Pass available at this link (https://azure.microsoft.com/en-us/offers/azure-pass/)
Git - For source control and committing your final solution to a new private repo (https://git-scm.com/downloads)
  a. If you're not very familiar with git commands, here's a helpful cheatsheet (https://services.github.com/on-demand/downloads/github-git-cheat-sheet.pdf)

'R' Challenge

For each numbered section below, write R code and comments to solve the problem or to show your rationale. For sections that ask you to give outputs, provide the outputs in separate files and name them with the section number and the word "Output" (for example, "Section 1 - Output"). Create a private repo and submit your modified R script along with any supporting files.

Load in the dataset from the accompanying file "account-defaults.csv". This dataset contains information about loan accounts that either went delinquent or stayed current on payments within the loan's first year. FirstYearDelinquency is the outcome variable; all others are predictors. The objective of modeling with this dataset is to predict the probability that new accounts will become delinquent; it is primarily valuable to distinguish lower-risk accounts from higher-risk accounts (and not just to predict 'yes' or 'no' for new accounts).

FirstYearDelinquency - indicates whether the loan went delinquent within the first year of the loan's life (values of 1)
AgeOldestIdentityRecord - number of months since the first record was reported by a national credit source
AgeOldestAccount - number of months since the oldest account was opened
AgeNewestAutoAccount - number of months since the most recent auto loan or lease account was opened
TotalInquiries - total number of credit inquiries on record
AvgAgeAutoAccounts - average number of months since auto loan or lease accounts were opened
TotalAutoAccountsNeverDelinquent - total number of auto loan or lease accounts that were never delinquent
WorstDelinquency - worst status of days-delinquent on an account in the first 12 months of an account's life; values of '400' indicate '400 or greater'
HasInquiryTelecomm - indicates whether one or more telecommunications credit inquiries are on record within the last 12 months (values of 1)

Perform an exploratory data analysis on the accounts data. In your analysis, include summary statistics and visualizations of the distributions and relationships.

Build one or more predictive model(s) on the accounts data using regression techniques. Identify the strongest predictor variables and provide interpretations. Identify and explain issues with the model(s) such as collinearity, etc. Calculate predictions and show model performance on out-of-sample data. Summarize out-of-sample data in tiers from highest-risk to lowest-risk.

Split up the dataset by the WorstDelinquency variable. For each subset, run a simple regression of FirstYearDelinquency ~ TotalInquiries. Extract the predictor's coefficient and p-value from each model. Store them in a list where the names of the list correspond to the values of WorstDelinquency.

Load in the dataset from the accompanying file "vehicle-depreciation.csv". The dataset contains information about vehicles that our company purchases at auction, sells to customers, repossesses from defaulted accounts, and finally re-sells at auction to recover some of our losses. Perform an analysis and/or build a predictive model that provides a method to estimate the depreciation of vehicle worth (from auction purchase to auction sale). Use whatever techniques you want to provide insight into the dataset and walk us through your results - this is your chance to show off your analytical and storytelling skills!

CustomerGrade - the credit risk grade of the customer
AuctionPurchaseDate - the date that the vehicle was purchased at auction
AuctionPurchaseAmount - the dollar amount spent purchasing the vehicle at auction
AuctionSaleDate - the date that the vehicle was sold at auction
AuctionSaleAmount - the dollar amount received for selling the vehicle at auction
VehicleType - the high-level class of the vehicle
Year - the year of the vehicle
Make - the make of the vehicle
Model - the model of the vehicle
Trim - the trim of the vehicle
BodyType - the body style of the vehicle
AuctionPurchaseOdometer - the odometer value of the vehicle at the time of purchase at the auction
AutomaticTransmission - indicates (with value of 1) whether the vehicle has an automatic transmission
DriveType - the drivetrain type of the vehicle
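
The accounts challenge above asks for out-of-sample data to be summarized in tiers from highest-risk to lowest-risk. The challenge itself calls for R; the following is only a rough Python sketch of one way that tiering step could look, assuming a list of predicted delinquency probabilities and a matching list of observed 0/1 outcomes. The function name `risk_tiers`, the equal-sized tiers, and the default of five tiers are all illustrative assumptions, not part of the challenge.

```python
from statistics import mean


def risk_tiers(scores, outcomes, n_tiers=5):
    """Group accounts into roughly equal-sized tiers, highest predicted risk first.

    scores   -- predicted probabilities of delinquency (hypothetical model output)
    outcomes -- observed 0/1 delinquency flags for the same accounts
    """
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")

    # Rank accounts from highest to lowest predicted risk.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)

    tiers = []
    for t in range(n_tiers):
        # Integer arithmetic keeps tier sizes within one account of each other.
        chunk = ranked[t * n // n_tiers:(t + 1) * n // n_tiers]
        if not chunk:
            continue  # fewer accounts than tiers
        tiers.append({
            "tier": t + 1,  # tier 1 is the highest-risk group
            "accounts": len(chunk),
            "mean_predicted": mean(s for s, _ in chunk),
            "observed_rate": mean(o for _, o in chunk),
        })
    return tiers


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    example_outcomes = [1, 1, 0, 1, 0, 0, 1, 0]
    for row in risk_tiers(example_scores, example_outcomes, n_tiers=4):
        print(row)
```

Comparing `mean_predicted` with `observed_rate` in each tier gives a quick check of whether the model separates lower-risk from higher-risk accounts, which is the stated goal of the exercise.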
credit risk using advanced machine learning techniques on the Home Credit Default Risk dataset. This study implements advanced feature engineering, handles missing data, and addresses class imbalances to improve model performance and reliability. Model explainability is achieved using SHAP (SHapley Additive exPlanations)
Credit defaults cause large losses for banks and other credit lenders. The success of the banking industry depends on the ability to understand risk. This project uses big data technologies such as MapReduce and HDFS, along with PySpark and AWS, to analyze credit history and predict defaults.
riyagoyal08010-glitch
Credit Card Default Risk Prediction is an end-to-end machine learning project that predicts the likelihood of credit card default using a Logistic Regression model. The project includes detailed exploratory data analysis, feature engineering, class imbalance handling, and model evaluation, followed by deployment as an interactive web application.
Loan default prediction is an important task in the banking industry. In the finance and banking sector, losses from loan defaults (customers not paying back their loans) are increasing drastically. In this study, we built a loan default prediction model on data collected for borrowers from multiple states in the United States of America. The research focuses on constructing a model that predicts whether a borrower will repay the loan or end up defaulting. It uses a Random Forest classifier, an AdaBoost classifier, and an artificial neural network model to compare the performance of these classifiers. It also works on understanding and identifying the important features that should be monitored carefully before sanctioning any such credit. Keywords: Credit risk analysis, Loan Default, Machine Learning, Random Forest classifier, AdaBoost classifier, Artificial Neural Network.
vedantbhatiaa
No description available
Built an end-to-end credit risk prediction pipeline using structured financial data. Performed domain-driven feature engineering, hypothesis validation, and model comparison (Logistic Regression, Decision Tree, Random Forest) using business-focused evaluation metrics.
RahulDhanasiri
Lending Club data
Finance and Risk Analytics Project: Predicting credit default risk using machine learning models (Logistic Regression, Random Forest) and assessing stock market risk through historical returns and volatility analysis to guide financial risk management and investment strategies.
thaisvsthinhs
Credit risk prediction pipeline using Kaggle’s "Give Me Some Credit" dataset, featuring WOE & IV analysis, LightGBM modeling, AUC-ROC evaluation, and SHAP-based interpretability for actionable default risk insights.
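
For readers unfamiliar with the WOE & IV analysis mentioned in this description, here is a minimal Python sketch of how Weight of Evidence and Information Value are commonly computed for one binned feature. It is not taken from the repository. It assumes the common convention WOE = ln(share of non-defaulters / share of defaulters) per bin (some sources use the inverse ratio), and the function name `woe_iv` and the toy bins are hypothetical.

```python
import math
from collections import defaultdict


def woe_iv(bins, defaults):
    """Compute per-bin Weight of Evidence and the feature's Information Value.

    bins     -- bin label for each applicant (the feature is assumed pre-binned)
    defaults -- 1 if the applicant defaulted, 0 otherwise
    """
    counts = defaultdict(lambda: [0, 0])  # bin label -> [non-defaulters, defaulters]
    for label, flag in zip(bins, defaults):
        counts[label][1 if flag else 0] += 1

    total_good = sum(good for good, _ in counts.values())
    total_bad = sum(bad for _, bad in counts.values())

    woe = {}
    iv = 0.0
    for label, (good, bad) in counts.items():
        if good == 0 or bad == 0:
            # WOE is undefined for an empty class; merging bins or smoothing
            # the counts are common workarounds, left out of this sketch.
            raise ValueError(f"bin {label!r} has no defaulters or no non-defaulters")
        dist_good = good / total_good  # share of all non-defaulters in this bin
        dist_bad = bad / total_bad     # share of all defaulters in this bin
        woe[label] = math.log(dist_good / dist_bad)
        iv += (dist_good - dist_bad) * woe[label]
    return woe, iv


if __name__ == "__main__":
    # Toy data, for illustration only.
    toy_bins = ["low"] * 4 + ["high"] * 4
    toy_defaults = [0, 0, 0, 1, 0, 1, 1, 1]
    print(woe_iv(toy_bins, toy_defaults))
```

Under this convention, a positive WOE marks a bin with relatively fewer defaulters, and a larger IV suggests a feature that separates the two classes more strongly.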
abh2050
This project consists of two components: a Jupyter notebook for credit risk analysis and a Streamlit app for real-time loan default prediction.
mohdareeb0x-commits
Machine learning–powered credit risk prediction API with FastAPI and REST endpoints. Supports single and batch applicant analysis, risk scoring, and default probability estimation. Fully Dockerized for easy deployment and integration into financial systems.
auroraeye-dev
Built an ML-based loan default prediction system using borrower financial and credit data. Performed data cleaning, feature engineering, and model evaluation, achieving 87% training and 80% test accuracy. Used confusion matrix analysis to support credit risk assessment and underwriting decisions.
Abdullah321Umar
🔴 Credit Risk Prediction 🔴 A machine-learning–based analysis designed to predict whether a loan applicant is likely to default. Using a refined Credit Risk Dataset, I cleaned, processed, and visualized key financial features such as income, loan amount, and credit history. Multiple classification models were trained with accuracy.
Muthuram3010
This project focuses on building an explainable AI model for credit risk analysis using machine learning and SHAP (SHapley Additive exPlanations). The model predicts the likelihood of credit default and provides clear, interpretable insights into the factors influencing each prediction.
Jossian
This repository contains data analysis and predictive modeling tools focused on credit card default payments in a banking context, using the UCI Machine Learning Repository dataset. The project leverages PySpark for large-scale data processing and model training, enabling scalable and efficient analysis of customer behavior and risk prediction.
Home Credit Default Risk Prediction is a machine learning project that aims to assess the creditworthiness of loan applicants and predict their ability to repay loans. By leveraging a dataset containing telco data and transactional data, this project utilizes exploratory data analysis, feature engineering, and various classification algorithms.
shikha-shetty-analyst
No description available
Mohaneesh1703
No description available
narangnaman29
No description available
AMITVPATIL9
No description available
nguyen010103
No description available
utkarshujwal
No description available
Devanshimaheshwari07
No description available
virendermachra0705-hash
No description available
No description available
No description available
The FRA project consists of two parts. Part A focuses on credit default prediction, aiming to assess a company's ability to meet its debt obligations. Part B involves market risk analysis, where the mean and standard deviation of stock returns are calculated to gain insights into stock performance and volatility.
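
The Part B calculation described above (mean and standard deviation of stock returns) can be illustrated with a short Python sketch. This is an assumption about the general technique, not the project's own code: it uses simple period-over-period returns and the sample standard deviation, and the names `simple_returns` and `return_summary` are hypothetical.

```python
from statistics import mean, stdev


def simple_returns(prices):
    """Period-over-period simple returns: (p_t - p_{t-1}) / p_{t-1}."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def return_summary(prices):
    """Mean return and volatility (sample standard deviation of returns)."""
    returns = simple_returns(prices)
    if len(returns) < 2:
        raise ValueError("need at least three prices to estimate volatility")
    return {"mean_return": mean(returns), "volatility": stdev(returns)}


if __name__ == "__main__":
    # Made-up closing prices, for illustration only.
    print(return_summary([100.0, 110.0, 99.0, 108.9]))
```

A higher mean return with lower volatility is generally read as a more favorable risk profile, which is the kind of comparison such an analysis supports.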
Project analyzes financial data from 1,000 customers to estimate credit risk and predict potential defaults. Using features like credit scores and transaction history, it aims to enhance decision-making in risk management.