Found 313 repositories (showing 30)
rohanmistry231
A Python-based project for analyzing customer churn using data visualization and machine learning models to predict churn probability. Employs libraries like Pandas, Scikit-learn, and Matplotlib for data preprocessing, model training, and insightful visualizations.
DataAstronomy
For the last few years Betfair has been losing too many customers and is looking for a way to retain them. The aim of this project is to build a customer churn model that predicts which customers are about to churn, so that Betfair can implement different business strategies to retain those customers before they actually leave. The tools I am using for this analysis are RStudio and Tableau. The mlr package was chosen as the modeling package. The data used for prediction was provided by Betfair. After proper data exploration and visualization, the important features for the customer churn prediction model were identified. Eight different classification models were applied to the data in separate steps: configuring the learner task, making the learner, training the learner, prediction, and performance evaluation. Out of the eight models, Random Forest was chosen as the best. Cross-validation of the random forest gave a mean misclassification error rate of 0.1278126. Hyperparameter tuning of the random forest model was performed using the mlrHyperopt package. There was only a 0.05% improvement in model accuracy after hyperparameter tuning. The model obtained is good enough to predict which customers are about to fall into the churned category. Applying this model to real-time data at Betfair could save a large amount of revenue.
After Covid-19, with the help of advances in technology, online shopping has become a part of daily life, and it is expected to keep growing all around the world. Accordingly, customer behavior is becoming more and more complex over time. With increasing competition in the market, retailers try their best to hold on to their customers, because attracting new customers costs several times more than retaining existing ones. For this purpose, retailers analyze their customers' purchases so that they can provide better service and maximize their profit margins. In this work, EDA of e-retail data has been performed, using RFM analysis to identify the categorical segmentation of customers and time series analysis with an ARIMA model to identify trends; clustering and classification models are implemented to identify the customers who are likely to churn. Furthermore, the top factors that influence user retention are analyzed.
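The RFM analysis mentioned in the description above can be illustrated with a minimal sketch. It is not taken from the repository: the transaction tuples, the customer ids and the snapshot date are hypothetical, and only the Python standard library is used. For each customer it computes recency (days since the last purchase), frequency (number of purchases) and monetary value (total spend).

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2021, 1, 5), 40.0),
    ("c1", date(2021, 3, 20), 60.0),
    ("c2", date(2020, 11, 2), 15.0),
    ("c3", date(2021, 3, 28), 200.0),
    ("c3", date(2021, 2, 14), 120.0),
    ("c3", date(2021, 1, 30), 80.0),
]
snapshot = date(2021, 4, 1)  # assumed analysis date


def rfm_table(rows, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    table = {}
    for customer, day, amount in rows:
        last, count, total = table.get(customer, (None, 0, 0.0))
        # Keep the most recent purchase date seen so far.
        last = day if last is None or day > last else last
        table[customer] = (last, count + 1, total + amount)
    return {
        customer: ((today - last).days, count, total)
        for customer, (last, count, total) in table.items()
    }


rfm = rfm_table(transactions, snapshot)
for customer, (recency, frequency, monetary) in sorted(rfm.items()):
    print(customer, recency, frequency, monetary)
```

A customer with high recency and low frequency, such as c2 here, is the kind of profile a churn model would typically flag; segment thresholds would have to be chosen from the real data.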
This project deals with the classification of bank customers according to whether a customer will leave the bank (i.e. churn) or not, by applying the following steps of a Data Science Project Life-Cycle: 1. Data Exploration, Analysis and Visualisations 2. Data Pre-processing 3. Data Preparation for the Modelling 4. Model Training 5. Model Validation 6. Optimized Model Selection based on Various Performance Metrics 7. Deploying the Best Optimized Model on Unseen Test Data 8. Evaluating the Optimized Model’s Performance Metrics. The business case of determining the churn status of bank customers is explored, trained and validated on the 7 classification algorithms/models listed below, and the best optimized model is selected based on the accuracy metrics: 1. Decision Tree Classifier - CART (Classification and Regression Tree) Algorithm 2. Decision Tree Classifier - ID3 (Iterative Dichotomiser 3) Algorithm 3. Ensemble Random Forest Classifier Algorithm 4. Ensemble Adaptive Boosting Classifier Algorithm 5. Ensemble Hist Gradient Boosting Classifier Algorithm 6. Ensemble Extreme Gradient Boosting (XGBoost) Classifier Algorithm 7. Support Vector Machine (SVM) Classifier Algorithm
Mahanamana
This project presents the exploratory analysis and the creation of a machine learning model to solve the problem of churn classification and customer retention in the field of telecommunications.
muqadasejaz
A machine learning project that predicts customer churn using classification algorithms such as KNN, SVC, Logistic Regression, Decision Tree, and Random Forest. Includes data analysis, preprocessing, visualization, model comparison, and a CLI prediction interface with saved models.
ikigai-aa
Sentiment analysis is the interpretation and classification of emotions (positive, negative and neutral) within text data using text analysis techniques. Sentiment analysis allows businesses to identify customer sentiment toward products, brands or services in online conversations and feedback. Sentiment analysis is a text analysis method that detects polarity (e.g. a positive or negative opinion) within text, whether a whole document, paragraph, sentence, or clause. Why Perform Sentiment Analysis? It’s estimated that 80% of the world’s data is unstructured, in other words it’s unorganized. Huge volumes of text data (emails, support tickets, chats, social media conversations, surveys, articles, documents, etc), is created every day but it’s hard to analyze, understand, and sort through, not to mention time-consuming and expensive. Sentiment analysis, however, helps businesses make sense of all this unstructured text by automatically tagging it. Benefits of sentiment analysis include: Sorting Data at Scale Can you imagine manually sorting through thousands of tweets, customer support conversations, or surveys? There’s just too much data to process manually. Sentiment analysis helps businesses process huge amounts of data in an efficient and cost-effective way. Real-Time Analysis Sentiment analysis can identify critical issues in real-time, for example is a PR crisis on social media escalating? Is an angry customer about to churn? Sentiment analysis models can help you immediately identify these kinds of situations and gauge brand sentiment, so you can take action right away. Consistent criteria It’s estimated that people only agree around 60-65% of the time when determining the sentiment of a particular text. Tagging text by sentiment is highly subjective, influenced by personal experiences, thoughts, and beliefs. 
By using a centralized sentiment analysis system, companies can apply the same criteria to all of their data, helping them improve accuracy and gain better insights.
keerthikonari
Machine Learning project to predict telecom customer churn using classification models and data analysis techniques
Lalitha-radhakrishnan
This project involves building an ANN-based churn model that can determine whether certain bank customers will continue using their service or not. The ANN model analyzes the relationship between customer churn & multiple independent variables affecting churn. Recommendations for improvements in service were suggested based on the results of the analysis. Skills and Tools Neural Networks, Classification, Keras, Tensorflow
Anindo21
A machine learning project to predict customer churn in the telecom sector using classification algorithms. Includes exploratory data analysis, feature engineering, model evaluation, and visualizations using Python (scikit-learn, pandas, seaborn, etc.).
ShrutiM1234
Business problem overview In the telecom industry, customers are able to choose from multiple service providers and actively switch from one operator to another. In this highly competitive market, the telecommunications industry experiences an average of 15-25% annual churn rate. Given the fact that it costs 5-10 times more to acquire a new customer than to retain an existing one, customer retention has now become even more important than customer acquisition. For many incumbent operators, retaining highly profitable customers is the number one business goal. To reduce customer churn, telecom companies need to predict which customers are at high risk of churn. In this project, you will analyse customer-level data of a leading telecom firm, build predictive models to identify customers at high risk of churn and identify the main indicators of churn. Understanding and defining churn There are two main models of payment in the telecom industry - postpaid (customers pay a monthly/annual bill after using the services) and prepaid (customers pay/recharge with a certain amount in advance and then use the services). In the postpaid model, when customers want to switch to another operator, they usually inform the existing operator to terminate the services, and you directly know that this is an instance of churn. However, in the prepaid model, customers who want to switch to another network can simply stop using the services without any notice, and it is hard to know whether someone has actually churned or is simply not using the services temporarily (e.g. someone may be on a trip abroad for a month or two and then intend to resume using the services again). Thus, churn prediction is usually more critical (and non-trivial) for prepaid customers, and the term ‘churn’ should be defined carefully. Also, prepaid is the most common model in India and Southeast Asia, while postpaid is more common in Europe and North America. 
This project is based on the Indian and Southeast Asian market. Definitions of churn There are various ways to define churn, such as: Revenue-based churn: Customers who have not utilised any revenue-generating facilities such as mobile internet, outgoing calls, SMS etc. over a given period of time. One could also use aggregate metrics such as ‘customers who have generated less than INR 4 per month in total/average/median revenue’. The main shortcoming of this definition is that there are customers who only receive calls/SMSes from their wage-earning counterparts, i.e. they don’t generate revenue but use the services. For example, many users in rural areas only receive calls from their wage-earning siblings in urban areas. Usage-based churn: Customers who have not done any usage, either incoming or outgoing - in terms of calls, internet etc. over a period of time. A potential shortcoming of this definition is that when the customer has stopped using the services for a while, it may be too late to take any corrective actions to retain them. For e.g., if you define churn based on a ‘two-months zero usage’ period, predicting churn could be useless since by that time the customer would have already switched to another operator. In this project, you will use the usage-based definition to define churn. High-value churn In the Indian and the Southeast Asian market, approximately 80% of revenue comes from the top 20% customers (called high-value customers). Thus, if we can reduce churn of the high-value customers, we will be able to reduce significant revenue leakage. In this project, you will define high-value customers based on a certain metric (mentioned later below) and predict churn only on high-value customers. Understanding the business objective and the data The dataset contains customer-level information for a span of four consecutive months - June, July, August and September. The months are encoded as 6, 7, 8 and 9, respectively. 
The business objective is to predict the churn in the last (i.e. the ninth) month using the data (features) from the first three months. To do this task well, understanding the typical customer behaviour during churn will be helpful. Understanding customer behaviour during churn Customers usually do not decide to switch to another competitor instantly, but rather over a period of time (this is especially applicable to high-value customers). In churn prediction, we assume that there are three phases of customer lifecycle : The ‘good’ phase: In this phase, the customer is happy with the service and behaves as usual. The ‘action’ phase: The customer experience starts to sour in this phase, e.g. he/she gets a compelling offer from a competitor, faces unjust charges, becomes unhappy with service quality etc. In this phase, the customer usually shows different behaviour than the ‘good’ months. Also, it is crucial to identify high-churn-risk customers in this phase, since some corrective actions can be taken at this point (such as matching the competitor’s offer/improving the service quality etc.) The ‘churn’ phase: In this phase, the customer is said to have churned. You define churn based on this phase. Also, it is important to note that at the time of prediction (i.e. the action months), this data is not available to you for prediction. Thus, after tagging churn as 1/0 based on this phase, you discard all data corresponding to this phase. In this case, since you are working over a four-month window, the first two months are the ‘good’ phase, the third month is the ‘action’ phase, while the fourth month is the ‘churn’ phase. Data dictionary The data dictionary contains meanings of abbreviations. 
Some frequent ones are loc (local), IC (incoming), OG (outgoing), T2T (telecom operator to telecom operator), T2O (telecom operator to another operator), RECH (recharge) etc. The attributes containing 6, 7, 8, 9 as suffixes imply that those correspond to the months 6, 7, 8, 9 respectively. Data Preparation The following data preparation steps are crucial for this problem: 1. Derive new features This is one of the most important parts of data preparation since good features are often the differentiators between good and bad models. Use your business understanding to derive features you think could be important indicators of churn. 2. Filter high-value customers As mentioned above, you need to predict churn only for the high-value customers. Define high-value customers as follows: Those who have recharged with an amount more than or equal to X, where X is the 70th percentile of the average recharge amount in the first two months (the good phase). After filtering the high-value customers, you should get about 29.9k rows. 3. Tag churners and remove attributes of the churn phase Now tag the churned customers (churn=1, else 0) based on the fourth month as follows: Those who have not made any calls (either incoming or outgoing) AND have not used mobile internet even once in the churn phase. The attributes you need to use to tag churners are: total_ic_mou_9 total_og_mou_9 vol_2g_mb_9 vol_3g_mb_9 After tagging churners, remove all the attributes corresponding to the churn phase (all attributes having ‘ _9’, etc. in their names). Modelling Build models to predict churn. The predictive model that you’re going to build will serve two purposes: It will be used to predict whether a high-value customer will churn or not, in near future (i.e. churn phase). By knowing this, the company can take action steps such as providing special plans, discounts on recharge etc. It will be used to identify important variables that are strong predictors of churn. 
These variables may also indicate why customers choose to switch to other networks. In some cases, both of the above-stated goals can be achieved by a single machine learning model. But here, you have a large number of attributes, and thus you should try using a dimensionality reduction technique such as PCA and then build a predictive model. After PCA, you can use any classification model. Also, since the rate of churn is typically low (about 5-10%, this is called class-imbalance) - try using techniques to handle class imbalance. You can take the following suggestive steps to build the model: Preprocess data (convert columns to appropriate formats, handle missing values, etc.) Conduct appropriate exploratory analysis to extract useful insights (whether directly useful for business or for eventual modelling/feature engineering). Derive new features. Reduce the number of variables using PCA. Train a variety of models, tune model hyperparameters, etc. (handle class imbalance using appropriate techniques). Evaluate the models using appropriate evaluation metrics. Note that it is more important to identify churners than the non-churners accurately - choose an appropriate evaluation metric which reflects this business goal. Finally, choose a model based on some evaluation metric. The above model will only be able to achieve one of the two goals - to predict customers who will churn. You can’t use the above model to identify the important features for churn. That’s because PCA usually creates components which are not easy to interpret. Therefore, build another model with the main objective of identifying important predictor attributes which help the business understand indicators of churn. A good choice to identify important variables is a logistic regression model or a model from the tree family. In case of logistic regression, make sure to handle multi-collinearity. After identifying important predictors, display them visually - you can use plots, summary tables etc. 
- whatever you think best conveys the importance of features. Finally, recommend strategies to manage customer churn based on your observations.
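The two data preparation steps in the brief above (filter high-value customers at the 70th percentile of average good-phase recharge, then tag churners from the four month-9 usage attributes and drop every `_9` attribute) can be sketched as follows. This is an illustration under assumptions, not the course solution: the recharge column names `total_rech_amt_6` and `total_rech_amt_7`, the `customer` helper and the five sample rows are hypothetical, and a plain-Python percentile with linear interpolation stands in for a dataframe library.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


# Attributes named in the brief for tagging churn in month 9.
CHURN_COLS = ["total_ic_mou_9", "total_og_mou_9", "vol_2g_mb_9", "vol_3g_mb_9"]


def customer(cid, r6, r7, ic9, og9, g2, g3):
    """Build one hypothetical customer row (recharge column names are assumed)."""
    return {"id": cid, "total_rech_amt_6": r6, "total_rech_amt_7": r7,
            "total_ic_mou_9": ic9, "total_og_mou_9": og9,
            "vol_2g_mb_9": g2, "vol_3g_mb_9": g3}


def prepare(rows):
    # Average recharge over the good phase (months 6 and 7).
    averages = [(r["total_rech_amt_6"] + r["total_rech_amt_7"]) / 2 for r in rows]
    cutoff = percentile(averages, 70)
    prepared = []
    for row, avg in zip(rows, averages):
        if avg < cutoff:
            continue  # not a high-value customer
        # Churn = no calls (incoming or outgoing) and no mobile data in month 9.
        churn = int(all(row[col] == 0 for col in CHURN_COLS))
        # Drop every churn-phase attribute after tagging.
        kept = {k: v for k, v in row.items() if not k.endswith("_9")}
        kept["churn"] = churn
        prepared.append(kept)
    return prepared


customers = [
    customer("a", 100, 100, 5.0, 2.0, 0.0, 0.0),
    customer("b", 150, 250, 0.0, 0.0, 0.0, 0.0),
    customer("c", 300, 300, 9.0, 1.0, 3.0, 0.0),
    customer("d", 350, 450, 0.0, 0.0, 0.0, 0.0),
    customer("e", 500, 500, 0.0, 7.5, 0.0, 12.0),
]
high_value = prepare(customers)
for row in high_value:
    print(row["id"], row["churn"])
```

Tagging happens before the `_9` attributes are dropped, so the label is built from the churn phase while the features come only from the earlier months, as the brief requires.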
hebbarvn
In the telecom industry, customers are able to choose from multiple service providers and actively switch from one operator to another. In this highly competitive market, the telecommunications industry experiences an average of 15-25% annual churn rate. Given the fact that it costs 5-10 times more to acquire a new customer than to retain an existing one, customer retention has now become even more important than customer acquisition. For many incumbent operators, retaining highly profitable customers is the number one business goal. To reduce customer churn, telecom companies need to predict which customers are at high risk of churn. In this project, you will analyse customer-level data of a leading telecom firm, build predictive models to identify customers at high risk of churn and identify the main indicators of churn. Understanding and Defining Churn There are two main models of payment in the telecom industry - postpaid (customers pay a monthly/annual bill after using the services) and prepaid (customers pay/recharge with a certain amount in advance and then use the services). In the postpaid model, when customers want to switch to another operator, they usually inform the existing operator to terminate the services, and you directly know that this is an instance of churn. However, in the prepaid model, customers who want to switch to another network can simply stop using the services without any notice, and it is hard to know whether someone has actually churned or is simply not using the services temporarily (e.g. someone may be on a trip abroad for a month or two and then intend to resume using the services again). Thus, churn prediction is usually more critical (and non-trivial) for prepaid customers, and the term ‘churn’ should be defined carefully. Also, prepaid is the most common model in India and Southeast Asia, while postpaid is more common in Europe and North America. This project is based on the Indian and Southeast Asian market. 
Definitions of Churn There are various ways to define churn, such as: Revenue-based churn: Customers who have not utilised any revenue-generating facilities such as mobile internet, outgoing calls, SMS etc. over a given period of time. One could also use aggregate metrics such as ‘customers who have generated less than INR 4 per month in total/average/median revenue’. The main shortcoming of this definition is that there are customers who only receive calls/SMSes from their wage-earning counterparts, i.e. they don’t generate revenue but use the services. For example, many users in rural areas only receive calls from their wage-earning siblings in urban areas. Usage-based churn: Customers who have not done any usage, either incoming or outgoing - in terms of calls, internet etc. over a period of time. A potential shortcoming of this definition is that when the customer has stopped using the services for a while, it may be too late to take any corrective actions to retain them. For e.g., if you define churn based on a ‘two-months zero usage’ period, predicting churn could be useless since by that time the customer would have already switched to another operator. In this project, you will use the usage-based definition to define churn. High-value Churn In the Indian and the southeast Asian market, approximately 80% of revenue comes from the top 20% customers (called high-value customers). Thus, if we can reduce churn of the high-value customers, we will be able to reduce significant revenue leakage. In this project, you will define high-value customers based on a certain metric (mentioned later below) and predict churn only on high-value customers. Understanding the Business Objective and the Data The dataset contains customer-level information for a span of four consecutive months - June, July, August and September. The months are encoded as 6, 7, 8 and 9, respectively. The business objective is to predict the churn in the last (i.e. 
the ninth) month using the data (features) from the first three months. To do this task well, understanding the typical customer behaviour during churn will be helpful. Understanding Customer Behaviour During Churn Customers usually do not decide to switch to another competitor instantly, but rather over a period of time (this is especially applicable to high-value customers). In churn prediction, we assume that there are three phases of customer lifecycle : The ‘good’ phase: In this phase, the customer is happy with the service and behaves as usual. The ‘action’ phase: The customer experience starts to sour in this phase, e.g. he/she gets a compelling offer from a competitor, faces unjust charges, becomes unhappy with service quality etc. In this phase, the customer usually shows different behaviour than the ‘good’ months. Also, it is crucial to identify high-churn-risk customers in this phase, since some corrective actions can be taken at this point (such as matching the competitor’s offer/improving the service quality etc.) The ‘churn’ phase: In this phase, the customer is said to have churned. You define churn based on this phase. Also, it is important to note that at the time of prediction (i.e. the action months), this data is not available to you for prediction. Thus, after tagging churn as 1/0 based on this phase, you discard all data corresponding to this phase. In this case, since you are working over a four-month window, the first two months are the ‘good’ phase, the third month is the ‘action’ phase, while the fourth month is the ‘churn’ phase. The data dictionary contains meanings of abbreviations. Some frequent ones are loc (local), IC (incoming), OG (outgoing), T2T (telecom operator to telecom operator), T2O (telecom operator to another operator), RECH (recharge) etc. The attributes containing 6, 7, 8, 9 as suffixes imply that those correspond to the months 6, 7, 8, 9 respectively. 
Data Preparation The following data preparation steps are crucial for this problem: 1. Derive new features This is one of the most important parts of data preparation since good features are often the differentiators between good and bad models. Use your business understanding to derive features you think could be important indicators of churn. 2. Filter high-value customers As mentioned above, you need to predict churn only for the high-value customers. Define high-value customers as follows: Those who have recharged with an amount more than or equal to X, where X is the 70th percentile of the average recharge amount in the first two months (the good phase). After filtering the high-value customers, you should get about 29.9k rows. 3. Tag churners and remove attributes of the churn phase Now tag the churned customers (churn=1, else 0) based on the fourth month as follows: Those who have not made any calls (either incoming or outgoing) AND have not used mobile internet even once in the churn phase. The attributes you need to use to tag churners are: total_ic_mou_9 total_og_mou_9 vol_2g_mb_9 vol_3g_mb_9 After tagging churners, remove all the attributes corresponding to the churn phase (all attributes having ‘ _9’, etc. in their names). Modelling Build models to predict churn. The predictive model that you’re going to build will serve two purposes: It will be used to predict whether a high-value customer will churn or not, in near future (i.e. churn phase). By knowing this, the company can take action steps such as providing special plans, discounts on recharge etc. It will be used to identify important variables that are strong predictors of churn. These variables may also indicate why customers choose to switch to other networks. In some cases, both of the above-stated goals can be achieved by a single machine learning model. 
But here, you have a large number of attributes, and thus you should try using a dimensionality reduction technique such as PCA and then build a predictive model. After PCA, you can use any classification model. Also, since the rate of churn is typically low (about 5-10%, this is called class-imbalance) - try using techniques to handle class imbalance. You can take the following suggestive steps to build the model: Preprocess data (convert columns to appropriate formats, handle missing values, etc.) Conduct appropriate exploratory analysis to extract useful insights (whether directly useful for business or for eventual modelling/feature engineering). Derive new features. Reduce the number of variables using PCA. Train a variety of models, tune model hyperparameters, etc. (handle class imbalance using appropriate techniques). Evaluate the models using appropriate evaluation metrics. Note that it is more important to identify churners than the non-churners accurately - choose an appropriate evaluation metric which reflects this business goal. Finally, choose a model based on some evaluation metric. The above model will only be able to achieve one of the two goals - to predict customers who will churn. You can’t use the above model to identify the important features for churn. That’s because PCA usually creates components which are not easy to interpret. Therefore, build another model with the main objective of identifying important predictor attributes which help the business understand indicators of churn. A good choice to identify important variables is a logistic regression model or a model from the tree family. In case of logistic regression, make sure to handle multi-collinearity. After identifying important predictors, display them visually - you can use plots, summary tables etc. - whatever you think best conveys the importance of features. Finally, recommend strategies to manage customer churn based on your observations. 
Note: Everything has to be submitted in one Jupyter notebook. The evaluation rubrics are mentioned on the next page.
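The brief's point that identifying churners matters more than identifying non-churners, combined with a churn rate of roughly 5-10%, is why plain accuracy is a poor metric here. The sketch below is one possible illustration, not the rubric's required metric: the labels are made up, and the choice of recall plus an F-beta score with beta = 2 (which weights recall more heavily than precision) is an assumption.

```python
def churn_metrics(y_true, y_pred, beta=2.0):
    """Accuracy, precision, recall and F-beta for the churn class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta * beta
    denom = b2 * precision + recall
    fbeta = (1 + b2) * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "fbeta": fbeta}


# Hypothetical labels: 2 churners among 20 customers (10% churn).
y_true = [1, 1] + [0] * 18

# Baseline that predicts "no churn" for everyone.
always_stay = churn_metrics(y_true, [0] * 20)

# A model that catches both churners at the cost of three false alarms.
cautious = churn_metrics(y_true, [1, 1] + [1, 1, 1] + [0] * 15)

print("always stay:", always_stay)
print("cautious:   ", cautious)
```

The baseline scores higher on accuracy while finding no churners at all, whereas the second set of predictions has lower accuracy but full recall on the churn class, which is closer to the stated business goal.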
This paper aims to predict the churn of telecom customers, which will help us react in time and try to retain existing users who want to switch to a different network. We use three machine learning techniques for classification (Support Vector Machines, K-Nearest Neighbour and Random Forest) and determine which model classifies best. The data consists of information about almost six thousand users, including the services they use, their demographic characteristics, the duration of the operator’s services, the amount of payment and the method of payment. The dataset has 20 variables; some are numerical and most are categorical. There are also some missing values in the dataset, so we have to pre-process the data before implementing any model. (Data Pre-Processing) Let’s first deal with the null values in the dataset. There are only 10 missing values, all in the total charge variable. The customers with NA values all have a tenure of 0: they are new clients who have yet to pay a bill, so their total charge should be zero. We also drop unwanted columns such as ‘gender’, ‘MultipleLines’, ‘PhoneServices’ and ‘differences’. (Exploratory Data Analysis) Why are clients inclined to leave the company, and on what factors does it depend? 'Phone services' were available in 91% of cases. 88 percent had a "month-to-month" contract, 82 percent had no "dependents," 78 percent had no "online security," 77 percent had no "tech support," 75 percent had "paperless billing," and 75 percent are "older citizens." 68 percent had fibre optic internet, 65 percent had no 'online backup' or 'device protection,' and 64 percent had no partner. 57 percent paid with an electronic check, 50 percent did not have 'streaming TV,' were male, and did not have 'streaming movies,' and 45 percent had 'multiplelines.' 
(Hypotheses Formulation) Based on our observations, we believe that a client is more likely to depart if he has a high MonthlyCharge. This is especially true if the client is new (less than 15 months); if the client lacks particular services such as internet security, tech assistance, online backup, and/or device protection; and if the decision to quit is simple, i.e. there is no firm commitment (a month-to-month contract), no other person is involved in the decision (no dependant and/or spouse), and everything can be done via the internet or over the phone (Paperless Billing and Phone Services are available). (Class Imbalance) There is clearly a large difference in size between the two classes (customers who stayed and customers who left the company): one is the majority class and the other the minority. The challenge with such imbalanced data is that most classification techniques will largely ignore the minority class (customers who left) and in turn predict it poorly. Here we use one approach to address the problem: SMOTE. SMOTE (Synthetic Minority Oversampling Technique) is an oversampling technique that creates synthetic samples for the minority class instead of creating copies. We use the SMOTE class from the imblearn Python library (from imblearn.over_sampling import SMOTE). The method chooses two or more comparable examples (through a distance measure) and perturbs one characteristic at a time by a random amount within the difference between the surrounding examples. The last thing we have to do is split and scale the dataset. In splitting, we divide the data randomly into training samples and testing samples; in scaling, we normalise only the continuous data and leave the dummy variables alone. We apply the min-max scaler to those continuous variables, giving them an identical minimum of zero, maximum of one, and range of one. 
(Correlation Heatmap) The correlation heatmap shown in the figure below depicts the relations between different variables. I have also plotted histograms and scatter plots of the variables and their relations with the target variable. We can observe that, in general, clients who want to quit (churn = 'Yes') are new clients (low tenure < 15 months, and hence low TotalCharges) with high MonthlyCharges > 65$/month. Because there is no linear relationship between tenure and TotalCharges, additional fees must be determined. (Machine Learning Classification Techniques) Support Vector Machine, K-Nearest Neighbor and Random Forest are used; Random Forest gives the highest prediction accuracy, 83.2%. With the random forest classification method we get the best prediction of customers leaving the telecom company.
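The SMOTE step described in this entry can be illustrated with a simplified, standard-library sketch. It is not the imblearn implementation and not the author's code: the function name `smote_sketch`, the two-feature minority points (imagined as already min-max scaled tenure and monthly charges) and the parameter defaults are all hypothetical. Each synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours, using a single random gap per synthetic sample.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical churners: (scaled tenure, scaled monthly charges).
churners = [(0.10, 0.80), (0.20, 0.90), (0.15, 0.70), (0.30, 0.85)]
new_points = smote_sketch(churners, n_new=6)
for point in new_points:
    print(point)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the range of the observed churners rather than duplicating any one of them. Oversampling of this kind is normally applied to the training split only, so that the test set keeps the original class balance.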
No description available
Performed customer sentiment analysis on the tweets related to 4 major streaming services, applied various NLP and text analysis tools, and created multiple ensemble and deep learning classification models to implement churn detection and performed a rule-based feature extraction to find out the reasons for churn. Created a visualization dashboard using Plotly and Dash and deployed on Heroku.
Customer Churn Analysis using both Supervised (Classification) and Unsupervised (Clustering) approaches to predict churn risk and segment churned customers for targeted retention strategies.
Kriishna1
Machine learning project predicting customer churn using data analysis and Random Forest classification model.
Bilalktk79
A Machine Learning project to predict customer churn using data analysis, feature engineering, and classification models.
spoledzki
Report on the classification of bank customers in terms of credit card churn, consisting of complete analysis with data visualization and classification models.
benxu001
Telecom customer churn analysis using Python, EDA, feature engineering, and classification models to identify key churn drivers and high-risk customer segments. Focused on translating data insights into actionable retention strategies.
yashyaks
Project unveiled insights and trends from the Olist data and derived customer segments using RFM analysis. Developed ML models for customer segmentation, sales prediction, churn classification, and sentiment analysis.
cart-el
An analysis of different machine learning classification models and their ability to accurately predict customers at the risk of churning.
SheemaMasood381
Showcase of internship projects at eCodeCamp, including Customer Churn Analysis, Titanic Survival Prediction, and Image Classification with CNNs, using Python, ML, and web development technologies.
kspritu4-ux
End-to-end telecom churn analysis using Python, including data cleaning, EDA, and classification modeling with scikit-learn. The project identifies key churn drivers and provides actionable insights to support customer retention strategies.
LucasDS9
Machine learning project focused on classifying customer churn (Exited) using a structured bank customer dataset. The repository includes data preprocessing, exploratory analysis, feature engineering, model training (Random Forest classifier), and evaluation of classification algorithms to identify clients with higher churn risk.
This project presents a comprehensive analysis of customer data from a streaming service, focusing on predictive modeling and unsupervised customer segmentation to optimize business decisions. Using both regression and classification models, we predict monthly customer spending and churn behavior.
KhanNosheen
An end-to-end Machine Learning project to predict customer churn using Python. Includes Exploratory Data Analysis (EDA), data preprocessing, and classification modeling to improve retention strategies.
Rohitkanithi
This project analyzes customer churn in a telecom dataset using feature engineering, multicollinearity analysis, and predictive modeling with H2O.ai AutoML. The objective is to identify factors influencing customer retention and build the most accurate predictive model for churn classification. The pipeline includes data preprocessing, exploratory
yuki04160
In this repository, to predict customer churn (classification), I built a logistic regression and a decision tree in R, and gave data-driven recommendations to a Telecom company based on analysis results.
MaheswaranThayani
This project applies multivariate statistical techniques to analyze customer churn behavior in the Telco dataset. It explores dimensionality reduction, latent factors, customer classification, and correlation structures using PCA, Factor Analysis, Discriminant Analysis, and Canonical Correlation. The goal is to uncover key patterns and drivers of c