Found 1,975 repositories (showing 30)
Mindinventory
This repository contains Python code for visualizing the Bank Marketing dataset using various data visualization techniques. The dataset is loaded from a CSV file, and both numerical and categorical features are explored using popular libraries such as Pandas, Matplotlib, Seaborn, and Plotly.
Jace-Yang
Project for the Categorical Data Analysis course at CUFE, fall 2020.
Predict whether a client will subscribe to a direct marketing campaign run by a banking institution.
Problem Statement: The data is related to direct marketing campaigns of a Portuguese banking institution. Predict whether the client will subscribe to a term deposit based on a marketing campaign.
Data Set Download: https://drive.google.com/drive/folders/1urwTQPkUypJ6dGDJgS9Gszb83bfXEG6z?usp=sharing
Data Set Information: The data is related to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (bank term deposit) would be subscribed ('yes') or not ('no'). There are four datasets:
1. bank-additional-full.csv with all examples (41188) and 20 inputs, ordered by date (from May 2008 to November 2010), very close to the data analyzed in [Moro et al., 2014].
2. bank-additional.csv with 10% of the examples (4119), randomly selected from 1), and 20 inputs.
3. bank-full.csv with all examples and 17 inputs, ordered by date (older version of this dataset with fewer inputs).
4. bank.csv with 10% of the examples and 17 inputs, randomly selected from 3) (older version of this dataset with fewer inputs).
The smallest datasets are provided to test more computationally demanding machine learning algorithms.
Goal: The classification goal is to predict whether the client will subscribe (yes/no) to a term deposit (variable y).
onehungrybird
The Portuguese bank had run a telemarketing campaign in the past, making sales calls for a term-deposit product. Whether a prospect bought the product or not is recorded in the column named 'response'. The marketing team wants to launch another campaign, and they want to learn from the past one. You, as an analyst, decide to build a supervised model in R/Python and achieve the following goals: (1) reduce the marketing cost by X% and acquire Y% of the prospects (compared to random calling), where X and Y are to be maximized; (2) present the financial benefit of this project to the marketing team.
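The "reduce cost by X%, acquire Y% of prospects" goal is commonly read off a cumulative gains calculation: rank prospects by model score, call only the top fraction, and count how many buyers fall inside it. The following is a minimal sketch of that idea, not the repository's code. `gains_at` is a hypothetical helper, the scores and outcomes are made up, and cost is assumed to be proportional to the number of calls.

```python
def gains_at(scores, responses, fraction):
    """Share of all responders captured when calling only the top `fraction`
    of prospects, ranked by model score (highest first)."""
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0], reverse=True)
    n_called = round(len(ranked) * fraction)
    captured = sum(resp for _, resp in ranked[:n_called])
    total = sum(responses)
    return captured / total if total else 0.0


# Made-up model scores and outcomes (1 = bought the term deposit).
scores = [0.91, 0.85, 0.70, 0.64, 0.52, 0.40, 0.33, 0.21, 0.15, 0.05]
responses = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

# Call only the top 40% of prospects, i.e. cut the number of calls by X = 60%.
share = gains_at(scores, responses, 0.4)
print(f"Buyers reached with 40% of the calls: {share:.0%}")
```

On this toy data the top 40% of the ranking contains three of the four buyers, whereas random calling would be expected to reach about 40% of them. The gap between the two figures is what the financial-benefit argument would rest on.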
Predicted probabilities from machine learning classification algorithms may be used to tackle imbalanced data. The study uses the Portuguese bank marketing dataset as a case study, as published in Towards Data Science on Medium.com
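One common way to use predicted probabilities on imbalanced data is to move the decision threshold away from the default 0.5. The description does not say which method the study uses, so the sketch below is only an illustration of that general technique. `f1_at` is a hypothetical helper and the probabilities and labels are made up.

```python
def f1_at(probs, labels, threshold):
    """F1 score of the positive class when predicting 1 for prob >= threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up predicted probabilities for the minority class and the true labels.
probs = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.35, 0.45, 0.60, 0.80]
labels = [0, 0, 0, 0, 0, 1, 0, 1, 0, 1]

# Scan thresholds 0.05, 0.10, ..., 0.95 and keep the one with the best F1.
candidates = [t / 20 for t in range(1, 20)]
best = max(candidates, key=lambda t: f1_at(probs, labels, t))

print(f"F1 at 0.5: {f1_at(probs, labels, 0.5):.2f}")
print(f"Best threshold: {best:.2f}, F1: {f1_at(probs, labels, best):.2f}")
```

In practice the threshold would be chosen on a validation split rather than on the data used for the final evaluation.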
Machine Learning Implementations of Logistic Regression on Four Datasets (Diabetes, Breast Cancer, Weather Forecast, and Bank Marketing)
rishabhathiya
# Bank Marketing Dataset

## Marketing Introduction:

The process by which companies create value for customers and build strong customer relationships in order to capture value from customers in return. - Kotler and Armstrong (2010).

Marketing campaigns are characterized by focusing on the customer needs and their overall satisfaction. Nevertheless, there are different variables that determine whether a marketing campaign will be successful or not. There are certain variables that we need to take into consideration when making a marketing campaign.

## The 4 Ps:

1) Segment of the Population: Which segment of the population is the marketing campaign going to address, and why? This aspect of the marketing campaign is extremely important since it determines which part of the population is most likely to receive the campaign's message.
2) Distribution channel to reach the customer's place: Implementing the most effective strategy in order to get the most out of this marketing campaign. What segment of the population should we address? Which instrument should we use to get our message out? (Ex: Telephones, Radio, TV, Social Media, etc.)
3) Price: What is the best price to offer to potential clients? (In the case of the bank's marketing campaign this is not necessary, since the bank's main interest is for potential clients to open deposit accounts so that the bank's operations keep running.)
4) Promotional Strategy: This is the way the strategy is going to be implemented and how potential clients are going to be addressed. This should be the last part of the marketing campaign analysis, since there has to be an in-depth analysis of previous campaigns (if possible) in order to learn from previous mistakes and to determine how to make the marketing campaign much more effective.

## What is a Term Deposit?
A term deposit is a deposit that a bank or a financial institution offers with a fixed rate (often better than that of an ordinary deposit account), in which your money is returned at a specific maturity date. For more information on term deposits, see Investopedia: https://www.investopedia.com/terms/t/termdeposit.asp

## Outline:

1. Import data from the dataset and perform an initial high-level analysis: look at the number of rows, the missing values, and the dataset columns and their values with respect to the campaign outcome.
2. Clean the data: remove irrelevant columns, deal with missing and incorrect values, and turn categorical columns into dummy variables.
3. Use machine learning techniques to predict the marketing campaign outcome and to find the factors that affect the success of the campaign.

## Dataset Link

https://archive.ics.uci.edu/ml/datasets/Bank+Marketing

## Dataset Information

The data is related to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (bank term deposit) would be subscribed ('yes') or not ('no'). There are four datasets:

1) bank-additional-full.csv with all examples (41188) and 20 inputs, ordered by date (from May 2008 to November 2010), very close to the data analyzed in [Moro et al., 2014].
2) bank-additional.csv with 10% of the examples (4119), randomly selected from 1), and 20 inputs.
3) bank-full.csv with all examples and 17 inputs, ordered by date (older version of this dataset with fewer inputs).
4) bank.csv with 10% of the examples and 17 inputs, randomly selected from 3) (older version of this dataset with fewer inputs).

The smallest datasets are provided to test more computationally demanding machine learning algorithms (e.g., SVM).
The classification goal is to predict whether the client will subscribe (yes/no) to a term deposit (variable y).

## Attribute Information

Input variables:

#### bank client data:

1. age (numeric)
2. job: type of job (categorical: 'admin.','blue-collar','entrepreneur','housemaid','management','retired','self-employed','services','student','technician','unemployed','unknown')
3. marital: marital status (categorical: 'divorced','married','single','unknown'; note: 'divorced' means divorced or widowed)
4. education (categorical: 'basic.4y','basic.6y','basic.9y','high.school','illiterate','professional.course','university.degree','unknown')
5. default: has credit in default? (categorical: 'no','yes','unknown')
6. housing: has housing loan? (categorical: 'no','yes','unknown')
7. loan: has personal loan? (categorical: 'no','yes','unknown')

#### related with the last contact of the current campaign:

8. contact: contact communication type (categorical: 'cellular','telephone')
9. month: last contact month of year (categorical: 'jan', 'feb', 'mar', ..., 'nov', 'dec')
10. day_of_week: last contact day of the week (categorical: 'mon','tue','wed','thu','fri')
11. duration: last contact duration, in seconds (numeric). Important note: this attribute highly affects the output target (e.g., if duration=0 then y='no'). Yet, the duration is not known before a call is performed. Also, after the end of the call y is obviously known. Thus, this input should only be included for benchmark purposes and should be discarded if the intention is to have a realistic predictive model.
#### other attributes:

12. campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
13. pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; 999 means client was not previously contacted)
14. previous: number of contacts performed before this campaign and for this client (numeric)
15. poutcome: outcome of the previous marketing campaign (categorical: 'failure','nonexistent','success')

#### social and economic context attributes

16. emp.var.rate: employment variation rate - quarterly indicator (numeric)
17. cons.price.idx: consumer price index - monthly indicator (numeric)
18. cons.conf.idx: consumer confidence index - monthly indicator (numeric)
19. euribor3m: euribor 3 month rate - daily indicator (numeric)
20. nr.employed: number of employees - quarterly indicator (numeric)

Output variable (desired target):

21. y: has the client subscribed to a term deposit? (binary: 'yes','no')

## License

This dataset is publicly available for research.

Citations:

1. [Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014.
2. Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
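The attribute notes in this README (the `duration` caveat, the 999 sentinel in `pdays`, and the dummy-variable step in the outline) can be sketched with the standard library alone. This is an illustrative sketch, not the repository's code: the three sample rows are made up, only a subset of the columns is shown, the `;` delimiter is an assumption about the file layout, and `prepare` is a hypothetical helper.

```python
import csv
import io

# Three made-up rows using a subset of the columns described above.
# The ';' delimiter is an assumption about how the CSV files are laid out.
SAMPLE = """age;job;duration;pdays;poutcome;y
41;admin.;210;999;nonexistent;no
35;technician;0;999;nonexistent;no
58;retired;487;6;success;yes
"""


def prepare(row):
    """Turn one raw record into (features, target). Hypothetical helper."""
    features = {
        "age": int(row["age"]),
        # 999 is a sentinel for "not previously contacted", not a real gap in days.
        "previously_contacted": int(row["pdays"] != "999"),
        # Dummy (one-hot) encoding of a categorical column.
        "poutcome_success": int(row["poutcome"] == "success"),
        "poutcome_failure": int(row["poutcome"] == "failure"),
    }
    # 'duration' is deliberately left out: it is not known before the call.
    target = int(row["y"] == "yes")
    return features, target


rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=";"))
prepared = [prepare(r) for r in rows]

for features, target in prepared:
    print(features, "->", target)
```

Encoding `pdays` as a flag keeps a model from treating 999 as an unusually long gap between contacts. Whether to also keep the raw day count for previously contacted clients is a separate design choice.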
lucinezhang
Course project of the class "Data Warehousing and Data Mining Technology" at PKU, Spring and Summer Semester, 2017.
frankscholten
Example of using Mahout's SGD Logistic Regression classifier on the bank marketing dataset
ShivankUdayawal
Bank Customer Acquisition Analysis
nickr007
Marketing refers to activities undertaken by a company to promote the buying or selling of a product or service. Marketing includes advertising, selling, and delivering products to consumers or other businesses. Our data is related to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls.
iftekarpatel
Problem Statement
Understanding the consumption pattern for credit cards at an individual consumer level is important for customer relationship management. This understanding allows banks to customize their offerings for consumers and make strategic marketing plans. Thus it is imperative to study the relationship between the characteristics of the consumers and their consumption patterns. The dataset comes from a bank (XYZ Bank) that has provided a sample of its customers, along with details such as age, gender and other demographics. Also shared is information on liabilities, assets and the history of transactions with the bank for each customer. In addition, data has been provided for a particular set of customers' credit card spend in the previous 3 months (April, May and June) and their expected average spend in the coming 3 months (July, August and September). The average spend for a different set of customers needs to be predicted in the test set for the coming 3 months.
Data Dictionary
id: Unique ID for every customer
account_type: Account type (current or saving)
gender: Gender of customer (M or F)
age: Age of customer
region_code: Code assigned to region of residence (has order)
cc_cons_apr: Credit card spend in April
dc_cons_apr: Debit card spend in April
cc_cons_may: Credit card spend in May
dc_cons_may: Debit card spend in May
cc_cons_jun: Credit card spend in June
dc_cons_jun: Debit card spend in June
cc_count_apr: Number of credit card transactions in April
cc_count_may: Number of credit card transactions in May
cc_count_jun: Number of credit card transactions in June
dc_count_apr: Number of debit card transactions in April
dc_count_may: Number of debit card transactions in May
dc_count_jun: Number of debit card transactions in June
card_lim: Maximum credit card limit allocated
personal_loan_active: Active personal loan with other bank
vehicle_loan_active: Active vehicle loan with other bank
personal_loan_closed: Closed personal loan in last 12 months
vehicle_loan_closed: Closed vehicle loan in last 12 months
investment_1: DEMAT investment in June
investment_2: Fixed deposit investment in June
investment_3: Life insurance investment in June
investment_4: General insurance investment in June
debit_amount_apr: Total amount debited for April
credit_amount_apr: Total amount credited for April
debit_count_apr: Total number of times amount debited in April
credit_count_apr: Total number of times amount credited in April
max_credit_amount_apr: Maximum amount credited in April
debit_amount_may: Total amount debited for May
credit_amount_may: Total amount credited for May
credit_count_may: Total number of times amount credited in May
debit_count_may: Total number of times amount debited in May
max_credit_amount_may: Maximum amount credited in May
debit_amount_jun: Total amount debited for June
credit_amount_jun: Total amount credited for June
credit_count_jun: Total number of times amount credited in June
debit_count_jun: Total number of times amount debited in June
max_credit_amount_jun: Maximum amount credited in June
loan_enq: Loan enquiry in last 3 months (Y or N)
emi_active: Monthly EMI paid to other bank for active loans
cc_cons (Target): Average credit card spend in next three months
Evaluation Metric
Submissions are evaluated on Root Mean Squared Logarithmic Error (RMSLE) between the predicted credit card consumption and the observed target.
Approach
At first, I conducted exploratory data analysis of the dataset to gain a deeper understanding of the data. Next, I did feature engineering to create new variables. Then I tried some scikit-learn models, of which XGBoost and Random Forest gave good RMSLE. In the end I created a stacked model of those two with Linear Regression, and it was selected as the final model. RMSLE: 115.02
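For reference, RMSLE is conventionally defined as sqrt(mean((log(1 + predicted) - log(1 + observed))^2)). A minimal standard-library sketch of that conventional definition follows. It is not the competition's scoring code, and the sample spend values are made up.

```python
import math


def rmsle(predicted, observed):
    """Root Mean Squared Logarithmic Error; all values must be non-negative."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(o)) ** 2
        for p, o in zip(predicted, observed)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Made-up average card spend values, for illustration only.
observed = [12000.0, 3500.0, 800.0]
predicted = [10000.0, 4200.0, 950.0]
print(f"RMSLE: {rmsle(predicted, observed):.4f}")
```

Because the error is taken on the log scale, the metric penalizes relative rather than absolute differences, which suits spend amounts that span several orders of magnitude.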
sanjay-bhat
Generate Decision Tree With Bank Marketing Dataset
ajaygtm
Extensive Analysis and Prediction on the Bank Marketing Dataset.
havelhakimi
Perform anomaly detection on Bank Marketing dataset
nikitaB2005
"Interactive Power BI dashboard analyzing UCI Bank Marketing dataset (45K+ records)"
KubaKrzych
Analysis of a dataset that contains information on Portuguese bank marketing campaign results.
samikabir
Five machine learning models (Logistic Regression, Linear Discriminant Analysis, K-Nearest Neighbors (KNN), Decision Tree Classifier, and Gaussian Naive Bayes) have been applied to the bank-full.csv dataset. This dataset contains information about the bank's clients and the bank's previous marketing campaign. The objective is to predict whether a client will subscribe to a term deposit product offered by the bank.
Final Project: Project based on a real-life business problem. In this project, you will be using all the skills that you have acquired throughout this course.
Problem Statement
Your client is a retail banking institution. Term deposits are a major source of income for a bank. A term deposit is a cash investment held at a financial institution. Your money is invested for an agreed rate of interest over a fixed amount of time, or term. The bank has various outreach plans to sell term deposits to its customers, such as email marketing, advertisements, telephonic marketing and digital marketing. Telephonic marketing campaigns remain one of the most effective ways to reach out to people. However, they require huge investment, as large call centers are hired to execute these campaigns. Hence, it is crucial to identify the customers most likely to convert beforehand so that they can be specifically targeted via call. You are provided with client data such as the age of the client, their job type, their marital status, etc. Along with the client data, you are also provided with information about the call, such as the duration of the call, the day and month of the call, etc. Given this information, your task is to predict whether the client will subscribe to a term deposit.
Data
You are provided with the following files:
1. train.csv: Use this dataset to train the model. This file contains all the client and call details as well as the target variable “subscribed”. You have to train your model using this file.
2. test.csv: Use the trained model to predict whether a new set of clients will subscribe to the term deposit.
Data Dictionary
Here is the description of all the variables:
ID: Unique client ID
age: Age of the client
job: Type of job
marital: Marital status of the client
education: Education level
default: Credit in default
housing: Housing loan
loan: Personal loan
contact: Type of communication
month: Contact month
day_of_week: Day of week of contact
duration: Contact duration
campaign: Number of contacts performed during this campaign to the client
pdays: Number of days that passed by after the client was last contacted
previous: Number of contacts performed before this campaign
poutcome: Outcome of the previous marketing campaign
Subscribed (target): Has the client subscribed to a term deposit?
How good are your predictions?
Evaluation Metric
The evaluation metric for this competition is accuracy.
Solution Checker
You can use solution_checker.xlsx to generate the score (accuracy) of your predictions. This is an Excel sheet where you are provided with the test IDs and you have to submit your predictions in the “subscribed” column. Below are the steps to submit your predictions and generate the score:
a. Save the predictions on the test.csv file in a new CSV file.
b. Open the generated CSV file, copy the predictions and paste them in the subscribed column of the solution_checker.xlsx file.
c. Your score will be generated automatically and will be shown in the Your Accuracy Score column.
You can also check out the baseline Python Notebook provided to get started.
ashutoshmakone
The classification goal is to predict whether the client will subscribe (yes/no) to a term deposit. EDA followed by modeling with KNN, NB, LR, LR with Polynomial Features, SVM, DT, RF, XGBOOST
alexkataev
Case study for a famous bank marketing data set
Propose a marketing strategy for a bank using a dataset available on Kaggle.
Ismgh
Data analytics project in M1, using unsupervised machine learning algorithms (K-means, PCA, ACC) on the Bank Marketing Data Set from http://archive.ics.uci.edu/ml/datasets/Bank+Marketing
Akilankm
The project aims to analyze the Portuguese Bank Marketing dataset and predict whether the client will subscribe to the term deposit.
Adityagugnani
Build a decision tree classifier to predict whether a customer will purchase a product or service based on their demographic and behavioral data. Use a dataset such as the Bank Marketing dataset from the UCI Machine Learning Repository.
vipunsanjana
Implementation of ensemble methods for classification and regression using scikit-learn, XGBoost, and mlxtend. Covers bagging, boosting, random forests, voting, and stacking on Bank Marketing and Boston Housing datasets with evaluation metrics.
vipunsanjana
🚀 Support Vector Machine (SVM) Classifier on the Bank Marketing dataset using GridSearchCV & K-Fold Cross Validation. Includes preprocessing, train-test split, hyperparameter tuning, best model selection, and accuracy evaluation. Ideal for learning SVM fundamentals and model optimization in scikit-learn.
ravibhagwat19
Most banks offer accounts from which the owner can withdraw or deposit money at any time. This makes it difficult for banks to plan their lending capacity ahead of time. To deal with this, banks introduced term deposit accounts, in which the money is locked with the bank for a certain period of time. This gave banks the flexibility to lend the money onward. However, one of the major challenges was to identify customers who would be interested in subscribing to a term deposit. In this report, we introduce a dataset related to the direct marketing campaigns of a Portuguese banking institution. We cleaned and pre-processed the data to prepare it for modeling. Then we used Random Forest and Support Vector Machine algorithms to build models and compared the results.
osmaantahir
This case is about a bank (Thera Bank) which has a growing customer base. The majority of these customers are liability customers (depositors) with deposits of varying sizes. The number of customers who are also borrowers (asset customers) is quite small, and the bank is interested in expanding this base rapidly to bring in more loan business and, in the process, earn more through the interest on loans. In particular, the management wants to explore ways of converting its liability customers to personal loan customers (while retaining them as depositors). A campaign that the bank ran last year for liability customers showed a healthy conversion rate of over 9%. This has encouraged the retail marketing department to devise campaigns with better target marketing to increase the success ratio with a minimal budget. The department wants to build a model that will help them identify the potential customers who have a higher probability of purchasing the loan. This will increase the success ratio while at the same time reducing the cost of the campaign. The dataset has data on 5000 customers. The data include customer demographic information (age, income, etc.), the customer's relationship with the bank (mortgage, securities account, etc.), and the customer response to the last personal loan campaign (Personal Loan). Among these 5000 customers, only 480 (= 9.6%) accepted the personal loan that was offered to them in the earlier campaign.
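The figures quoted in this description can be checked with a few lines of arithmetic, which also show what they imply for model evaluation. The sketch below uses only the two counts stated in the description. The "predict no for everyone" baseline is an illustration added here, not part of the case.

```python
total_customers = 5000
accepted = 480

# Share of customers who accepted the personal loan in the earlier campaign.
conversion_rate = accepted / total_customers

# Accuracy of a trivial model that predicts "no loan" for every customer.
majority_baseline_accuracy = (total_customers - accepted) / total_customers

print(f"Conversion rate: {conversion_rate:.1%}")
print(f"Majority-class baseline accuracy: {majority_baseline_accuracy:.1%}")
```

With classes this imbalanced, a model has to beat roughly 90% accuracy to be better than predicting "no" for everyone, so measures such as recall or precision on the accepting customers are usually more informative.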
vikassharma1999
No description available