DATA6000 Capstone Industry Case Studies Sample
Your Task
Generate a unique business question which can be explored using available data sources and analytics methodologies mastered in the business analytics degrees.
Assessment Description
• An individual report of 1000-1500 words
• Students are to research the application of one analytics method to an industry and document their findings as a report
• Learning outcomes: LO1, LO2
Assessment Instructions
The first part of any good research project is a literature review. In approximately 1000-1500 words (excluding referencing) address the following sections which will assist you to outline a business industry problem that can be addressed through data analytics.
1. Industry Background:
Choose an industry, e.g., healthcare, retail, education, finance, recreation, government, etc., and discuss three key business problems currently facing this industry.
2. Existing Analysis and Methodologies:
Research and evaluate existing analysis on the three business problems you have chosen and reflect on the data and analytics methodologies that have been employed in this analysis.
3. Data Sources:
Evaluate the types of data sources available to analysts in this industry. Explore the available data sources you can access to address the three business problems you have identified and evaluate what type of descriptive and predictive analytics techniques could be used.
4. Selecting Business Problem:
Generate a unique business question for this industry based on one of the business problems. This is the business problem you will address in Assessment 4 (Industry research report). For the business problem you select briefly summarise the:
1. Data source you will use
2. Methodologies you will explore (brief)
3. Originality of your contribution (i.e. Why is your analysis unique given existing research and analysis in this space?)
5. Provide at least ten relevant, credible references to support your ideas and explanations.
Use the Harvard referencing system recommended in the “Study Support Resources” section of the portal (See Referencing and Plagiarism Help)
Solution
Industry Problem
For this project, the industry chosen is the banking sector, within finance, with a focus on credit card services. The problem statement is the prediction of customer churn. Customer churn means that a customer is no longer interested in the service and leaves the company, usually in search of better options and more benefits. The major business problems caused by bank churners are presented below (Domingos et al., 2021).
The first business problem is that the bank's manager is heavily concerned about customers churning and moving to other credit card services. Secondly, the customer support team and the management team are unable to anticipate and respond to customers' future actions. Furthermore, the bank is losing growth, because the loss of a customer always means a loss of revenue overall (Rahman and Kumar, 2020).
The best approach to this business problem is to know the customers well and to gain deep insight into their views of, and behaviour towards, the bank. If this is achieved, it will help the company lower its churn rate. The most efficient and effective way to do this is to implement machine learning models. If churn is predicted accurately, the customer support team can attend to the needs of at-risk customers and thereby reduce the churn rate or prevent the customer from churning altogether. The main technique is therefore to use machine learning algorithms and prediction models to help the bank understand the overall trend and the factors affecting churn, and to analyse the collected data appropriately (Tékouabou et al., 2022).
Data Sources and The Potential Methodology
The data to be used in this project is the Bank Churners dataset. Its attributes include a unique customer ID; an attrition flag with two values, existing customer or attrited customer; customer age; gender; number of dependents; education level; marital status; income category; card category (Blue, Silver, Gold or Platinum); and months on book with the bank.
Before continuing to the methodologies, it is necessary to understand the definition of churning. It can be understood with the help of three W's. The first is when customers churn: if a customer leaves the service in the initial period, it is a hard churn and a large loss per customer; if the customer leaves some months in but before completing a year, it is a soft churn. The churn rate is usually calculated on a monthly basis. The second is why customers churn: identifying the reasons customers are willing to leave, so that changes can be made accordingly. The third is who churns: understanding which customers are more likely to leave, such as settled customers, young people or retired people, and whether their rate of leaving is high, medium or low, so that offers and schemes can be targeted accordingly. In this way a framework can be built so that proper decisions are made for the benefit of both the bank and the customer (Hashmi et al., 2013).
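As a minimal sketch of the monthly churn rate mentioned above, the example below assumes the common definition of customers lost during a month divided by customers at the start of that month. The function name and the figures are hypothetical and are not taken from the dataset.

```python
def monthly_churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return the share of customers who left during one month.

    Assumed definition: customers lost in the month divided by the
    number of customers at the start of the month.
    """
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


# Hypothetical figures for illustration only.
rate = monthly_churn_rate(customers_at_start=2000, customers_lost=50)
print(f"Monthly churn rate: {rate:.1%}")
```

Tracking this figure month by month would show whether retention actions are having any effect.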
To solve the problem of customer churn, the appropriate method is to build a predictive model. Data analysis has recently attracted a lot of attention and grown in prominence in the business world. Data analysis programmes aim to turn raw client data into useful information. However, many of these initiatives fall short of expectations because it is difficult to convert this valuable knowledge into improved customer satisfaction and profitability (Leung and Chung, 2020).
The dataset will first be explored through exploratory data analysis and presented in different graphs, with all the attributes explored as required. The graphs used will include bar plots, histograms, box plots, etc. The visualizations will present, for example, the proportions of the different card categories and the proportions of education levels.
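The proportion calculations behind such plots can be sketched as follows. This is an illustrative example only: the five records are made up, and the column names Card_Category and Education_Level are assumptions about how the dataset labels these attributes.

```python
from collections import Counter


def category_proportions(records, column):
    """Return {category: share of records} for one categorical column."""
    counts = Counter(row[column] for row in records)
    total = sum(counts.values())
    # most_common() orders categories from most to least frequent.
    return {value: count / total for value, count in counts.most_common()}


# Made-up sample records; the real dataset would be loaded from file.
sample = [
    {"Card_Category": "Blue", "Education_Level": "Graduate"},
    {"Card_Category": "Blue", "Education_Level": "High School"},
    {"Card_Category": "Blue", "Education_Level": "Graduate"},
    {"Card_Category": "Silver", "Education_Level": "Graduate"},
    {"Card_Category": "Gold", "Education_Level": "Doctorate"},
]

card_shares = category_proportions(sample, "Card_Category")
for category, share in card_shares.items():
    # A simple text bar stands in for a bar plot.
    print(f"{category:<8} {share:6.1%} {'#' * round(share * 20)}")
```

The same function can be applied to any other categorical attribute, such as education level or marital status.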
The next step involves splitting the dataset into training and testing sets and building predictive models using the random forest classifier, the AdaBoost classifier and the support vector machine. The results generated by these algorithms will be compared, first graphically and then in the form of a classification report and a confusion matrix. The F1 score of the three models will be calculated and compared, and finally the best-performing model will be chosen to guide future decisions (Jain et al., 2021).
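To illustrate how the confusion matrix and F1 score are used to compare models, a small sketch is given below. The labels and the two sets of predictions are invented for illustration (1 = attrited customer, 0 = existing customer), and model_a and model_b are hypothetical placeholders rather than results from the three algorithms named above.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) with `positive` treated as the churn class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Invented test labels: 2 attrited customers out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predictions = {
    "model_a": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # never predicts churn
    "model_b": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],  # finds both churners, one false alarm
}

scores = {}
for name, y_pred in predictions.items():
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    scores[name] = (accuracy(tp, fp, fn, tn), f1_score(tp, fp, fn))
    print(f"{name}: accuracy={scores[name][0]:.2f}, F1={scores[name][1]:.2f}")
```

In this toy case the model that never predicts churn still reaches 80% accuracy but has an F1 score of zero, which is why the F1 score is a useful complement to accuracy when churners are a minority of customers.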
Selecting Business Problem
The analysis to be carried out covers diverse areas, including exploring all the attributes to understand the data and presenting them in well-defined graphs that are easy for the bank's customer support and management teams to interpret. In creating a system for churn prediction, we are attempting to analyse consumers' churning behaviour. For this to be successful, customers' transaction activity needs to be examined over a certain period of time (Prasad and Madhavi, 2012).
The most important analysis to be carried out relates first to the customer's personal details, then to the customer's relationship with the bank, and then to the card category and related attributes. It will discuss the types of data needed for active systems, as well as how to derive practical features from the information that is readily available.
The data source to be used contains attributes such as the customer's personal information (age, marital status, gender, etc.) and bank-related attributes such as the card category. Based on these attributes, the prediction models will be built and analysed to reveal trends and hidden patterns in the dataset and to give an overall view of the results. The methodology uses three machine learning algorithms as prediction models: the support vector machine, the random forest classifier and the AdaBoost classifier (Amuda and Adeyemo, 2019).
The idea was chosen, and is well understood, because we have experienced the problem from the customer's point of view, including the issues faced while maintaining an account with a bank. This made it easy to understand what is essential to prevent churn, because even customers prefer not to switch, as switching is not time-effective.
The idea is original to some extent, because the models were selected after careful analysis of what kind of output each can produce and how it will help, and the attributes to compare against each other were chosen for ease of understanding. Three models were chosen so that their results can be compared and several options remain to explore (Saran Kumar and Chandrakala, 2016). Churn is faced in almost every industry, not only in the finance sector, so this approach can be implemented in other sectors with little difficulty; this flexibility across domains is the most distinctive feature of the project. In addition, the analysis makes good use of all the necessary attributes for presentation and visualization.
The dataset is taken from the Kaggle website, where it is titled Credit Card Customers; the dataset itself is named the Bank Churners dataset. It contains data on 10,000 customers. The attributes include customer data such as age, salary, dependent count, marital status, education level, credit card limit and credit card category, and other bank-related information such as the unique client number, the attrition flag (existing customer or attrited customer), the income category and the period of relationship with the bank. The information covers the customer's personal data, financial information and relationship with the bank.
Analysis of Data
References
Rahman, M. and Kumar, V., 2020, November. Machine learning based customer churn prediction in banking. In 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) (pp. 1196-1201). IEEE.
Prasad, U.D. and Madhavi, S., 2012. Prediction of churn behavior of bank customers using data mining tools. Business Intelligence Journal, 5(1), pp.96-101.
Domingos, E., Ojeme, B. and Daramola, O., 2021. Experimental analysis of hyperparameters for deep learning-based churn prediction in the banking sector. Computation, 9(3), p.34.
Tékouabou, S.C., Gherghina, Ş.C., Toulni, H., Mata, P.N. and Martins, J.M., 2022. Towards Explainable Machine Learning for Bank Churn Prediction Using Data Balancing and Ensemble-Based Methods. Mathematics, 10(14), p.2379.
Leung, H.C. and Chung, W., 2020. A dynamic classification approach to churn prediction in banking industry.
Jain, H., Yadav, G. and Manoov, R., 2021. Churn prediction and retention in banking, telecom and IT sectors using machine learning techniques. In Advances in Machine Learning and Computational Intelligence (pp. 137-156). Springer, Singapore.
Saran Kumar, A. and Chandrakala, D., 2016. A survey on customer churn prediction using machine learning techniques. International Journal of Computer Applications, 975, p.8887.
Hashmi, N., Butt, N.A. and Iqbal, M., 2013. Customer churn prediction in telecommunication a decade review and classification. International Journal of Computer Science Issues (IJCSI), 10(5), p.271.
Amuda, K.A. and Adeyemo, A.B., 2019. Customers churn prediction in financial institution using artificial neural network.
Dataset Link: https://www.kaggle.com/datasets/sakshigoyal7/credit-card-customers