The document discusses building a customer churn prediction model for a telecom company in Syria using machine learning techniques. It proposes using the XGBoost algorithm to classify customers as churners or non-churners based on nine months of customer data. XGBoost builds decision trees sequentially, with each new tree trained to correct the errors of the trees before it. The model achieved an AUC of 93.3% after incorporating social network features. The document outlines the hardware, software and methodology used to develop and test the model on a large dataset from SyriaTel.
XGBoost Model for Customer Churn Prediction Using SNA Features
1. Guided By:
(Dr./Prof.)Guide Name
Guide Designation
Department of Computer Science and Engineering
Project Phase – II (18CSP83)
Review – 1
Project Title
Student 1 USN Name
Student 2 USN Name
Student 3 USN Name
Student 4 USN Name
Group No.: Batch No.:
2. Contents
Introduction
Comparison with similar work
Problem Statement and Objectives
Methodology Proposed/ Design
Technologies / Tools Used
Implementation of Modules with codes
Snapshots
References
3. Introduction
• Customer churn is a major problem and one of the most important concerns for large companies. Because of its direct effect on revenue, especially in the telecom field, companies are seeking ways to predict which customers are likely to churn. Finding the factors that increase customer churn is therefore important for taking the actions needed to reduce it.
• The main contribution of our work is a churn prediction model that helps telecom operators identify the customers most likely to churn. The model uses machine learning techniques on a big data platform and introduces a new approach to feature engineering and selection.
• To measure the performance of the model, the standard Area Under Curve (AUC) measure is adopted, and the AUC value obtained is 93.3%. Another main contribution is the use of the customer social network in the prediction model by extracting Social Network Analysis (SNA) features. The use of SNA features improved the model's AUC from 84% to 93.3%. The model was prepared and tested in a Spark environment on a large dataset created by transforming big raw data provided by the SyriaTel telecom company.
• The dataset contained all customers' information over 9 months and was used to train, test, and evaluate the system at SyriaTel. Four algorithms were tried: Decision Tree, Random Forest, Gradient Boosted Machine Tree (GBM), and Extreme Gradient Boosting (XGBoost). The best results were obtained with XGBoost, which was therefore used for classification in this churn prediction model.
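As a minimal sketch of the AUC measure mentioned above, the snippet below computes AUC as the fraction of (churner, non-churner) pairs in which the churner receives the higher score. The function name `auc_score` and the toy labels and scores are assumptions made for illustration; they are not taken from the SyriaTel data or from the project code.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random churner outranks a random non-churner.

    labels: 1 for churner, 0 for non-churner.
    scores: model scores, higher meaning "more likely to churn".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one churner and one non-churner")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # churner ranked above non-churner
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three churners and three non-churners.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
toy_auc = auc_score(labels, scores)
print(round(toy_auc, 3))
```

The pairwise form is quadratic in the number of customers, so it is only suitable for small illustrations; on a large dataset a rank-based computation from a library would normally be used instead.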
4. Comparison with similar work
The existing churn prediction model uses various machine learning algorithms. The performance of a classifier depends on the available dataset.
It is validated using a real-world dataset of Call Detail Records (CDR) from a South Asian company. The proposed churn prediction model is evaluated using information retrieval metrics.
It deals with machine learning and data mining, and aims to propose a model for customer churn prediction, to identify churning factors, and to provide retention strategies.
From the experiments, we observed that the existing model performed better in terms of classifying churners while achieving moderate accuracy.
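The slide does not say which information retrieval metrics were used. Assuming the usual trio of precision, recall, and F1, a minimal sketch looks like this; the function name and the toy predictions are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the churn class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical toy data: 1 = churner, 0 = non-churner.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Precision answers "of the customers flagged as churners, how many really churned", while recall answers "of the customers who churned, how many were flagged". Retention campaigns usually care about both.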
5. Problem Statement and Objectives
The existing system identified churn with only moderate accuracy, which caused losses for the administration.
The problem is that earlier prediction was neither efficient nor robust, and much has since changed because of tariffs and many other factors.
The objectives of the study are to investigate existing machine learning techniques together with an efficient boosting technique, to propose a model for customer churn prediction, to identify churning factors, and to provide retention strategies.
6. Methodology Proposed
XGBoost stands for Extreme Gradient Boosting and was proposed by researchers at the University of Washington. It is a library written in C++ that optimizes the training of gradient boosting models.
XGBoost is an implementation of gradient boosted decision trees. XGBoost models dominate many Kaggle competitions.
In this algorithm, decision trees are built sequentially. Each new tree is trained on the errors of the trees built so far (the gradient of the loss function), so the examples that the current ensemble predicts badly have the most influence on the next tree. The individual trees, each a weak predictor on its own, are then combined into an ensemble that gives a stronger and more precise model. XGBoost can be applied to regression, classification, ranking, and user-defined prediction problems.
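The sequential idea described above can be sketched in plain Python. This is not XGBoost itself: it is a simplified gradient boosting loop with one-split trees (stumps) and logistic loss, written only to show how each new tree is fitted to the errors of the current ensemble. XGBoost additionally uses regularization and second-order gradient information. All function names, feature names, and data values below are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals, learning_rate):
    """Find the one-feature threshold split that best fits the residuals (least squares)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:
        # No split possible (every feature is constant): use a constant stump.
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), learning_rate * mean, learning_rate * mean)
    _, j, thr, lmean, rmean = best
    # Leaf values are stored already scaled by the learning rate.
    return (j, thr, learning_rate * lmean, learning_rate * rmean)


def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_boosted_stumps(X, y, n_rounds=20, learning_rate=0.5):
    """Build stumps one after another, each fitted to the current errors."""
    stumps = []
    raw = [0.0] * len(X)  # raw scores (log-odds), starting from zero
    for _ in range(n_rounds):
        # Negative gradient of the logistic loss: label minus predicted probability.
        residuals = [yi - sigmoid(f) for yi, f in zip(y, raw)]
        stump = fit_stump(X, residuals, learning_rate)
        stumps.append(stump)
        raw = [f + stump_predict(stump, row) for f, row in zip(raw, X)]
    return stumps


def predict_proba(stumps, row):
    """Churn probability: sigmoid of the summed stump outputs."""
    return sigmoid(sum(stump_predict(s, row) for s in stumps))


# Hypothetical toy data: [monthly_calls, complaints]; label 1 = churner.
X = [[120, 0], [30, 3], [200, 1], [25, 4], [150, 0], [40, 2]]
y = [0, 1, 0, 1, 0, 1]

model = fit_boosted_stumps(X, y)
print(predict_proba(model, [35, 3]))   # low usage, several complaints
print(predict_proba(model, [180, 0]))  # high usage, no complaints
```

In practice the project would call the XGBoost library rather than a hand-written loop; the sketch only makes the "each tree corrects the previous ones" step concrete.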
8. Technologies / Tools Used
HARDWARE REQUIREMENTS:
Processor : Pentium IV 2.4 GHz
Hard Disk : 250 GB
Monitor : 15-inch VGA color
RAM : 1 GB
Mouse : Optical
Keyboard : Multimedia
(This is the minimum configuration.)
SOFTWARE REQUIREMENTS:
• Python 3.7.4
• Anaconda Navigator
• Jupyter Notebook