Deep learning based on recurrent neural networks for predicting customer churn
By: Flytxt Data Science R&D Team
Deep learning became popular after it proved commercially viable for use cases such as speech recognition and image processing. Its advantage over traditional machine learning models is that it can learn useful representations by itself from large raw data sets, with very little dependency on humans. For the last few years it has featured among the top 10 technology trends predicted by industry observers such as Gartner. Deep learning is increasingly being adopted across many industries, including telecom.
An edge over others
Churn is arguably one of the most pressing challenges for enterprises such as telcos. An accurate churn prediction model is extremely useful because marketers can proactively reach out to potential churners with targeted promotions or other actions to minimise churn. Data scientists are constantly looking for techniques to improve the accuracy of churn models, and deep learning is one of the techniques being explored.
Unlike traditional machine learning methods, deep learning techniques generally scale well with increasing data sizes. They can uncover hidden insights, detect patterns and underlying risks, and alert the organisation to changes in customer behaviour with better accuracy. This makes them well suited to a complex problem like causal churn analysis.
How is it done?
Deep learning methods are representation-learning methods: a machine is fed raw data and automatically discovers the representations it needs. This removes much of the need for manual feature engineering, which can occupy almost 90% of the effort in industrial machine learning. Deep learning has turned out to be very good at discovering intricate structures in high-dimensional data and is therefore applicable to many areas, such as dimensionality reduction, supervised learning, recommender systems and natural language processing.
Deep learning with Recurrent Neural Networks (RNN)
The Flytxt Data Science team attempted to predict churn with deep learning models that use RNNs, a class of neural networks with loops in them that allow information to persist. The analysis used Long Short-Term Memory (LSTM) networks, a kind of RNN capable of learning long-term dependencies. Figure 1 shows the basic architecture of an LSTM network.
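To make the gating idea concrete, here is a minimal sketch of a single LSTM step in the standard textbook formulation, using scalar weights. It is not the model used in the study. The function names, the parameter dictionary and the sample values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell (standard formulation).

    x      : current input
    h_prev : previous hidden state (short-term output)
    c_prev : previous cell state (long-term memory)
    p      : dict of hypothetical weights and biases
    """
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # keep part of the old memory, add part of the new
    h = o * math.tanh(c)     # expose part of the memory as output
    return h, c


def run_lstm(xs, p):
    """Feed a sequence through the cell and return every hidden state."""
    h, c = 0.0, 0.0
    hidden_states = []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        hidden_states.append(h)
    return hidden_states


# Hypothetical parameters: forget gate pinned open, input gate pinned shut,
# so the cell state should pass through a step almost unchanged.
KEEP = dict(wf=0.0, uf=0.0, bf=20.0,
            wi=0.0, ui=0.0, bi=-20.0,
            wo=0.0, uo=0.0, bo=0.0,
            wg=1.0, ug=0.0, bg=0.0)

h, c = lstm_step(x=0.5, h_prev=0.0, c_prev=0.7, p=KEEP)
print(c)  # stays very close to the previous cell state, 0.7
```

With the forget gate near 1 and the input gate near 0, the cell state is carried forward essentially untouched. This is the mechanism that lets an LSTM hold on to information over many time steps. In a trained network the gate values are learned from data rather than fixed by hand.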
The dataset used in the study consists of usage, recharge, age-on-network and similar information for 0.33 million users of a popular Asian telecom service provider. Receiver Operating Characteristic (ROC) curves make it possible to assess the performance of the classifier over its entire operating range.
Traditional feedforward artificial neural networks allow signals to travel one way only, from input to output. There is no feedback: the network has no loops. In other words, the output of a layer does not affect that same layer or any previous layer.
In an RNN, however, loops in the network allow signals to travel in both directions. Computations derived from earlier inputs are fed back into the network, which gives it a kind of memory. Feedback networks are dynamic; their ‘state’ changes continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. The advantage of such a topology is that it can learn functions that depend on inputs that occurred much earlier, because it can store information from arbitrarily long ago.
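The effect of the feedback loop can be seen in a very small sketch. The code below is a simple recurrent unit with a single scalar weight on the input and another on the previous state. It is an illustration only, not the network from the study, and the weights and sequences are made up. Setting the feedback weight `w_h` to zero removes the loop, so the unit behaves like a one-way network.

```python
import math


def rnn_outputs(xs, w_x, w_h, b=0.0):
    """Run a scalar recurrent unit over a sequence.

    h_t = tanh(w_x * x_t + w_h * h_{t-1} + b), starting from h_0 = 0.
    With w_h = 0 there is no feedback and each output depends only on
    the current input.
    """
    h = 0.0
    outputs = []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h + b)
        outputs.append(h)
    return outputs


# Two hypothetical sequences that differ only in their first element.
seq_a = [1.0, 0.0, 0.0]
seq_b = [0.0, 0.0, 0.0]

# No feedback: the last output cannot tell the two sequences apart.
ff_a = rnn_outputs(seq_a, w_x=1.0, w_h=0.0)
ff_b = rnn_outputs(seq_b, w_x=1.0, w_h=0.0)
print(ff_a[-1] == ff_b[-1])  # True

# With feedback: the first input still influences the last output.
rec_a = rnn_outputs(seq_a, w_x=1.0, w_h=0.9)
rec_b = rnn_outputs(seq_b, w_x=1.0, w_h=0.9)
print(rec_a[-1] > rec_b[-1])  # True
```

In a churn setting the sequence could be a customer's activity over successive periods. The recurrent state is what allows an early drop in activity to still affect the prediction made at the end of the sequence.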
Figure 2 below shows the ROC curve, a widely used plot for visualising classifier performance, together with the Area Under the ROC Curve (AUC). A classifier that guesses at random has an AUC of 0.5, and its curve follows the diagonal. A perfect classifier has an AUC of 1.0. An AUC of 0.9 or more indicates excellent performance. The model achieves a high AUC of 92.51% (0.9251), which indicates its usefulness.
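For readers who want to see what the AUC measures, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner, with ties counted as half. The function name `roc_auc` and the sample labels and scores are hypothetical; they are not data from the study. The double loop is quadratic in the number of customers, so for real data sets a library routine would be the practical choice.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels : iterable of 0/1 values (1 = churner)
    scores : iterable of predicted churn scores, higher = more likely to churn
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up example: one churner (0.4) is ranked below one non-churner (0.6).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(labels, scores))  # 0.75

# Perfect ranking gives 1.0; identical scores for everyone give 0.5.
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))  # 1.0
print(roc_auc([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5]))  # 0.5
```

Because the AUC depends only on how customers are ranked, it summarises performance across every possible decision threshold rather than at a single cut-off.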
Accuracy translates to savings
The study indicates that deep learning techniques such as RNNs can improve the accuracy of churn prediction models and save much of the effort that traditional machine learning techniques spend on tasks like feature engineering.
That deep learning has now been shown to work for churn prediction opens up more possibilities. It could also be useful in other use cases that involve large data sets, such as fraud detection or upsell/cross-sell recommendation.