The coming of age of cognitive computing and data science
By: Amit Meher
Senior Manager - Data Sciences
The Industrial Conference on Data Mining (ICDM) and the International Conference on Machine Learning and Data Mining (MLDM) are two established conferences in the field of data mining and machine learning. The conference proceedings are published by Springer, one of the most reputable international publishing houses.
While ICDM leans towards applying data mining and machine learning to real-world business problems, MLDM attracts researchers from both industry and academia for discussions on theoretical and practical data science. This year, the two conferences were co-located in Newark, New Jersey, and Flytxt presented a research paper at each of them.
Industrial Conference on Data Mining (ICDM)
A handful of the papers presented at ICDM were pertinent to industrial applications in domains such as retail, transportation, semiconductors and healthcare. Natural Language Processing, Association Rule Mining, Multivariate Time Series Analysis, Large Scale Classification and Subgraph Pattern Mining were the key themes of presentations at this year’s conference.
Paper presentation at ICDM 2017
The paper titled “Towards a Large Scale Practical Churn Model in Prepaid Mobile Markets”, presented at ICDM, focuses on building a scalable model to predict which subscribers are likely to churn in the near future. Here, churn refers to revenue churn (or inactivity-based churn), where a subscriber stops consuming the network services (voice, data, SMS) for a period of 15 consecutive days. The paper emphasises practical aspects of model development such as data extraction, feature engineering, dimensionality reduction, model evaluation, and distributed training and prediction, and it demonstrates end-to-end modelling on a real-world dataset of 5 million active subscribers from a renowned Asian CSP.
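The inactivity-based churn definition above can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the representation of usage as a set of active dates, and the fixed observation window are all assumptions made for illustration. Only the 15-day threshold comes from the definition.

```python
from datetime import date, timedelta

INACTIVITY_DAYS = 15  # threshold taken from the churn definition above


def longest_inactive_run(active_days, start, end):
    """Longest run of consecutive days in [start, end] with no usage.

    active_days is a set of dates on which the subscriber used any
    service (voice, data or SMS). This input format is an assumption.
    """
    longest = current = 0
    day = start
    while day <= end:
        if day in active_days:
            current = 0  # any usage breaks the inactive streak
        else:
            current += 1
            longest = max(longest, current)
        day += timedelta(days=1)
    return longest


def is_churner(active_days, start, end, window=INACTIVITY_DAYS):
    """Label a subscriber as churned if inactive for `window` days in a row."""
    return longest_inactive_run(active_days, start, end) >= window


# Hypothetical observation window and subscribers, for illustration only.
start, end = date(2017, 1, 1), date(2017, 1, 31)
short_gap = {date(2017, 1, d) for d in list(range(1, 11)) + list(range(20, 32))}
went_silent = {date(2017, 1, d) for d in range(1, 17)}
```

In this sketch `short_gap` has a 9-day gap and is not labelled as a churner, while `went_silent` is inactive for the last 15 days of the window and is.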
In addition, the paper provides insights on how the model output (in the form of an actionable lift chart) can help marketers design appropriate retention campaigns. Finally, it touches upon productionising the model so that it can run seamlessly as an end-to-end workflow in the production environment. The paper can be read here.
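To make the idea of a lift chart concrete, here is a minimal Python sketch of a cumulative lift table, computed from predicted churn scores and observed churn labels. It is a generic illustration and not the procedure used in the paper. The function name `lift_table`, the ten equal-sized bins, and the toy data are assumptions.

```python
def lift_table(scores, labels, n_bins=10):
    """Cumulative lift per bin, with subscribers ranked by descending score.

    scores: predicted churn scores, higher means more likely to churn.
    labels: 1 for an observed churner, 0 otherwise.
    """
    total = len(scores)
    total_churners = sum(labels)
    if total < n_bins or total_churners == 0:
        raise ValueError("need at least n_bins subscribers and one churner")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    base_rate = total_churners / total  # churn rate with no model at all

    rows, captured = [], 0
    for b in range(1, n_bins + 1):
        lo = (b - 1) * total // n_bins
        hi = b * total // n_bins
        captured += sum(label for _, label in ranked[lo:hi])
        rows.append({
            "bin": b,
            "targeted": hi,                        # subscribers contacted so far
            "captured": captured,                  # churners among them
            "recall": captured / total_churners,   # share of all churners reached
            "lift": (captured / hi) / base_rate,   # improvement over random targeting
        })
    return rows


# Toy data (made up): 20 subscribers, 4 churners, mostly ranked near the top.
toy_scores = [1 - i / 20 for i in range(20)]
toy_labels = [1 if i in (0, 1, 4, 11) else 0 for i in range(20)]
toy_rows = lift_table(toy_scores, toy_labels)
```

A marketer could read such a table top-down: a lift well above 1 in the first few bins suggests that a retention campaign limited to those subscribers would reach a disproportionate share of likely churners.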
AI: Deep Learning and Cognitive Computing to be the way forward?
One of the key highlights of the conference was an open discussion in which we brainstormed about the current and future scope of AI and how “Deep Learning” and “Cognitive Computing” are going to shape our future. It was interesting to hear from practitioners about their general scepticism towards deep learning a couple of decades ago, when data was too scant even to train a simple feed-forward neural network. Deep learning has since transformed the analytics landscape, thanks to the abundance of data and the sophistication of the techniques now available to train models for even the most complex problems.
One of the core issues of deep learning, “interpretability”, was given much attention during the discussion. Other issues such as the need for a large training set, longer training time, and the actionability-vs-accuracy trade-off were also highlighted. Apart from these, a few open-ended questions were also debated:
- Will AI ever have motivation the way human beings do?
- Will AI ever ask relevant questions it does not know the answer to?
- Will AI help in hypothesis generation rather than hypothesis validation based on a given context?
MLDM 2017
Unlike ICDM, MLDM was a bit more theoretical in nature, with a few industry use cases utilising machine learning techniques. Some of the key themes of this conference were optimisation in large-scale machine learning, deep learning in computer vision, anomaly detection and sequential pattern mining.
Paper presentation at MLDM 2017
The paper titled ‘High Accuracy Predictive Modelling for Customer Churn Prediction in Telecom Industry’ was presented at MLDM. This paper is an extension of the first paper: it analyses the efficacy of deep learning techniques on a telecom dataset in comparison with other standard linear and nonlinear machine learning techniques. The paper can be read here. Read more about this paper here.
An interesting guest lecture at MLDM
On the computer vision front, there was an interesting talk by Professor Petia Radeva from the University of Barcelona, who discussed how the nutritional habits (food intake behaviour) of users can be learned automatically via a deep learning framework, which could help in efficiently monitoring medical conditions such as diabetes and obesity. A novel and fast approach based on a Convolutional Neural Network (CNN) was proposed for detecting and recognising food in conventional images (pictures taken manually with mobile phones and other cameras) as well as egocentric images (pictures taken automatically by a wearable camera).
Professor Radeva discussed a novel algorithm for localising food-related objects, which can classify an image as food or non-food and discover bounding boxes containing generic food in a single image. She also discussed a food recognition algorithm that can learn to recognise the type of food present in an image.
Interaction with practitioners and professors
It was an enriching experience to meet people from industry bigwigs like Intel, General Motors and VMware and to learn about the problems they are trying to solve using data science. It was also great to meet professors from reputed universities like the National University of Singapore and Indiana University of Pennsylvania and to learn about their research areas.
Overall, it was a great learning experience.