Case Summary


To predict persistency of policy owners to renew thei

People Deployed

Project Manager, Functional Consultant, Database Architect, Statisticians

Processes Deployed

Variable Identification Process, Algorithm Identification Process, Testing Process, Production Deployment Process

Technologies Used

IBM Predictive Intelligence

Analytics Used

Persistency Modelling using IBM SPSS


The client is one of India’s growing life insurance companies. It is a three-way joint venture between an Indian development and commercial bank, one of India’s leading private sector banks, and a multinational insurance giant based in Europe. The client was looking for a robust automated solution to identify potential early claimers in advance, so that further investigation could be carried out before insuring them. In addition, they wanted a robust automated solution to identify potentially non-persistent customers in advance, so that targeted campaigns could help retain them.


The client wanted to leverage predictive models to identify the propensity of customers to be cross-sold other products and to calculate customers’ policy persistency. However, the client had never used predictive modelling before, so considerable effort was required to help business units understand the modelling workflow. In addition, data was maintained in different formats and locations by different business units.


As part of Xerago’s Digital & Data Analytics Services, Xerago built a cross-sell model and policy persistency models. To build these models, Xerago aggregated data from over eight different sources and created a single view of the data with one policy per row.
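The "one policy per row" view can be illustrated with a minimal sketch. It assumes every source carries a shared `policy_id` key. The source names, field names and values below are hypothetical, and this is not Xerago's actual pipeline, which the case study does not describe in detail.

```python
from collections import defaultdict


def build_single_view(sources):
    """Merge records from several sources into one row (dict) per policy_id.

    `sources` maps a source name to a list of record dicts. Column names are
    prefixed with the source name so that fields from different sources do
    not collide. If a source holds several records for one policy, the last
    one wins in this simplified sketch.
    """
    view = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            policy_id = record["policy_id"]
            for key, value in record.items():
                if key != "policy_id":
                    view[policy_id][f"{source_name}.{key}"] = value
    return {pid: {"policy_id": pid, **cols} for pid, cols in view.items()}


# Hypothetical toy data standing in for two of the data sources.
sources = {
    "policy": [
        {"policy_id": "P1", "premium": 12000},
        {"policy_id": "P2", "premium": 8000},
    ],
    "payments": [
        {"policy_id": "P1", "late_payments": 0},
        {"policy_id": "P2", "late_payments": 3},
    ],
}
single_view = build_single_view(sources)
```

In practice a source with many records per policy (for example, payment history) would need to be summarised to policy level before or during the merge.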

As part of data preparation, Xerago completed missing data, excluded outliers and calculated derived parameters. For the policy persistency model, the due data table was used as the base, and data from other sources was aggregated and transformed before being used for model building.
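The three preparation steps can be sketched as follows. The case study does not say which techniques were used, so median imputation, an interquartile-range (IQR) rule for outliers and a `days_late` derived parameter are all assumptions chosen for illustration, as are the field names and the data.

```python
from datetime import date
from statistics import median, quantiles


def impute_median(records, field):
    """Fill missing (None) values of `field` with the median of known values."""
    known = [r[field] for r in records if r[field] is not None]
    fill = median(known)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


def drop_outliers(records, field, k=1.5):
    """Exclude rows whose `field` lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([r[field] for r in records], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in records if low <= r[field] <= high]


def add_days_late(records):
    """Derived parameter: days between the due date and the payment date."""
    return [{**r, "days_late": (r["paid_on"] - r["due_on"]).days} for r in records]


# Hypothetical premiums: one missing value and one extreme value.
premiums = [9000, 10000, 11000, None, 12000, 10500, 11500, 500000]
records = [
    {
        "policy_id": f"P{i}",
        "premium": premium,
        "due_on": date(2020, 1, 1),
        "paid_on": date(2020, 1, 1 + i),
    }
    for i, premium in enumerate(premiums)
]

prepared = add_days_late(drop_outliers(impute_median(records, "premium"), "premium"))
```

Imputing before excluding outliers keeps the row with the missing premium in the data set. The order of the steps is a design choice, not something the case study specifies.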

Xerago then performed an audit to determine all the predictor variables that have a direct relationship with the target variables.
Of the available data, 70% was used for training and 30% for testing. CHAID, C5 decision tree, neural network and logistic regression algorithms were used for model building, and CHAID gave better results than the others on both models. Xerago used AUC and Gini benchmarks to evaluate the models and determine the importance of predictor inputs.
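The 70/30 split and the two evaluation measures can be sketched in a few lines. The Gini coefficient used in model evaluation is related to AUC by Gini = 2 × AUC − 1. The labels and scores below are hypothetical, and the code does not train CHAID or any other model. It only shows how the split and the benchmarks would be computed on whatever scores a model produces.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of `rows` and split it into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2), which is fine
    for an illustration but slow on large data sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient derived from AUC: Gini = 2 * AUC - 1."""
    return 2 * auc(labels, scores) - 1


# Hypothetical data: 10 rows to split, 6 scored test cases to evaluate.
train, test = train_test_split(list(range(10)))
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
model_auc = auc(labels, scores)
model_gini = gini(labels, scores)
```

An AUC of 0.5 (Gini of 0) corresponds to random ranking, and an AUC of 1 (Gini of 1) to a perfect ranking of positives above negatives.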


The policy persistency models showed

  • 94.76% accuracy, i.e. of the actual On-Timers, the share of customers the model predicted correctly (a measure usually called recall), and
  • 68.11% precision, i.e. of the predicted On-Timers, the share of customers who were actual On-Timers
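The two measures above can be computed from actual and predicted labels as in the sketch below. The label values and the toy data are hypothetical and do not reproduce the reported figures. The first measure is named `recall` here because it is defined relative to the actual On-Timers.

```python
def on_timer_metrics(actual, predicted, positive="on_time"):
    """Return recall and precision for the `positive` class.

    recall    = correctly predicted On-Timers / actual On-Timers
    precision = correctly predicted On-Timers / predicted On-Timers
    """
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    return {"recall": tp / (tp + fn), "precision": tp / (tp + fp)}


# Hypothetical labels for five customers.
actual = ["on_time", "on_time", "on_time", "late", "late"]
predicted = ["on_time", "on_time", "late", "on_time", "late"]
metrics = on_timer_metrics(actual, predicted)
```

A high recall combined with a lower precision, as reported above, means the model finds most actual On-Timers but also labels a number of late payers as On-Timers.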

The cross-sell model produced a total of 1,58,999 product recommendations across 9 product lines, with a maximum proportion of 25.5% and a minimum of 3.1%.
