What are the pros and cons of segmentation prior to analytical model building?

By: Bart Baesens, Seppe vanden Broucke

This Q&A first appeared in Data Science Briefings, the DataMiningApps newsletter, as a “Free Tweet Consulting Experience”, where we answer a data science or analytics question of 140 characters maximum. Want to submit your own question? Just tweet us @DataMiningApps. Want to remain anonymous? Then send us a direct message and we’ll keep all your details private. Subscribe now for free if you want to be the first to receive our articles and stay up to date on data science news, or follow us @DataMiningApps.

You asked: What are the pros and cons of segmentation prior to analytical model building?

Our answer:

Segmenting the data before building analytical models can be worthwhile for several reasons. The first is strategic: e.g., a bank might want to adopt special strategies for specific segments of customers. The second is operational: e.g., new customers may need a separate model because the characteristics in the standard model do not make sense operationally for them. The third is statistical: segmentation can take significant variable interactions into account. E.g., if one variable strongly interacts with a number of others, it might be sensible to segment on that variable.

The segmentation can be based on the experience and knowledge of a business expert, or on statistical analysis using, e.g., decision trees, k-means clustering or self-organising maps.
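To make the statistical route more concrete, here is a minimal sketch of k-means clustering in plain Python. Everything in it is an assumption for illustration: the `kmeans` function, the `customers` list and its two features (age and monthly income in thousands) are hypothetical, the centroids are simply initialised with the first k points, and the features are used unscaled. In practice one would typically rely on a library implementation and standardise the variables first.

```python
import math

def kmeans(points, k, n_iter=100):
    """Toy k-means: returns (centroids, labels) for a list of numeric tuples."""
    # Assumption: initialise with the first k points; libraries use smarter seeding.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each customer goes to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster as-is
        # Stop once neither the assignments nor the centroids change.
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return centroids, labels

# Hypothetical customers: (age, monthly income in thousands).
customers = [(25, 2.0), (62, 6.5), (27, 2.4), (60, 7.0), (23, 1.8), (65, 6.8)]

centroids, labels = kmeans(customers, k=2)
print(labels)  # segment index per customer: [0, 1, 0, 1, 0, 1]
```

Each resulting label can then serve as the segment identifier for which a separate model is estimated. A business expert should still check that the segments found this way make sense.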

Segmentation is a useful pre-processing activity because it allows one to estimate a separate analytical model tailored to each segment. However, it should be applied with care: every additional segment means another model to estimate, which in turn increases the production, monitoring and maintenance costs.
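The trade-off can be illustrated with a small sketch, again in plain Python and under stated assumptions: the `rows` data, the segment names and the `fit_line` helper are all hypothetical, and a simple least-squares line stands in for whatever analytical model would be used in practice. The toy data is constructed so that the relationship between x and y reverses between the two segments, which is an extreme case of a variable interaction.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical rows: (segment, x, y).
# In "new" the target rises with x; in "existing" it falls.
rows = [
    ("new", 1, 2.0), ("new", 2, 4.0), ("new", 3, 6.0),
    ("existing", 1, 6.0), ("existing", 2, 4.0), ("existing", 3, 2.0),
]

# One pooled model on all rows: the opposite effects cancel out.
pooled = fit_line([x for _, x, _ in rows], [y for _, _, y in rows])

# One model per segment: each captures its own relationship.
by_segment = defaultdict(list)
for seg, x, y in rows:
    by_segment[seg].append((x, y))

segment_models = {
    seg: fit_line([x for x, _ in pts], [y for _, y in pts])
    for seg, pts in by_segment.items()
}

print("pooled slope:", pooled[1])
for seg, (a, b) in segment_models.items():
    print(seg, "slope:", b)
print("models to maintain:", len(segment_models))
```

On this toy data the pooled model finds no effect of x at all, while the segment models recover a slope of 2 and -2 respectively. The price is visible in the last line: two models to put into production, monitor and maintain instead of one, and that count grows with every extra segment.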