Critical Reflections on In versus Outsourcing Data Science

By: Bart Baesens, Seppe vanden Broucke

This article first appeared in Data Science Briefings, the DataMiningApps newsletter.

With data continuing to pile up, managing and analyzing these data resources as effectively as possible has become a critical success factor in creating competitive advantage and strategic leverage. To address these challenges, companies are hiring data scientists. The data scientist is a relatively new job profile that combines a well-balanced mix of quantitative, programming, business, communication and visualization skills. It goes without saying that these profiles are hard to find in today’s job market. Universities are jumping on the Big Data & Analytics bandwagon en masse and have started offering various MSc programmes in Big Data and Analytics to close the gap.

The shortage of skilled talent and data scientists in Western Europe and the US has raised the question of whether to outsource analytical activities. This need is further amplified by competitive pressure to reduce time to market and lower costs. Companies need to choose between insourcing (building the analytical skill set internally, at either the corporate or the business line level), outsourcing all analytical activities, or an intermediate solution whereby only part of the analytical activities is outsourced. The dominant players in the analytics outsourcing market are India, China and Eastern Europe, with some other countries (e.g. the Philippines, Russia, South Africa) gaining ground as well.

Various analytical activities can be considered for outsourcing, ranging from the heavy-lifting grunt work (e.g. data collection, cleaning and preprocessing), analytical platforms (hardware and software), and training and education, to the more complex analytical model construction, visualization, evaluation, monitoring and maintenance. Companies may choose to grow organically and outsource analytical activities step by step, or immediately go for the full package of analytical services. It goes without saying that the latter strategy carries more risk and should thus be evaluated more carefully and critically.

Despite the benefits of outsourcing analytics, it should be approached with a clear strategic vision, critical reflection and awareness of all the risks involved. First of all, outsourcing analytics differs from outsourcing traditional ICT services in that analytics concerns a company’s front-end strategy, whereas most ICT services are part of a company’s back-end operations. Another important risk is the exchange of confidential information. Intellectual Property (IP) rights and data security issues should be clearly investigated, addressed and agreed upon. Moreover, all companies have access to the same analytical techniques, so they are differentiated only by the data they provide. Hence, an outsourcer should provide clear guidelines and guarantees about how intellectual property and data will be managed and protected (using e.g. encryption techniques, firewalls), especially if the outsourcer collaborates with various companies operating in the same industry sector.
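As one small illustration of the data protection measures mentioned above, a buyer could replace direct customer identifiers with keyed hashes before any data leaves the company. The sketch below is only an assumption about how this might look: it uses pseudonymization with Python's standard hmac module rather than encryption proper, and the key, the pseudonymize helper and the record fields are all hypothetical. It does not replace contractual IP and security agreements.

```python
import hashlib
import hmac

# Hypothetical secret, kept by the buyer and never shared with the outsourcer.
SECRET_KEY = b"replace-with-a-randomly-generated-key"


def pseudonymize(customer_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest that stands in for the real identifier."""
    return hmac.new(key, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical records; the field names are illustrative only.
records = [
    {"customer_id": "BE-000123", "income": 42000, "default": 0},
    {"customer_id": "BE-000456", "income": 31000, "default": 1},
]

# Copy each record, swapping the real identifier for its pseudonym.
shared = [{**r, "customer_id": pseudonymize(r["customer_id"])} for r in records]
```

Because the same identifier always maps to the same digest, the outsourcer can still link records belonging to one customer, while only the holder of the key can recompute the mapping from real identifiers.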

Consider an example of two banks, A and B, working with outsourcer XYZ to develop their analytical credit risk models. Bank A invested in state-of-the-art data quality solutions whereas bank B did not. Outsourcer XYZ can now use the high-quality data from bank A to build high-performing analytical credit risk models and sell those to bank B as well, thereby diluting the competitive advantage of bank A. This danger is further amplified by the many mergers and acquisitions witnessed in the outsourcing sector. Furthermore, many of these outsourcers face high employee turnover due to intensive work schedules, the boredom of performing low-level activities on a daily basis, and aggressive headhunters chasing these hard-to-find data science profiles. This attrition problem seriously undermines the continuity of the partnership and a thorough, long-term understanding of a customer’s analytical business processes and needs.

Another often-cited complexity concerns the cultural mismatch (e.g. time management, different languages, local versus global issues) between the buyer and the outsourcer. Exit strategies should also be clearly agreed upon. Many analytical outsourcing contracts run for 3 to 4 years. It should be agreed in advance how the analytical models and knowledge will be transferred to the buyer when these contracts expire, so as to ensure business continuity.

Finally, the shortage of data scientists in the US and Western Europe will also apply, and might even be worse, in the countries providing outsourcing services. These countries typically have universities with good statistical education and training programmes, but their graduates lack the necessary business skills, insights and experience to make a strategic contribution with analytics.

Given the above considerations (i.e. cost of management, IP concerns, attrition, cultural mismatch, and partner countries experiencing their own local shortage of data scientists), many firms currently adopt a partial outsourcing strategy, whereby baseline, operational analytical activities such as query and reporting, multidimensional data analysis and OLAP are outsourced, whereas the advanced descriptive and predictive analytical skills are developed and managed in-house.