This article first appeared in Data Science Briefings, the DataMiningApps newsletter.
A key question puzzling many firms that are starting to dip their toes into analytics is whether they should develop models internally or buy an externally developed model from a consulting provider. Developing analytical skills and an analytical mindset in house is costly and resource intensive, so an externally developed solution can be tempting as a way to obtain quick results, grab the low-hanging fruit and compete on analytics. Still, such external analytical models can also impose an additional burden of risk, for various reasons.
A first one is that analytical models are increasingly used to steer the strategic decisions of the firm. Hence, in our opinion, it is highly recommended to keep all the knowledge about the data, its preprocessing, and the model building, evaluation and deployment in-house. Externalizing your key strategic assets is not a decision that should be taken lightly.
Next, the risk of vendor lock-in is quite substantial. External model providers know that once you have opted for them, a tightly coupled and lucrative relationship is created in which your switching costs become quite high should you at some point become unhappy with the model or service provided. The external provider may also strongly encourage or subtly pressure you to update the model regularly, and as such create an unwanted recurring cost. In the settings where we have seen this, the external provider almost never provided the full model details to their client (e.g., parameters estimated, transformations used, etc.), so as to keep the link permanent to their own benefit. Another evident risk is that the external provider goes out of business.
Finally, there is the issue of data transfer. As said many times, data is one of your key competitive assets, and transferring it outside the boundaries of your firm, even when all necessary privacy and/or anonymization guarantees have been provided and signed, should be considered with caution.
Does this imply that external models have no added value at all? Definitely not! Externally developed models can, for example, be a handy instrument to challenge your internal models and see whether the external models perform better, perhaps even on particular subgroups of the population. Insights obtained from these benchmarking studies can then be used to further perfect your internal models. In other cases, external models can be a low-cost and efficient solution to deal with more unstructured data sources, such as the text and image deep learning models provided by Google, Amazon, and many other vendors. Such offerings can be an attractive way to set up use cases around these data types, or to generate additional features which can then be included in internal models.
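To make the benchmarking idea a bit more concrete, here is a minimal sketch of how an internal and an external model could be compared per subgroup. It is an illustration under assumptions, not a prescribed method: the AUC is just one possible performance measure, the field names (`segment`, `label`, `internal_score`, `external_score`) are hypothetical, and the records are made-up toy data.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Returns None when a group has only one class.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def benchmark_by_subgroup(records):
    """Compute the AUC of both models within each segment."""
    groups = defaultdict(list)
    for row in records:
        groups[row["segment"]].append(row)

    report = {}
    for segment, rows in groups.items():
        labels = [r["label"] for r in rows]
        report[segment] = {
            "internal": auc(labels, [r["internal_score"] for r in rows]),
            "external": auc(labels, [r["external_score"] for r in rows]),
        }
    return report


# Made-up toy data: hypothetical segments, labels and scores.
records = [
    {"segment": "retail", "label": 1, "internal_score": 0.9, "external_score": 0.7},
    {"segment": "retail", "label": 0, "internal_score": 0.2, "external_score": 0.8},
    {"segment": "retail", "label": 1, "internal_score": 0.6, "external_score": 0.6},
    {"segment": "retail", "label": 0, "internal_score": 0.4, "external_score": 0.1},
    {"segment": "sme", "label": 1, "internal_score": 0.3, "external_score": 0.8},
    {"segment": "sme", "label": 0, "internal_score": 0.5, "external_score": 0.3},
    {"segment": "sme", "label": 0, "internal_score": 0.2, "external_score": 0.1},
    {"segment": "sme", "label": 1, "internal_score": 0.6, "external_score": 0.9},
]

report = benchmark_by_subgroup(records)
for segment, result in sorted(report.items()):
    print(segment, result)
```

A segment in which the external model clearly outperforms the internal one is a natural starting point for investigating what the internal model is missing there. In practice, such a comparison would be run on a sizeable out-of-sample data set rather than on a handful of records.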