How to Organize Your Data Science Team?

Contributed by: Jasmien Lismont, Jan Vanthienen, Bart Baesens, Wilfried Lemahieu

This article first appeared in Data Science Briefings, the DataMiningApps newsletter.


As data analytics becomes more widely recognized and valued, companies are starting to develop their own data science teams. In a recent study [1], we found that analytics teams are generally small but growing. This means that companies need to start thinking about how to organize these teams. Several important elements come into play here. The necessary resources (software, hardware, training) need to be available, and management needs to work hard on establishing a data-driven culture in order to truly capture the potential of data and not scare away motivated data scientists. In this blog post we zoom in on two other important perspectives of analytics, both specifically mentioned in the DELTA framework [2]: leadership and the organizational perspective. Finally, we briefly discuss an alternative concept that is gaining traction, the citizen data scientist.

Firstly, analytics is gaining importance at board level [1]. We observed that in almost one third of companies an existing chief-level officer is taking on this responsibility, and in almost 25% an entirely new function has been created. Figure 1 shows the growing interest in job titles such as chief analytics officer (CAO) and chief data officer (CDO) compared to the already established chief information officer (CIO). Having an analytics champion emphasizes that management values data and wants to further establish an analytics culture.

Figure 1: Google Trends data on “chief analytics officer” (CAO), “chief data officer” (CDO), “chief financial officer” (CFO), and “chief information officer” (CIO) from 2007 to 2017.

Secondly, there are several ways to organize the data scientists themselves. The following categorization can be applied [2]: a centralized model, a consulting model, a functional model, a center of excellence (CoE), and a decentralized model.

  • In a centralized model, the company has one corporate organization for analytics. Clear advantages are scalability and repeatability of resources (data scientists, models, etc.). However, in large companies, the distance between the team and the business units might be large, leading to miscommunication.
  • In the consulting model, business units or departments can ‘hire’ internal data scientists as consultants for their analytical projects. A clear advantage is that projects are driven by internal demand. Moreover, the data scientists can also advise and train their colleagues. However, in this type of model companies need to be careful that priority is not simply given to the departments with the highest budgets.
  • Next, we have a functional model, where each department has its own data science team. Data scientists can migrate between departments and, as such, transfer their knowledge and stay close to the business. However, the level of engagement of data scientists can potentially be lower because they frequently change environment. Moreover, during our research, we discovered that rotational deployment of data scientists is not very popular [1].
  • Another way to organize your analytics team is by means of a CoE. This format lies between a centralized model and a functional model. Data scientists are not employed in a corporate unit, but are nevertheless consolidated in a center of excellence. As such, the advantages of a community with shared knowledge and education are still available.
  • Finally, companies can have a decentralized model, in which data scientists are not organized in any specific way. This model can only be advantageous in very diversified organizations where no gains can be made by organizing analytics. In general, this model is considered to be the least mature, although it is the most common format [2].

We found that most companies prefer a centralized approach, either by means of a truly centralized team or a CoE. However, we also noticed that companies combine different formats, and we emphasize that it might be necessary to change your organizational format as your company and/or competitive environment evolves [1].

Finally, we want to touch upon the topic of the ‘citizen data scientist’. Many researchers as well as practitioners have observed a shortage of data scientists [e.g. 3,4]. At the same time, we observe a trend towards making analytics more accessible [5], and a need has developed to empower novice business users to apply analytics [6]. This introduces the concept of a citizen data scientist: an employee who does not hold the role of a data scientist or statistician but nevertheless carries out some analytical work. Gartner predicts that “citizen data scientists will surpass data scientists in amount of advanced analysis produced by 2019” [7]. Although we don’t believe data scientists can be replaced by tools, services offering forms of automated analytics, i.e. Analytics-as-a-Service (AaaS), can offer benefits to companies. In an experiment we performed, in which both novices and experts solved an analytical task by means of AaaS, we found that experts still significantly outperform novices but that the service does add an extra advantage [8]. AaaS encourages collaboration between business users and analytics experts in companies. As such, it facilitates the application of analytics and frees up scarce expert knowledge. Moreover, it allows for input from business experts, for example in the domain of marketing, finance or HR, to improve analytical models. This also benefits analytics organization formats such as the consulting model or even the centralized model, as business and analytics come closer together. Lastly, thanks to the usage-based pricing of these cloud services, analytics also becomes more accessible and affordable.

To provide some concluding guidelines:

  • Carefully think about analytics leadership and consider appointing an analytics champion and how s/he fits within management.
  • Some form of central analytics organization will help to leverage analytics talent and offers economies of scale.
  • Regularly re-evaluate your analytics organization and how it fits within your internal culture and the competitive environment.
  • Consider alternative formats, such as analytics-as-a-service, which might facilitate collaboration between data scientists and business experts. Moreover, analytics-as-a-service might further strengthen your organizational format for analytics.

References

  1. Lismont, J., Vanthienen, J., Baesens, B., and Lemahieu, W. (2017). Defining analytics maturity indicators: A survey approach. International Journal of Information Management, 37: 114-124.
  2. Davenport, T. H., Harris, J. G., and Morison, R. (2010). Analytics at work: Smarter decisions, better results. Boston, MA: Harvard Business Review Press.
  3. Zorrilla, M., and García-Saiz, D. (2013). A service oriented architecture to provide data mining services for non-expert data miners. Decision Support Systems, 55(1): 399-411.
  4. Chen, H., Chiang, R. H., and Storey, V. C. (2012). Business intelligence and analytics: From big data to big impact. MIS Quarterly, 36(4): 1165-1188.
  5. Gartner (2015). Magic quadrant for business intelligence and analytics platforms.
  6. Alpar, P., and Schulz, M. (2016). Self-service business intelligence. Business & Information Systems Engineering, 58(2): 151-155.
  7. Moore, S. (2017). Gartner Says More Than 40 Percent of Data Science Tasks Will Be Automated by 2020. Gartner.
  8. Van Calster, T., Lismont, J., Oskarsdottir, M., vanden Broucke, S., Vanthienen, J., Lemahieu, W., and Baesens, B. (2016). Automated analytics: The organizational impact of analytics-as-a-service. In: Proceedings of the EI-KDD16 workshop – 1st workshop on enterprise intelligence in conjunction with the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, forthcoming, San Francisco, USA, August 2016.