Predictive Analytics as a Service: a Poll

By: Jasmien Lismont, Tine Van Calster, María Óskarsdóttir, Jan Vanthienen, Bart Baesens, Wilfried Lemahieu

This article first appeared in Data Science Briefings, the DataMiningApps newsletter.


During April 2015, we conducted a poll on KDnuggets researching the application of machine learning APIs (ML APIs) for analytics. The poll received 53 responses, which are presented in this report.

The chart below indicates which APIs are used by more than 10% of the respondents. Note that respondents could select more than one answer:

[Figure: APIs used by more than 10% of the respondents]

Strikingly, Indico, a text and image analytics tool, is used by a large share of respondents (41.3%). This might indicate that text and image mining techniques are either lacking in stand-alone tools or too complex to use there. A similar tool is Alchemy API (10.9%). Next in line is Microsoft’s Azure (19.6%), which allows both quick, user-friendly analytics and more elaborate analytics using R or Python.

Other tools focus on scalable advanced analytics through cloud connectivity or parallel processing, such as GraphLab (17.4%), H2O (17.4%), PredictionIO (15.2%) and Google’s Prediction API (10.9%). Notably, BigML also scores relatively high (13%). BigML focuses on rapid, easy-to-use analytics and is limited to a few problem settings and techniques.

Furthermore, we analyzed the top three preferred APIs:

Again, Indico scores well (47.6%), as do GraphLab (16.7%), H2O (16.7%), Azure (16.7%) and BigML (14.3%). This might indicate that analysts are trying out different tools although they do not always prefer those tools in the end. Moreover, it might also indicate, on the one hand, a strong need and preference for text and image mining tools and, on the other hand, a need for easy-to-use, scalable tools.

[Figure: Top three preferred APIs]

Usage of stand-alone tools next to ML APIs

Nearly half of the 49 respondents (49%) combine stand-alone tools with ML APIs. A further 20.4% use only one API and 16.3% combine multiple APIs, while the remaining 14.3% apply only stand-alone tools.
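
As a rough sanity check on these shares, the percentages can be converted back into approximate respondent counts. The sketch below assumes that every share was computed over all 49 respondents and rounded to one decimal place; the helper name implied_counts is ours for illustration and is not part of the poll.

```python
def implied_counts(shares_pct, n_respondents):
    """Convert percentage shares to the nearest whole number of respondents."""
    return {
        label: round(pct / 100 * n_respondents)
        for label, pct in shares_pct.items()
    }


# Percentages and sample size as reported in the text.
usage = {
    "stand-alone tools combined with ML APIs": 49.0,
    "only one API": 20.4,
    "multiple APIs": 16.3,
    "only stand-alone tools": 14.3,
}

counts = implied_counts(usage, 49)
for label, count in counts.items():
    print(f"{label}: about {count} respondents")

# If the assumption holds, the implied counts should add up to the sample size.
print("total:", sum(counts.values()))
```

Under that assumption the four groups correspond to roughly 24, 10, 8 and 7 respondents, which together account for all 49 answers to this question.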

Perceived security of ML APIs

A majority of the 49 respondents (53.1%) believe the tools are secure enough, though more than a quarter (28.6%) believe that an API needs to offer a private cloud option before they can trust its security. The remaining 18.4% do not believe APIs are secure enough at this moment. This might suggest that API providers need to investigate how they can ensure the security of their APIs.

Motivation for ML API usage

Finally, we take a look at the motivation for applying certain APIs: almost 60% (57.8%) of 45 respondents use ML APIs because they are easy to use. This might explain the preference for tools such as BigML. Furthermore, almost 50% want access to known and proven algorithms and are motivated by fast development. Scalability (33.3%) and easy evaluation (31.1%) complete the top five motivations. Budget considerations might also come into play, since APIs often demand lower initial investments and might require less management. Accessibility and good integration with other tools also explain the use of APIs. Less important are end-to-end support for business problems (which might signal that APIs are often combined with other tools), pooled resources, data storage solutions, visualization, and sharing options.