Web Picks (week of 4 April 2016)

Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.


  • Druid
    Another promising data solution: Druid is a fast column-oriented distributed data store.
  • Caravel
    Caravel, from Airbnb, is a data exploration platform designed to be visual, intuitive and interactive.
  • dimple
    dimple is an object-oriented API for business analytics powered by d3.
  • Five Lessons from AlphaGo’s Historic Victory
    News about AlphaGo is still going strong: after Google’s computer crushed one of humanity’s best Go players, we learned a lot about the software’s inner workings and what it means for AI.
  • Containerized Data Science and Engineering (part 1 and part 2)
    Let’s admit it, data scientists are developing some pretty sweet models, optimizations, visualizations, etc. Unfortunately, many of these models will never actually be used because they cannot be “productionized.” In fact, much of the “data science” happening in industry is happening in isolation on data scientists’ laptops, and, when data science applications are actually deployed, they are often deployed as hacky Python/R scripts uploaded to AWS and run as a cron job.
  • The Real Value of Containers for Data Science
    This is the real value of containers in data science: the ability to capture an experiment’s state (data, code, results, package versions, parameters, etc.) at a point in time, making it possible to reproduce an experiment at any stage in the research process.
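
To make the last pick a bit more concrete, here is a minimal Python sketch of what "capturing an experiment's state" could look like without a container. It is not taken from the linked article: the function name `snapshot_experiment`, the record fields, and the sample parameters and data are all hypothetical, chosen only for illustration. A container image goes further by also freezing the operating system and system libraries; this sketch only records a fingerprint of the data, the parameters, and the installed package versions.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone
from importlib import metadata


def snapshot_experiment(params, data_bytes, packages=()):
    """Return a JSON-serialisable record of an experiment's state.

    Hypothetical helper for illustration; the field names are assumptions.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "python": platform.python_version(),
        "platform": sys.platform,
        "params": params,
        # A hash identifies the exact data used without storing a copy of it.
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "package_versions": versions,
    }


# Illustrative inputs only; a real experiment would pass its own data and settings.
record = snapshot_experiment(
    params={"learning_rate": 0.1, "seed": 42},
    data_bytes=b"x,y\n1,2\n3,4\n",
    packages=["pip", "a-package-that-is-not-installed"],
)
print(json.dumps(record, indent=2, sort_keys=True))
```

Storing a record like this next to each result is a lightweight way to check later whether two runs really used the same data and settings, although it cannot recreate the environment the way a container image can.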