Web Picks (week of 18 May 2015)

Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.

  • Project Jupyter: the new name for IPython notebooks
    The language-agnostic parts of IPython are getting a new home in Project Jupyter, meaning that it will become easier to use the popular web-based “data science notebook” with a variety of language kernels, including R and Julia.
  • Optimize hyperparameters with Spearmint
    Spearmint is a Python library that optimizes hyperparameter configurations using Bayesian optimization, a more advanced alternative to simple grid-based search that keeps the number of experiments to be run under control and reaches an optimal configuration more effectively.
  • Google announces Google Cloud Bigtable
    Google has recently announced the availability of its “Google Cloud Bigtable” product, a high-performance, extremely scalable NoSQL database service accessible through the open-source Apache HBase API. According to Google, Bigtable already drives nearly all of the company’s largest applications, shows better performance than other NoSQL alternatives, and can be natively integrated with a Hadoop stack.
  • “The Data Science Handbook” released
    Not a technical handbook, but rather a “compilation of in-depth interviews with 25 remarkable data scientists, where they share their insights, stories, and advice.”
  • Cigarettes, damn cigarettes and statistics
    In his recent article, Tim Harford makes a case for correlation: “Large data sets can throw up intriguing correlations that may be good enough for some purposes. (Who cares why price cuts are most effective on a Tuesday? If it’s Tuesday, cut the price.) Andy Haldane, chief economist of the Bank of England, recently argued that economists might want to take mere correlations more seriously. He is not the first big-data enthusiast to say so.”