Web Picks (week of 17 October 2016)

Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.

  • When is data science a house of cards?
    “As data scientists, when we reach an answer, we often communicate that answer and move on. But what happens when there are multiple data scientists with varying answers?”
  • Deep Reinforcement Learning From Raw Pixels in Doom (paper)
    “Using current reinforcement learning methods, it has recently become possible to learn to play unknown 3D games from raw pixels. In this work, we study the challenges that arise in such complex environments, and summarize current methods to approach these.”
  • Open Sourcing 223GB of Driving Data
    A necessity in building an open source self-driving car is data. Lots and lots of data. Udacity now releases 223GB of image frames and log data from 70 minutes of driving in Mountain View on two separate days, with one day being sunny, and the other overcast.
  • How to Use t-SNE Effectively
    Amazing visualization! “Although extremely useful for visualizing high-dimensional data, t-SNE plots can sometimes be mysterious or misleading. By exploring how it behaves in simple cases, we can learn to use it more effectively.”
  • RStudio announces R Notebooks
    “Today we’re excited to announce R Notebooks, which add a powerful notebook authoring engine to R Markdown. Notebook interfaces for data analysis have compelling advantages including the close association of code and output and the ability to intersperse narrative with computation. Notebooks are also an excellent tool for teaching and a convenient way to share analyses.”
  • pandasql: Make python speak SQL
    This post is about pandasql, a Python package Yhat wrote that emulates the R package sqldf. It’s a small but mighty library composed of just 358 lines of code. The idea of pandasql is to make Python speak SQL. For those who come from a SQL-first background, pandasql is a nice way to take advantage of the strengths of both languages.
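
To give a feel for the idea behind the pandasql pick, here is a minimal sketch of running SQL against a pandas DataFrame. It does not use pandasql itself: it loads the frame into an in-memory SQLite database (the engine pandasql uses by default) and queries it with plain pandas. The helper `run_sql`, the `orders` table, and its columns are hypothetical names made up for this illustration.

```python
import sqlite3

import pandas as pd


def run_sql(query, frames):
    """Run a SQL query against a dict of {table_name: DataFrame}.

    Hypothetical helper for illustration only; pandasql exposes its own
    `sqldf` function for this purpose.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Copy each DataFrame into a SQLite table of the same name.
        for name, frame in frames.items():
            frame.to_sql(name, conn, index=False)
        # Execute the query and return the result as a new DataFrame.
        return pd.read_sql_query(query, conn)
    finally:
        conn.close()


# Made-up example data.
orders = pd.DataFrame(
    {
        "customer": ["ann", "bob", "ann", "cid"],
        "amount": [10.0, 5.0, 7.5, 3.0],
    }
)

totals = run_sql(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY total DESC",
    {"orders": orders},
)
print(totals)
```

With pandasql installed, the equivalent call would be along the lines of `sqldf(query, locals())`, which looks up DataFrames by name in the environment you pass in.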