Web Picks (week of 23 July 2018)

Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.

  • Microsoft Urges Congress to Regulate Use of Facial Recognition
    “In a lengthy blog post about the potential uses and abuses of facial recognition, Bradford L. Smith, the company’s president, compared the technology to products like medicines and cars that are highly regulated, and he urged Congress to study it and oversee its use.”
  • Inside China’s Dystopian Dreams: A.I., Shame and Lots of Cameras
    “With millions of cameras and billions of lines of code, China is building a high-tech authoritarian future. Beijing is embracing technologies like facial recognition and artificial intelligence to identify and track 1.4 billion people. It wants to assemble a vast and unprecedented national surveillance system, with crucial help from its thriving technology industry.”
  • How to Survive America’s Kill List
    At first glance, this article doesn’t seem related to data at all. Or does it? “In 2014, former CIA and NSA director Michael Hayden said in a public debate, ‘We kill people based on metadata.’ According to multiple reports and leaks, death-by-metadata could be triggered, without even knowing the target’s name, if too many derogatory checks appear on their profile.” Quite a terrifying long read.
  • An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution
    “Surprisingly, it turns out that convolution often has difficulty completing seemingly trivial tasks.” This work from Uber Engineering made quite a splash last week, though some have argued that it exemplifies much of what is currently wrong with deep learning research.
  • Prediction versus Accommodation
    A great article covering aspects such as “the Endorsement of Novelty and the Confirmation of Background Beliefs”.
  • From shallow to deep learning in fraud
    “A Research Scientist’s journey through hand-coded regressors, pickled trees, and attentive neural networks.”
  • The Mythos of Model Interpretability
    “In machine learning, the concept of interpretability is both important and slippery.”
  • How should we evaluate progress in AI?
    “Can’t AI make up its mind about what it is trying to do? Can’t it just decide to be something respectable—science or engineering—and use a coherent set of evaluation criteria drawn from one of those disciplines?”
  • Measuring abstract reasoning in neural networks
    “Neural network-based models continue to achieve impressive results on longstanding machine learning problems, but establishing their capacity to reason about abstract concepts has proven difficult.”
  • Given a satellite image, machine learning creates the view on the ground
    Geographers could use the technique to determine how land is used.
  • Why businesses fail at machine learning
    “I’d like to let you in on a secret: when people say ‘machine learning’ it sounds like there’s only one discipline here. There are two, and if businesses don’t understand the difference, they can experience a world of trouble.”
  • Do Bayesians Overfit?
    “TLDR: Yes, and there are precise results, although they are not as well known as they perhaps should be.”
  • Why RL is flawed
    “In this essay, we are going to address the limitations of one of the core fields of AI. In the process, we will encounter a fun allegory, a set of methods of incorporating prior knowledge and instruction into deep learning, and a radical conclusion.”
  • Delayed Impact of Fair Machine Learning (paper)
    “Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time. Conventional wisdom suggests that fairness criteria promote the long-term well-being of those groups they aim to protect. We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We demonstrate that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not.”
  • A Practical Taxonomy of Reproducibility for Machine Learning Research (paper)
    “In this paper we discuss a taxonomy of reproducibility from this perspective of a practitioner. Low reproducibility studies are those which merely describe algorithms, medium reproducibility studies are those which provide the code and data but not the computational environment in which the code can be run, and high reproducibility studies are those which provide the code, data, and full computational environment necessary to reproduce the results of the study.”
  • Using Python to build an AI to play and win SNES StreetFighter II – PyCon 2018 (presentation)
    “Hear the story of how we used Python to build an AI that plays Super StreetFighter II on the Super NES. We’ll cover how Python provided the key glue between the SNES emulator and AI, and how the AI was built with `gym`, `keras-rl` and `tensorflow`. We’ll show examples of game play and training, and talk about which bot beat which bot in the bot-v-bot tournament we ran.”
  • Foundations of machine learning course
    Another day, another course, this time from Bloomberg. It does look comprehensive, though.
  • Seedbank: Collection of Interactive Machine Learning Examples
    These examples run right in the browser.
  • City Street Orientations around the World
    A fun visualization showcase.
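
The CoordConv fix from the Uber Engineering link above is simple enough to sketch in a few lines: before the first convolution, append two extra channels that hold each pixel’s normalized row and column coordinates, so the network can learn position-dependent mappings. Below is a minimal numpy sketch, assuming a channels-last `(batch, height, width, channels)` layout; the function name is my own, and the paper’s actual implementation is a framework-level layer rather than a numpy helper.

```python
import numpy as np

def add_coord_channels(x):
    """Append normalized (row, column) coordinate channels to a batch
    of feature maps, in the spirit of CoordConv.

    x: array of shape (batch, height, width, channels)
    returns: array of shape (batch, height, width, channels + 2)
    """
    b, h, w, _ = x.shape
    # Row and column indices scaled to [-1, 1]
    rows = np.linspace(-1.0, 1.0, h).reshape(1, h, 1, 1)
    cols = np.linspace(-1.0, 1.0, w).reshape(1, 1, w, 1)
    # Broadcast each coordinate grid across the batch and the other axis
    rows = np.broadcast_to(rows, (b, h, w, 1))
    cols = np.broadcast_to(cols, (b, h, w, 1))
    return np.concatenate([x, rows, cols], axis=-1)
```

A standard convolution applied to the augmented tensor can then solve the “seemingly trivial” coordinate tasks from the post, such as painting a single pixel at a requested (x, y) position, which plain translation-invariant convolutions struggle with.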