Web Picks (week of 5 February 2018)

Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.

  • Lessons from Optics, The Other Deep Learning
    Pretty much the post you should read this week! Full of powerful thoughts. “There’s a mass influx of newcomers to our field and we’re equipping them with little more than folklore and pre-trained deep nets, then asking them to innovate. We can barely agree on the phenomena that we should be explaining away. I think we’re far from teaching this stuff in high schools.”
  • Why Are Data Science Leaders Running for the Exit?
    “You can’t tell a team to solve a problem in two sprints. If they don’t have the data or tools, it won’t happen. We have to understand that the economics of data products are different. A lot of large companies don’t even have this conversation, which can cause a lot of frustration for those in charge of data products. In essence, they have one hand tied behind their back.”
  • Facebook open sources Detectron
    “Facebook AI Research (FAIR) open sourced Detectron — our state-of-the-art platform for object detection research.”
  • …and is shutting down its standalone personal assistant “M”
    Seems like a lot of the chatbot hype is dying out.
  • Dutch sports analytics company SciSports uses emerging tech to innovate on the pitch
    “The sports analytics company uses streaming data and applies machine learning, deep learning and artificial intelligence to capture and analyze this data.”
  • Wise up, deep learning may never create a general purpose AI
    AI and deep learning have been subject to a huge amount of hype. In a new paper, Gary Marcus argues there’s been an “irrational exuberance” around deep learning.
  • PyTorch, a year in….
    A lot has happened in the world of PyTorch.
  • A.I. Has Arrived in Investing. Humans Are Still Dominating.
    “We know the markets are irrational, especially in the short term, but the machines aren’t going to know how to behave in that kind of environment.”
  • Family fun with deepfakes. Or how I got my wife onto the Tonight Show
    Deepfakes have caused a storm across the Internet, with everyone debating whether the technique is good or bad. This author, at least, applies it to a harmless, family-safe joke for once.
  • How to solve 90% of NLP problems: a step-by-step guide
    Using Machine Learning to understand and leverage text.
  • How WeChat came to rule China
    The multipurpose messaging app is becoming the nation’s ID system.
  • In China, consumers are becoming more anxious about data privacy
    Will this impede the government’s snooping?
  • Asking the Right Questions About AI
    “In the past few years, we’ve been deluged with discussions of how artificial intelligence (AI) will either save or destroy the world. It’s probably pretty clear to you that some of this is nonsense, and that some of this is real. If we want to be able to have real discussions about this as a society, we need to talk about the realities of AI: what it can and can’t actually do, what it might be able to do in the future, and what some of the social, cultural, and ethical challenges it poses are.”
  • Like an AI Could Ever Spot Sarcasm
    Could a computer learn to detect this nuanced form of expression? Pushpak Bhattacharyya says they can — and he’s got the algorithms to prove it.
  • The UX of AI
    Google tries to woo us with clips of adorable pets and children to convince us that sharing all this data and using all this AI is totally sweet and fine…
  • Ethics in Machine Learning
    Interview with Dr. Hanie Sedghi, Research Scientist, Google Brain
  • PointCNN
    PointCNN is a simple and general framework for feature learning from point clouds.
  • Practical Deep Learning for Coders 2018
    fast.ai launches the 2018 version of its deep learning course.
  • Faster R-CNN: Down the rabbit hole of modern object detection
    Detecting objects in images.
  • How to build your own AlphaZero AI using Python and Keras
    Teach a machine to learn Connect4 strategy through self-play and deep learning.
  • Dark Algorithms & UX in Health Insurance
    “The only thing scarier than maliciously designing a user experience that tricks a user into sharing their email contacts with you is designing an experience that tries to prevent them from accessing healthcare.”
  • How to make your machine learning model available as an API
    With the R plumber package.
  • Algorithmic decision making and the cost of fairness (paper)
    “Algorithms are now regularly used to decide whether defendants awaiting trial are too dangerous to be released back into the community. In some cases, black defendants are substantially more likely than white defendants to be incorrectly classified as high risk. To mitigate such disparities, several techniques recently have been proposed to achieve algorithmic fairness. Here we reformulate algorithmic fairness as constrained optimization: the objective is to maximize public safety while satisfying formal fairness constraints designed to reduce racial disparities. We show that for several past definitions of fairness, the optimal algorithms that result require detaining defendants above race-specific risk thresholds. We further show that the optimal unconstrained algorithm requires applying a single, uniform threshold to all defendants. The unconstrained algorithm thus maximizes public safety while also satisfying one important understanding of equality: that all individuals are held to the same standard, irrespective of race.”
  • Audio Adversarial Examples: Targeted Attacks on Speech-to-Text (paper)
    “We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (at a rate of up to 50 characters per second). We apply our iterative optimization-based attack to Mozilla’s implementation DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduce a new domain to study adversarial examples.”