Web Picks (week of 15 January 2018)

Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.

  • The Meltdown bug and the KPTI patch: How does it impact ML performance?
    You’ve likely heard about the Meltdown and Spectre vulnerabilities that affect common CPUs. The patch is said to reduce performance by up to 35%, and some studies have shown performance hits greater than 50%. Here’s a look at how machine learning applications, in particular, will be affected. (A rough way to benchmark the hit on your own workload is sketched after this list.)
  • How to Root Out Hidden Biases in AI
    Algorithms are making life-changing decisions like denying parole or granting loans. Cynthia Dwork, a computer scientist at Harvard, is developing ways of making sure the machines are operating fairly.
  • Google and Others Are Building AI Systems That Doubt Themselves
    The most powerful approach in AI, deep learning, is gaining a new capability: a sense of uncertainty. Researchers at Uber and Google are working on modifications to the two most popular deep-learning frameworks that will enable them to handle probability. This will provide a way for the smartest AI programs to measure their confidence in a prediction or a decision—essentially, to know when they should doubt themselves… (A simple illustration of prediction uncertainty follows after this list.)
  • Google just released its internal tool to collaborate on AI
    We already reported on Colaboratory a while ago, but it seems the tool is now ready for use, with Python 3 and pip support – wow! (A quick install example follows after this list.)
  • Cloud AutoML: Making AI accessible to every business
    To make AI accessible to every business, we’re introducing Cloud AutoML. Cloud AutoML helps businesses with limited ML expertise start building their own high-quality custom models by using advanced techniques like learning2learn and transfer learning from Google. We believe Cloud AutoML will make AI experts even more productive, advance new fields in AI and help less-skilled engineers build powerful AI systems they previously only dreamed of…
  • Google Has a New Plan for China (and It’s Not About Search)
    Instead, Google’s entry into China centers on artificial intelligence.
  • Facebook’s Virtual Assistant is Dead. So are Chatbots
    M’s core problem: Facebook put no bounds on what M could be asked to do. Alexa has proven adept at handling a narrower range of questions, many tied to facts, or Amazon’s core strength in shopping…
  • Psychedelic toasters fool image recognition tech
    A team of Google researchers has created psychedelic stickers that can fool image recognition software into seeing objects that are not there.
  • What AI can and can’t do (yet) for your business
    AI seems to be everywhere these days, and it’s easy to think that your organization is somehow behind. This article from the McKinsey Quarterly explores the realities of AI from a practical perspective. It’s aimed at executives, but decision-makers of all types will find value here. The article provides an assessment of AI limitations and includes actionable steps for finding and taking advantage of opportunities.
  • The Google Brain Team — Looking Back on 2017 (Part 2 of 2)
    In this post, we’ll dive into the research we do in some specific domains such as healthcare, robotics, creativity, fairness and inclusion, as well as share a little more about us…
  • Data as jet fuel: An interview with Boeing’s CIO
    Boeing’s CIO Ted Colbert is something of an evangelist for the power of analytics. In this interview, he talks about how data science enables new opportunities at Boeing while, at the same time, presenting unique challenges for a 100-year-old industry giant. Much of it comes down to finding the right talent.
  • Scaling The Analytics Team At Wish
    Starting with minimal resources, here’s how Wish went from zero to 30 analysts and built out the data engineering and analytics infrastructure to support its product while in the midst of massive growth.
  • Identifying churn drivers with Random Forests
    Nice introduction to estimating feature importance when using Random Forests. This post by Slav Ivanov succinctly describes common approaches and includes linked references for going deeper. (A minimal scikit-learn example follows after this list.)
  • Deep Learning: A Critical Appraisal (paper)
    “Although deep learning has historical roots going back decades, neither the term “deep learning” nor the approach was popular just over five years ago, when the field was reignited by papers such as Krizhevsky, Sutskever and Hinton’s now classic (2012) deep network model of Imagenet. What has the field discovered in the five subsequent years? Against a background of considerable progress in areas such as speech recognition, image recognition, and game playing, and considerable enthusiasm in the popular press, I present ten concerns for deep learning, and suggest that deep learning must be supplemented by other techniques if we are to reach artificial general intelligence.”
  • Turning Design Mockups Into Code With Deep Learning
    In this tutorial, you’ll create a neural network that converts an image of a design mockup into the code for a simple website. This is an impressive tutorial with a GitHub repo of code to go along with it. (A bare-bones sketch of the underlying encoder-decoder idea follows after this list.)
  • Alibaba neural network defeats human in global reading test
    The Chinese tech giant’s research unit says its deep neural network model is the first to beat humans on the Stanford Question Answering Dataset, though it now shares the top of the latest rankings with Microsoft.
  • Neural Style Transfer for Musical Melodies
    “In spite of the success with images, the application of these techniques to other domains such as audio or music has been rather limited and the results are far less convincing than those achieved using images. This suggests that this is a harder problem.”
  • Measuring the tendency of CNNs to Learn Surface Statistical Regularities (paper)
    “Deep CNNs are known to exhibit the following peculiarity: on the one hand they generalize extremely well to a test set, while on the other hand they are extremely sensitive to so-called adversarial perturbations. The extreme sensitivity of high performance CNNs to adversarial examples casts serious doubt that these networks are learning high level abstractions in the dataset. We are concerned with the following question: How can a deep CNN that does not learn any high level semantics of the dataset manage to generalize so well?”
  • Visualizing the Uncertainty in Data
    “Data is a representation of real life. It’s an abstraction, and it’s impossible to encapsulate everything in a spreadsheet, which leads to uncertainty in the numbers. Here are some visualization options for the uncertainties in your data, each with its pros, cons, and examples.” (A small matplotlib example of one such option follows after this list.)
  • Pytorch easy-to-follow Capsule Network tutorial
    An easy-to-follow Capsule Network tutorial with clean, readable code. (The squash nonlinearity at its core is sketched after this list.)
  • Image augmentation for machine learning experiments
    This Python library helps you with augmenting images for your machine learning projects. It converts a set of input images into a new, much larger set of slightly altered images. (An example pipeline follows after this list.)
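
For the Meltdown/KPTI item: if you want to gauge the patch’s impact on your own workload, a crude approach is to time a compute-bound operation and a syscall-heavy one before and after patching. This is only a hypothetical sketch; all sizes and iteration counts are arbitrary.

```python
import os
import time
import tempfile
import numpy as np

def bench(fn, repeats=5):
    # Best-of-N wall-clock timing; crude, but enough to compare a
    # pre-patch run against a post-patch run of the same script.
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return min(times)

# Compute-bound workload: a large matrix multiply. Few syscalls,
# so little KPTI impact is expected here.
a = np.random.rand(2000, 2000)
b = np.random.rand(2000, 2000)
print("matmul:      %.3fs" % bench(lambda: a.dot(b)))

# Syscall-heavy workload: many small file reads, as in a naive
# data-loading pipeline. This is where KPTI overhead shows up.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(4096))
    path = f.name

def read_many():
    for _ in range(10000):
        with open(path, "rb") as fh:
            fh.read()

print("small reads: %.3fs" % bench(read_many))
os.remove(path)
```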
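For the “AI systems that doubt themselves” item: the work at Uber and Google concerns probabilistic programming tools such as Uber’s Pyro, which are not shown here. As a much simpler stand-in, the sketch below uses Monte Carlo dropout, a common trick for squeezing a rough uncertainty estimate out of an ordinary network by keeping dropout active at prediction time; the architecture and sizes are invented for illustration.

```python
import torch
import torch.nn as nn

# A toy classifier with dropout; layer sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(64, 3),
)

def mc_dropout_predict(model, x, n_samples=50):
    # Keep dropout active at inference time (train mode) and average many
    # stochastic forward passes; the spread across passes is a rough
    # measure of how much the model should doubt itself on this input.
    model.train()
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    return probs.mean(dim=0), probs.std(dim=0)

x = torch.randn(1, 20)  # one fake input
mean, std = mc_dropout_predict(model, x)
print("class probabilities:", mean)
print("uncertainty (std across passes):", std)
```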
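For the Colaboratory item: pip support means extra packages can be installed straight from a notebook cell with the usual shell-escape syntax, for example:

```python
# In a Colab (or any Jupyter) cell, '!' runs a shell command, so
# packages can be installed on the fly before importing them.
!pip install imgaug
import imgaug
```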
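For the churn-drivers item: the post walks through several approaches; the sketch below shows only the most basic one, scikit-learn’s built-in impurity-based importances, on synthetic data. The dataset and feature names are made up for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a churn dataset: 1000 customers, 8 features,
# only 3 of which actually carry signal.
X, y = make_classification(n_samples=1000, n_features=8,
                           n_informative=3, random_state=0)
feature_names = ["tenure", "monthly_spend", "support_calls", "age",
                 "n_products", "last_login_days", "region", "plan_type"]

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances: how much each feature reduces node
# impurity, averaged over all trees. Fast, but known to be biased
# toward high-cardinality features.
ranked = sorted(zip(feature_names, rf.feature_importances_),
                key=lambda t: -t[1])
for name, imp in ranked:
    print("%-16s %.3f" % (name, imp))
```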
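For the design-mockups tutorial: the real model lives in the linked GitHub repo and is not reproduced here. Below is only a bare skeleton of the general idea: a CNN encodes the screenshot and an LSTM decodes a sequence of markup tokens. Every size, the vocabulary, and the input shape are invented for illustration.

```python
from tensorflow.keras import layers, Model

VOCAB_SIZE = 20   # hypothetical: number of tokens in the markup DSL
MAX_LEN = 48      # hypothetical: max tokens generated per mockup

# Image encoder: a small CNN squeezes the screenshot into one feature
# vector, repeated once per output step so the decoder can see it.
img_in = layers.Input(shape=(256, 256, 3))
x = layers.Conv2D(16, 3, strides=2, activation="relu")(img_in)
x = layers.Conv2D(32, 3, strides=2, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)
img_feat = layers.RepeatVector(MAX_LEN)(x)

# Token decoder: an LSTM over the partial code sequence, conditioned
# on the image features, predicting the next token at each step.
tok_in = layers.Input(shape=(MAX_LEN,))
e = layers.Embedding(VOCAB_SIZE, 32)(tok_in)
h = layers.concatenate([img_feat, e])
h = layers.LSTM(128, return_sequences=True)(h)
out = layers.Dense(VOCAB_SIZE, activation="softmax")(h)

model = Model([img_in, tok_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```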
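For the uncertainty-visualization item: one of the simplest options is to show an interval around a point estimate rather than a single confident-looking line. The sketch below shades a 95% band computed from synthetic samples; all the data is fabricated for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Fake monthly series: 200 simulated trajectories of a noisy process.
np.random.seed(0)
x = np.arange(12)
samples = 50 + np.cumsum(np.random.normal(0, 1, size=(200, 12)), axis=1)

mean = samples.mean(axis=0)
lo, hi = np.percentile(samples, [2.5, 97.5], axis=0)

plt.plot(x, mean, label="estimate")
# The shaded band keeps the uncertainty visible instead of hiding it.
plt.fill_between(x, lo, hi, alpha=0.3, label="95% interval")
plt.xlabel("month")
plt.ylabel("value")
plt.legend()
plt.show()
```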
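For the Capsule Network tutorial: the full code is in the linked repo. Shown here is just the squash nonlinearity from Sabour, Frosst and Hinton’s “Dynamic Routing Between Capsules”, the small piece every capsule implementation needs; the tensor shapes are arbitrary.

```python
import torch

def squash(s, dim=-1, eps=1e-8):
    # v = (|s|^2 / (1 + |s|^2)) * (s / |s|): short vectors shrink toward
    # zero and long ones approach unit length, so a capsule's output
    # length can be read as the probability that its entity is present.
    sq_norm = (s * s).sum(dim=dim, keepdim=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / torch.sqrt(sq_norm + eps)

caps = torch.randn(2, 10, 16)  # batch of 2, 10 capsules, 16-D each
v = squash(caps)
print(v.norm(dim=-1))          # all lengths now lie in (0, 1)
```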
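For the image-augmentation item: the library in question appears to be imgaug; assuming so, a typical pipeline looks roughly like the following (the augmenter choices and image sizes are arbitrary).

```python
import numpy as np
from imgaug import augmenters as iaa

# A batch of 16 fake 64x64 RGB images standing in for real data.
images = np.random.randint(0, 255, size=(16, 64, 64, 3), dtype=np.uint8)

# Chain a few augmenters; every pass yields a different random variant
# of each image, which is how a small set becomes a much larger one.
seq = iaa.Sequential([
    iaa.Fliplr(0.5),                  # horizontally flip half the images
    iaa.Crop(percent=(0, 0.1)),       # random crops of up to 10%
    iaa.GaussianBlur(sigma=(0, 1.5)), # mild random blur
])

images_aug = seq.augment_images(images)
print(images_aug.shape)  # (16, 64, 64, 3)
```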