Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
In the past month, GPT-3 has been all over the AI and ML scene. As such, we start off with a good overview article from Gwern. “OpenAI releases the long-awaited follow-up to GPT-2, one model to rule them all: a 117× larger 175b-parameter model with far more powerful language generation, which lets it solve a wide variety of problems from arithmetic to English translation to unscrambling anagrams to SAT analogies—purely from being prompted with text examples, without any specialized training or finetuning whatsoever, merely next-word prediction training on a big Internet text corpus.” Also take a look at these GPT-3 examples.
- Why GPT-3 Matters
“Loading the entire model’s weights in fp16 would take up an absolutely preposterous 300GB of VRAM, not even including the gradients. But, with massive size comes massive generalization ability: GPT-3 is competitive in many benchmarks without even tuning on the target task.”
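The memory arithmetic behind that claim is easy to sketch (a rough back-of-envelope only; real frameworks add overhead for activations, buffers, and optimizer state):

```python
# Back-of-envelope memory footprint for GPT-3's weights in fp16.
params = 175e9            # 175 billion parameters
bytes_per_param = 2       # fp16 = 16 bits = 2 bytes
weight_bytes = params * bytes_per_param

print(f"{weight_bytes / 1e9:.0f} GB")     # decimal gigabytes: 350 GB
print(f"{weight_bytes / 2**30:.0f} GiB")  # binary gibibytes: ~326 GiB
```

Which lands in the same ballpark as the quoted ~300GB — and that is before gradients, which at least double the bill during training.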
- OpenAI’s GPT-3 may be the biggest thing since bitcoin
“I share my early experiments with OpenAI’s new language prediction model (GPT-3) beta. I explain why I think GPT-3 has disruptive potential comparable to that of blockchain technology.”
- Giving GPT-3 a Turing Test
It’s possible to improve GPT-3’s performance on the specific tasks above by including a prompt that demonstrates how to solve similar problems.
- Are we in an AI overhang?
“GPT-3 has been estimated to cost $5m in compute to train, and – looking at the author list and OpenAI’s overall size – maybe another $10m in labour.”
- How GPT3 Works – Visualizations and Animations
“Massive language models (like GPT3) are starting to surprise us with their abilities. While not yet completely reliable for most businesses to put in front of their customers, these models are showing sparks of cleverness that are sure to accelerate the march of automation and the possibilities of intelligent computer systems.”
- GPT-3 examples
Developers have built an impressively diverse range of applications using the GPT-3 API, including an all-purpose Excel function, a recipe generator, a layout generator (which translates natural language to JSX), a search engine, and several others.
- Awesome GPT-3
This evolving GPT-3 collection includes links to some of the best demos and tutorials around the web.
- Philosophers On GPT-3 (updated with replies by GPT-3)
Nine philosophers explore the various issues and questions raised by the newly released language model, GPT-3.
- Tempering Expectations for GPT-3 and OpenAI’s API
GPT-3 has been getting a lot of attention around the web lately, and impressive examples abound. This post explores what all the fuss is about, and is one of the best high-level explainers so far.
- We Have Already Let The Genie Out of The Bottle
Given how fast AI has been progressing, how can we be sure that it will be a force for good?
- The Cost of AI Training is Improving at 50x the Speed of Moore’s Law
Why it’s still early days for AI.
- Powerful AI Can Now Be Trained on a Single Computer
New machine learning training approach could help under-resourced academic labs catch up with big tech.
- Deepfake used to attack activist couple shows new disinformation frontier
“The Taylor persona is a rare in-the-wild example of a phenomenon that has emerged as a key anxiety of the digital age: The marriage of deepfakes and disinformation.”
- The Rise of Synthetic Audio Deepfakes
“Audio deepfakes are the new frontier for business compromise schemes and are becoming more common pathways for criminals to deceptively gain access to corporate funds.”
- Stopping deepfake news with an AI algorithm that can tell when a face doesn’t fit
A new artificial intelligence technique can automatically detect face-swapped videos of politicians.
- Don’t ask if artificial intelligence is good or fair, ask how it shifts power
Those who could be exploited by AI should be shaping its projects.
- Epoxy: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings
Epoxy uses weak supervision and pre-trained embeddings to create models that train at programmatically interactive speeds (under half a second) while retaining much of the performance of trained deep networks.
- Curve Detectors
Every vision model we’ve explored in detail contains neurons which detect curves. “This article is the first part of a three article deep dive into curve detectors: their behavior, how they’re built from earlier neurons, and their prevalence across models. We’re doing this because we believe that the interpretability community disagrees on several crucial questions.”
- Mean Squared Terror
GridSearch is not enough.
- Floating-Point Formats and Deep Learning
Floating-point format is usually not a crucial consideration in deep learning, but it can make a significant difference. What is floating-point, why should you (a deep learning practitioner) care, and what can you do about it?
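As a tiny illustration of why the format matters, Python’s standard struct module can round-trip a value through IEEE 754 half precision (format code 'e') without any deep learning library:

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack('e', struct.pack('e', x))[0]

# fp16 has only 10 fraction bits, so 0.1 picks up visible rounding error...
print(to_fp16(0.1))     # 0.0999755859375
# ...and integers above 2048 can no longer all be represented exactly.
print(to_fp16(2049.0))  # 2048.0
```

Small errors like these compound across millions of multiply-accumulates, which is why mixed-precision training keeps certain accumulations in fp32.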
- AI in physics: are we facing a scientific revolution?
An AI reconstructs Newton’s second law and discovers a previously unknown formula for mass calculation of dark matter. Can AI automate science?
- Bilingual Evaluation Understudy (BLEU)
“BLEU is a standard algorithm for evaluating the machine translations against the human translations. At first I thought it should be very straightforward to use. However, it turns out that there are a lot of caveats.”
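One way to see those caveats is a stripped-down, single-reference BLEU in plain Python (a toy sketch, not the smoothed, multi-reference implementations in libraries like NLTK or sacrebleu):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Toy BLEU: clipped n-gram precisions for n=1..4, combined as a
    geometric mean and multiplied by a brevity penalty. No smoothing."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        # "Clipped" counts: an n-gram only scores up to its reference count.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    # Caveat in action: one empty n-gram overlap zeroes the whole score.
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return bp * math.exp(log_avg)

ref = "the cat sat on the mat".split()
print(sentence_bleu(ref, ref))            # 1.0 for a perfect match
print(sentence_bleu(ref, "the dog".split()))  # 0.0 — no 2-gram overlap
```

The hard zero for short or non-overlapping hypotheses is exactly why real implementations offer smoothing functions — one of the caveats the post walks through.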
- Generative zoology with neural networks
“The bulk of the output is a huge variety of entirely recognisable silhouettes – birds, various quadrupeds, reams of little gracile theropod dinosaurs, sauropods, fish, bugs, arachnids and humanoids.”
- When data is messy
A great illustration of how AI gets the wrong idea about what problem we’re asking it to solve.
- Facebook & CMU Introduce TaBERT for Understanding Tabular Data Queries
“A team of researchers from Carnegie Mellon University and Facebook AI recently introduced the tabular data model TaBERT. Built on top of the popular BERT NLP model, TaBERT is the first model pretrained to learn representations for both natural language sentences and tabular data, and can be plugged into a neural semantic parser as a general-purpose encoder.”
- Why You Should Do NLP Beyond English
7,000+ languages are spoken around the world, but NLP research has mostly focused on English. This post outlines why you should work on languages other than English.
- DeepDream: How Alexander Mordvintsev Excavated the Computer’s Hidden Layers
A Google researcher looks into the mind of a computer.
- AlphaZero.jl
This package provides a generic, simple and fast implementation of Deepmind’s AlphaZero algorithm for Julia.
- ScaNN (Scalable Nearest Neighbors) is a method for efficient vector similarity search at scale
Interesting to speed up e.g. k-NN searches.
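For intuition, exact k-NN is a brute-force scan over every point — the O(N·d) baseline that approximate methods like ScaNN are built to beat. A minimal sketch (ScaNN’s actual API differs):

```python
def knn_brute_force(query, points, k=3):
    """Exact k nearest neighbours by squared Euclidean distance.
    Scans every point: fine for small N, but this linear scan is
    precisely what ScaNN-style approximate search avoids at scale."""
    scored = sorted(
        range(len(points)),
        key=lambda i: sum((q - p) ** 2 for q, p in zip(query, points[i])),
    )
    return scored[:k]

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
print(knn_brute_force((0.4, 0.0), points, k=2))  # [0, 1]
```

Libraries like ScaNN trade a little recall for orders-of-magnitude speedups by partitioning and quantizing the vector space instead of scanning it.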
- The Case for Causal AI
“Using artificial intelligence to predict behavior can lead to devastating policy mistakes. Health and development programs must learn to apply causal models that better explain why people behave the way they do to help identify the most effective levers for change.”
- Decentralized Reinforcement Learning
Global Decision-Making via Local Economic Transactions.
- Pulsar vs. Kafka
A more accurate perspective on performance, architecture, and features.
- Modes, Medians and Means: A Unifying Perspective
Older but beautifully revealing article.
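The unifying idea — mean, median and mode are each the minimizer of a different loss (squared, absolute, 0/1) — fits in a few lines. A small grid-search demo:

```python
import statistics

data = [1, 2, 2, 3, 10]
candidates = [i / 10 for i in range(0, 121)]  # grid from 0.0 to 12.0

def minimizer(loss):
    """Return the grid point minimising total loss over the data."""
    return min(candidates, key=lambda c: sum(loss(x, c) for x in data))

# Squared loss -> mean, absolute loss -> median, 0/1 loss -> mode.
print(minimizer(lambda x, c: (x - c) ** 2))       # 3.6 == statistics.mean(data)
print(minimizer(lambda x, c: abs(x - c)))         # 2.0 == statistics.median(data)
print(minimizer(lambda x, c: 0 if x == c else 1)) # 2.0 == statistics.mode(data)
```

The same lens explains why mean-based models chase outliers (note the 10 in the data pulling the mean to 3.6) while median-based ones ignore them.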
- PP-YOLO Surpasses YOLOv4 – State of the Art Object Detection Techniques
Baidu publishes PP-YOLO and pushes the state of the art in object detection research by building on top of YOLOv3, the PaddlePaddle deep learning framework, and cutting edge computer vision research.
- IBM Fully Homomorphic Encryption Toolkit for Linux
The IBM Fully Homomorphic Encryption (FHE) Toolkit for Linux is packaged as Docker containers that make it easier to get started experimenting with fully homomorphic encryption technology.
- pandera – Statistical Data Validation for Pandas
pandera is a data validation library for scientists, engineers, and analysts seeking correctness.
- Darts: Time Series Made Easy in Python
“In this article, we introduce Darts, our attempt at simplifying time series processing and forecasting in Python.”
- The Data Science Lifecycle Process
The Data Science Lifecycle Process is a set of prescriptive steps and best practices to enable data science teams to consistently deliver value. It includes issue templates for common data science work types, a branching strategy that fits the data science development flow, and prescriptive guidance on how to piece together all the various tools and workflows required to make data science work.
Deploy Tensorflow, Scikit, Keras and spaCy straight from your notebook with just one extra line.
Satellite imagery for dummies.
- Otto: Your friendly machine learning assistant.
Machine learning becomes an intuitive, natural language experience.
- Snorkel AI: Putting Data First in ML Development
“Snorkel AI, which spun out of the Stanford AI Lab in 2019, was founded on two simple premises: first, that the labeled training data machine learning models learn from is increasingly what determines the success or failure of AI applications. And second, that we can do much better than labeling this data entirely by hand.”
- Kowl
Kowl (previously known as Kafka Owl) is a web application that helps you explore the messages in your Apache Kafka cluster and get better insight into what is actually happening in it, in the most comfortable way.
- Apache Arrow 1.0.0 Release
The Apache Arrow team is pleased to announce the 1.0.0 release.
- 14 Best Data Science Books to Read Right Now
From textbooks to introductory tomes and mass-market nonfiction.
- Monitoring Machine Learning
Oren Razon, Co-Founder and CTO of SuperwiseAI, discusses the challenges of operationalizing machine learning, the different types of model drift that can occur, and who owns ML monitoring in organizations.
- The Overfitted Brain: Dreams evolved to assist generalization
“The goal of this paper is to argue that the brain faces a similar challenge of overfitting, and that nightly dreams evolved to combat the brain’s overfitting during its daily learning. That is, dreams are a biological mechanism for increasing generalizability.”