Every so often, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- QLoRA: Efficient Finetuning of Quantized LLMs
An efficient finetuning approach that reduces memory usage enough to finetune a 65B-parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance.
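The core trick behind approaches like this (freeze a quantized base model, train only small low-rank adapters) can be illustrated with a toy NumPy sketch. The crude symmetric 4-bit rounding below is illustrative only, not the paper's NF4 scheme, and the array names are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pretrained weight, stored quantized to save memory.
W = rng.standard_normal((8, 8)).astype(np.float32)
scale = np.abs(W).max() / 7                  # crude symmetric 4-bit scale
W_q = np.round(W / scale).astype(np.int8)    # 15 levels in -7..7 (sketch, not NF4)

# Trainable low-rank adapter: only A and B would receive gradient updates.
r = 2
A = np.zeros((r, 8), dtype=np.float32)       # initialized to zero, as in LoRA
B = rng.standard_normal((8, r)).astype(np.float32) * 0.01

def forward(x):
    W_deq = W_q.astype(np.float32) * scale   # dequantize on the fly
    return x @ (W_deq + B @ A).T             # effective weight: W + BA

x = rng.standard_normal((1, 8)).astype(np.float32)
y = forward(x)
print(y.shape)  # (1, 8)
```

Because only the tiny A and B matrices are trained, optimizer state stays small; the quantized base weights are read-only.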
- Guanaco 33B
This demo showcases the Guanaco 33B model, released together with the QLoRA paper.
- Gorilla: Large Language Model Connected with Massive APIs
Gorilla is an LLM that can generate appropriate API calls. It is trained on three massive machine-learning hub datasets: Torch Hub, TensorFlow Hub, and HuggingFace. Zero-shot Gorilla outperforms GPT-4, ChatGPT, and Claude, and significantly reduces hallucination errors.
- 🚀 Falcon-40B
The best open-source LLM currently available? Falcon-40B outperforms LLaMA, StableLM, RedPajama, MPT, and others.
- LIMA: Less Is More for Alignment
Researchers at Meta AI demonstrate that, given a strong pretrained language model, remarkably strong performance can be achieved by fine-tuning on just 1,000 carefully curated training examples.
- Meta AI Unleashes Megabyte
Meta’s research team unveils an innovative AI model architecture, capable of generating more than 1 million tokens across multiple formats and exceeding the capabilities of the existing Transformer architecture behind models like GPT-4.
- Finetuning Redpajama (OpenLlama)
“I’ll demonstrate finetuning the RedPJ model into an “Instruction Following” variant using the Alpaca dataset.” This is the variant we used in the article, so read on to see how this was done.
- Some Intuition on Attention and the Transformer
“Here, we’ll address some questions and try to provide intuition on the Transformer architecture.”
- Why the Original Transformer Figure Is Wrong, and Some Other Interesting Historical Tidbits About LLMs
- RWKV: Reinventing RNNs for the Transformer Era
“We propose a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of Transformers with the efficient inference of RNNs.”
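The "efficient inference of RNNs" that the abstract alludes to can be illustrated with a toy linear-recurrence sketch: a single fixed-size state is updated per token, so memory stays constant with sequence length. This generic decayed key-value update is only an illustration of the idea, not RWKV's actual time-mix equations:

```python
import numpy as np

rng = np.random.default_rng(1)
d, T = 4, 6
keys = rng.standard_normal((T, d))
vals = rng.standard_normal((T, d))
decay = 0.9  # scalar decay here; RWKV learns per-channel decays

# Constant-memory recurrent inference: one (d, d) state, updated per token.
state = np.zeros((d, d))
outputs = []
for t in range(T):
    state = decay * state + np.outer(keys[t], vals[t])
    outputs.append(keys[t] @ state)  # read out with the current key (toy choice)
out = np.stack(outputs)
print(out.shape)  # (6, 4)
```

Unlike softmax attention, which must revisit all previous tokens at every step, this recurrence costs O(1) memory per generated token.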
- Uncensored Models
“Why should anyone want to make or use an uncensored model? a few reasons.”
- How Language Model Hallucinations Can Snowball
“We refer to this phenomenon as hallucination snowballing: an LM over-commits to early mistakes, leading to more mistakes that it otherwise would not make.”
- From Data Engineering to Prompt Engineering
Solving data preparation tasks with ChatGPT
- ChatGPT: A Mental Model
“My current mental model of ChatGPT is that it’s akin to a “Maximum Likelihood Estimator for the Entirety of Human Knowledge”.”
- Against LLM maximalism
If you’re working on the same sort of Natural Language Processing (NLP) problems that businesses have been trying to solve for a long time, what’s the best way to use LLMs?
- AI and the Future of Programming
An interview with Replit co-founder Amjad Masad
- Cargo Cult AI
Is the ability to think scientifically the defining essence of intelligence?
- Dumber LLM Agents Need More Constraints and Better Tools
“We find that constraining agent interaction behavior, and giving them access to more tools that can more explicitly perform complex actions, can help improve query performance over these less sophisticated LLMs.”
- A PhD Student’s Perspective on Research in NLP in the Era of Very Large Language Models
“This document is a compilation of NLP research directions that are rich for exploration, reflecting the views of a diverse group of PhD students in an academic research lab.”
- LM 👾 Studio
Find, download, and run local LLMs!
- Scikit-LLM: Sklearn Meets Large Language Models
Seamlessly integrate powerful language models like ChatGPT into scikit-learn for enhanced text analysis tasks.
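The general pattern Scikit-LLM follows (an sklearn-style estimator whose predictions come from an LLM) can be sketched in a few lines. Everything below is hypothetical for illustration: the class name, the `llm_fn` callable, and the zero-shot prompt are not Scikit-LLM's actual API.

```python
# Hypothetical sketch of an sklearn-style zero-shot classifier backed by an LLM.
class ZeroShotLLMClassifier:
    def __init__(self, llm_fn, labels):
        self.llm_fn = llm_fn  # callable: prompt string -> completion string
        self.labels = labels

    def fit(self, X=None, y=None):
        return self  # zero-shot: nothing to learn

    def predict(self, X):
        preds = []
        for text in X:
            prompt = (f"Classify into {self.labels}: {text!r}. "
                      "Answer with one label.")
            answer = self.llm_fn(prompt).strip().lower()
            # Fall back to the first label if the LLM answers off-list.
            match = next((l for l in self.labels if l in answer), self.labels[0])
            preds.append(match)
        return preds

# Stub LLM for demonstration; a real API call would go here.
def stub_llm(prompt):
    return "positive" if "great" in prompt else "negative"

clf = ZeroShotLLMClassifier(stub_llm, ["positive", "negative"])
print(clf.fit().predict(["great movie", "awful plot"]))  # ['positive', 'negative']
```

Implementing `fit`/`predict` is what lets such a wrapper drop into sklearn pipelines and cross-validation utilities unchanged.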
- VanillaNet: the Power of Minimalism in Deep Learning
By avoiding high depth, shortcuts, and intricate operations like self-attention, VanillaNet is refreshingly concise yet remarkably powerful.
- Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Yet another optimizer tries to dethrone Adam…
- A Comprehensive Guide to Vector Databases
“Vector databases are a new wave of data management designed for generative AI, IoT and time-series applications.” Also see “What is a Vector Database?”. To read up on distance metrics, see this post and this thorough reference.
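The core operation every vector database optimizes is nearest-neighbor search over embeddings under a distance metric. A brute-force toy version with cosine similarity fits in a few lines of NumPy (real systems replace the exhaustive scan with ANN indexes such as HNSW or IVF):

```python
import numpy as np

# Toy in-memory "vector store": rows are stored embeddings.
rng = np.random.default_rng(42)
index = rng.standard_normal((1000, 64)).astype(np.float32)
index /= np.linalg.norm(index, axis=1, keepdims=True)  # normalize once at insert

def search(query, k=5):
    q = query / np.linalg.norm(query)
    scores = index @ q             # dot product == cosine sim after normalization
    top = np.argsort(-scores)[:k]  # brute force; real DBs use ANN indexes
    return top, scores[top]

ids, sims = search(rng.standard_normal(64).astype(np.float32))
print(ids.shape)  # (5,)
```

Normalizing vectors at insert time turns cosine similarity into a plain dot product, which is why many vector databases store unit-length embeddings.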
- eBay’s Blazingly Fast Billion-Scale Vector Similarity Engine
“The eBay CoreAI team launched an “Approximate Nearest Neighbor” (ANN) vector similarity engine that provides tooling to build use cases that match semantically similar items and personalize recommendations.”
- Modding Age of Empires II with a Sprite-Diffuser
“Below are some thoughts and process on how to create a versatile prompt-based image generator. For beginners I would recommend Alpaca, and for those comfortable with coding – Stable Diffusion Web UI and Python.”
- Constitutional AI: RLHF On Steroids
What if the AI gives feedback to itself?
- AI boom could expose investors’ natural stupidity
Indeed, enthusiasm about AI has become the one ray of light piercing the stock market gloom created by the record-breaking rise in U.S. interest rates.
- GPT detectors are biased against non-native English writers
“Our findings reveal that these detectors consistently misclassify non-native English writing samples as AI-generated”
- My attempt at creating generative agent simulations for RPG games or research
- Simulated Hospital
Simulated Hospital is a tool that generates realistic and configurable hospital patient data in HL7v2 format.
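To give a flavor of the target format: HL7v2 messages are pipe-delimited segments separated by carriage returns. The minimal admission (ADT^A01) message below is a hand-rolled sketch with made-up field values; Simulated Hospital generates far richer, configurable data.

```python
# Minimal sketch of an HL7v2 ADT^A01 (patient admission) message.
def hl7_adt_message(msg_id, patient_id, family, given, dob, sex):
    # MSH = message header, PID = patient identification; fields are |-delimited.
    msh = (f"MSH|^~\\&|SIMAPP|SIMFAC|RECVAPP|RECVFAC|20230601120000||"
           f"ADT^A01|{msg_id}|T|2.3")
    pid = f"PID|1||{patient_id}||{family}^{given}||{dob}|{sex}"
    return "\r".join([msh, pid])

msg = hl7_adt_message("MSG0001", "12345", "DOE", "JANE", "19700101", "F")
print(msg.split("\r")[0].split("|")[8])  # ADT^A01
```

Note the `^~\&` encoding characters in MSH-2 and the `^` component separator inside fields (e.g. `DOE^JANE`), both hallmarks of HL7v2.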
- Optimization Without Using Derivatives: the PRIMA Package, its Fortran Implementation, and Its Inclusion in SciPy
“How to include PRIMA in SciPy as soon as possible? This is the question. The major Scipy maintainers are positive about the inclusion of PRIMA solvers in SciPy.”
- Google “We Have No Moat, And Neither Does OpenAI”
Leaked Internal Google Document Claims Open Source AI Will Outcompete Google and OpenAI
- The EU AI Act is coming, this time for real, probably
“Before negotiations with the Council on the final form of the law can begin, the draft negotiating mandate needs to be endorsed by the European Parliament, with a vote possibly taking place during the 12–15 June session”
- Mojo 🔥 — a new programming language for all AI developers.
Mojo combines the usability of Python with the performance of C, unlocking unparalleled programmability of AI hardware and extensibility of AI models.