Andreas Madsen

PhD candidate at Mila
Interpretability, Machine Learning

I'm a PhD candidate at Mila, researching interpretability for Natural Language Processing, primarily focusing on ensuring that interpretability methods provide valid explanations. My supervisors are Prof. Sarath Chandar and Prof. Siva Reddy. Before that, I was an independent researcher, also working on interpretability.

Neural networks are very complex, and their logic is not transparent to users or developers. Providing explanations of neural networks is called interpretability, and I think that machine learning in some areas is socially irresponsible without it. Unfortunately, there is not enough research in this area, as most research revolves around beating well-defined benchmarks, whereas what counts as a "good" explanation is ambiguous. I want to change that. My compass is to ground my research in real-world settings based on my past experiences as a freelancer in machine learning.

During my PhD, I have published in ACM Computing Surveys, EMNLP, and other venues, and given invited talks about my work. In particular, I was invited by Sara Hooker to give the inaugural talk at Cohere for AI.

Before starting my PhD, I published first in Distill and later at ICLR 2020, where I received a spotlight award. Both of these works received a lot of attention, and I wrote a blog post about my life as an independent researcher that went quite viral. All of this also resulted in several interviews and invited talks.

In the past, I worked as a freelancer in machine learning for 3 years. One of my projects was implementing clinic.js, which has become the de facto profiling tool in JavaScript and won awards. Additionally, I was a very active open-source contributor in JavaScript. I have helped develop Node.js since 2011, contributing major core components and infrastructure, and serving on several steering committees. Finally, my own open-source modules were downloaded 173 million times in 2023 alone.



Are self-explanations from Large Language Models faithful? – Pre-print
Jan 2024

Large language models are increasingly being used by the public in the form of chat models. These chat systems often provide detailed and highly convincing explanations for their answers, even when not explicitly prompted to do so. This makes users more confident in these models. However, are the explanations true? If they are not, this confidence is unsupported, which can be dangerous.

We measure the truthfulness (i.e. interpretability-faithfulness) of the explanations that LLMs provide, so-called self-explanations. We do so by holding the models accountable to their own explanations, using self-consistency checks. We find that the truthfulness is highly dependent on the model and the specific task, which suggests that we should not have general confidence in these explanations.


Faithfulness Measurable Masked Language Models – Pre-print
Oct 2023

Interpretability has two paradigms: post-hoc and intrinsic explanations. This paper proposes a new paradigm, where models intrinsically provide the means to measure the faithfulness of any explanation. We call these inherently faithfulness measurable models (FMMs).

Because measuring faithfulness is now trivial, it is possible to optimize explanations with respect to faithfulness. As a result, the model becomes indirectly inherently explainable and we get explanations with state-of-the-art faithfulness scores.

We demonstrate this general idea using a masked language model (MLM), by simply fine-tuning the MLM such that masking any tokens is in-distribution. We thoroughly validate our claims on 16 datasets and use out-of-distribution tests.
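The masking idea can be illustrated with a small sketch. It assumes that each fine-tuning example is masked at a rate sampled uniformly between 0% and 100%, so that any amount of masking is seen during training. The `[MASK]` string, the function names, and the uniform sampling are illustrative assumptions, not details stated here.

```python
import random

MASK = "[MASK]"  # assumed mask token, for illustration only


def mask_randomly(tokens, rate, rng):
    """Replace a fraction `rate` of the tokens, chosen at random, with MASK."""
    n_mask = round(rate * len(tokens))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    return [MASK if i in positions else tok for i, tok in enumerate(tokens)]


def masked_training_example(tokens, rng):
    """Mask one example at a uniformly sampled rate (an assumed schedule)."""
    rate = rng.uniform(0.0, 1.0)
    return mask_randomly(tokens, rate, rng)


rng = random.Random(0)
sentence = ["the", "movie", "was", "great"]
print(masked_training_example(sentence, rng))
```

With a model fine-tuned on such inputs, masking the tokens that an explanation marks as important would not push the input out of distribution, which is what makes the resulting performance drop interpretable as a faithfulness signal.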


Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining – EMNLP Findings 2022 & BlackboxNLP 2022
Dec 2022

Attention is a widely used component in neural networks and is often treated as an explanation in the interpretability field. However, since 2019 there has been much discussion on whether attention is actually a valid explanation. Much of this discussion exists because it is inherently impossible to say what a correct explanation is.

This paper proposes a new indirect method for measuring whether explanations are valid. It is based on a previous method called ROAR, which was used for computer vision. In this paper, we adapt ROAR to natural language and solve previously known issues with a new version we call Recursive ROAR. Finally, we develop a scalar benchmark that will make it easy to compare explanations between future papers.
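The recursive masking-and-retraining loop named in the title can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: `train`, `evaluate`, and `importance` are hypothetical user-supplied callbacks, a single dataset is used for both training and evaluation, and `[MASK]` is an assumed mask token.

```python
MASK = "[MASK]"  # assumed mask token


def recursive_roar(dataset, train, evaluate, importance, masking_rates):
    """Return (masking rate, score) pairs for increasing masking rates.

    dataset: list of (tokens, label) pairs.
    train, evaluate, importance: hypothetical user-supplied callbacks.
    """
    current = [(list(tokens), label) for tokens, label in dataset]
    model = train(current)
    results = [(0.0, evaluate(model, current))]
    for rate in masking_rates:
        masked_dataset = []
        for tokens, label in current:
            # Importance is recomputed with the most recently retrained model.
            scores = importance(model, tokens, label)
            already_masked = sum(tok == MASK for tok in tokens)
            n_new = max(round(rate * len(tokens)) - already_masked, 0)
            candidates = [i for i, tok in enumerate(tokens) if tok != MASK]
            candidates.sort(key=lambda i: scores[i], reverse=True)
            chosen = set(candidates[:n_new])
            masked_dataset.append(
                ([MASK if i in chosen else tok for i, tok in enumerate(tokens)], label)
            )
        current = masked_dataset
        model = train(current)  # retrain on the more heavily masked data
        results.append((rate, evaluate(model, current)))
    return results
```

Running the same loop with random importance scores gives a baseline curve. If an importance measure is faithful, masking its top tokens should hurt performance faster than masking random tokens.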

Inaugural Talk

Independent Research & Interpretability – Cohere for AI
Aug 2022

The inaugural talk for the research organization Cohere for AI, a great honor! My talk is a narrative about my path from independent research to a PhD in interpretability. I cover many of the lessons I learned during my time as an independent researcher, as well as some of the reasons I got started with interpretability.

In the second act, I discuss contemporary challenges in interpretability to motivate new researchers.


Post-hoc Interpretability for Neural NLP: A Survey – ACM Computing Surveys
Jul 2022

A survey on post-hoc interpretability methods for Natural Language Processing (NLP). The survey covers 19 specific interpretability methods and cites more than 100 works. Each method is categorized by how it communicates, visualized in a comparative format, and its evaluation methodology is discussed.

Beyond interpretability methods, the survey covers the motivation for interpretability and measures of interpretability. Finally, we provide general insights, as well as our opinions on future directions and challenges.


Importance of Textual Interpretability – LiveAI
Jul 2020

Invited talk at LiveAI on the importance of being able to explain natural language models. The talk covers legal perspectives, the social impact of machine learning, and my work on textual interpretability, from my Distill publication to my Python module.

Panel Discussion

NewInML workshop – ICML
Jul 2020

Took part in a panel discussion on how to navigate ML academia when you are new. The other panelists were Chelsea Finn (Stanford University), Shakir Mohamed (DeepMind), Tong Zhang (Hong Kong SciTech), Ashley Edwards (ML Collective), and Edward Raff (Booz Allen Hamilton).


Neural Arithmetic Units & Independent Researcher – TWIML AI Podcast
Jun 2020

Interview by TWIML on the process of developing my paper Neural Arithmetic Units. We discuss how to work with limited resources, the importance of collaboration, and the struggles of being an independent researcher by necessity.

Publication & Talk

Neural Arithmetic Units – ICLR 2020
Apr 2020

Proposes two new arithmetic units (addition and multiplication) that improve on the state of the art by 3x to 20x over existing units such as the "Neural Arithmetic Logic Unit" (NALU). The improvements were achieved through rigorous theoretical analysis. The new units allow for more interpretable models and potentially perfect extrapolation.

The paper received a spotlight award at ICLR, as it was among the 5% best-reviewed publications.
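A minimal sketch of the forward pass of the two units, assuming the formulation in which the addition unit computes a weighted sum with weights ideally in {-1, 0, 1}, and the multiplication unit computes a product of terms `w * x + 1 - w` with weights ideally in {0, 1}. The weights are set by hand here; training, weight clamping, and regularization are omitted, and the function names are illustrative.

```python
from math import prod


def nau(x, W):
    """Addition unit: each output is a weighted sum of the inputs."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def nmu(x, W):
    """Multiplication unit: a weight of 1 includes an input in the product,
    a weight of 0 replaces it with the multiplicative identity 1."""
    return [prod(w * xi + 1 - w for w, xi in zip(row, x)) for row in W]


x = [2.0, 3.0, 5.0]
print(nau(x, [[1, -1, 0]]))  # [-1.0], i.e. 2 - 3
print(nmu(x, [[1, 1, 0]]))   # [6.0], i.e. 2 * 3
```

Because the ideal weights are discrete, a trained unit can be read directly as an arithmetic expression, which is where the interpretability and extrapolation properties come from.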

Open Source

Textual Heatmap – pip package
Mar 2020

A Python library for creating the interactive textual heatmap visualization that I demonstrated in my Distill paper. The library works with Jupyter and Google Colab, making it easy for researchers to apply in their interpretability research.


Measuring Arithmetic Extrapolation Performance – SEDL|NeurIPS 2019
Dec 2019

Proposes new evaluation criteria, with special confidence intervals, for extrapolation tasks. It uses these criteria in a reproduction study of the "Neural Arithmetic Logic Unit" (NALU), and shows that in some cases its performance is drastically worse than previously assumed.

Open Source

lrcurve – pip package
Nov 2019

Creates a learning-curve plot for Jupyter/Colab notebooks that is updated in real time. This was first developed for a workshop at NodeConf EU; I later made it into its own pip package.


AI smartwatch badge for NodeConf EU – NearForm Research
Nov 2019

Developed the hand-gesture recognition machine learning model for the IoT smartwatch badge given out at NodeConf EU 2019. The model ran on a system-on-chip using TensorFlow Lite for Microcontrollers. This was done in collaboration with the TensorFlow team.


Probability in TensorFlow.js – CopenhagenJS
Aug 2019

Talked about my TensorFlow.js implementation of the special functions, and especially how to survive a really difficult programming project, with lots of unknowns.

Open Source

TensorFlow.js Special Functions
Jul 2019

Implementation for TensorFlow.js of the special functions used in probability, calculus, differential equations, and more, such as the beta, gamma, zeta, and Bessel functions.


Visualizing and understanding RNNs – PracticalAI
Jun 2019

I was interviewed by PracticalAI about my Distill publication, which became highly acclaimed. I discuss the importance of interpretability and visualization, such as how we develop our intuition through interaction and the importance of testing that intuition.


Visualizing memorization in RNNs – Distill Journal
Mar 2019

Proposes a visualization method for qualitatively comparing different RNN architectures' ability to memorize, and for understanding which parts of an input sentence drive a prediction, which is great for interpretability.

Distill is a peer-reviewed journal, chaired by Chris Olah from OpenAI, and other famous researchers.

Open Source & Product

Node.js Cephes library – NearForm Research
Sep 2018

By compiling the cephes library to WebAssembly, this module allows JavaScript developers to use mathematical special functions.

Cephes.js has become a backbone for many of the machine learning projects at NearForm Research.

Open Source & Product

Hidden Markov Model in TensorFlow.js – NearForm Research
Aug 2018

TensorFlow.js implementation of a Hidden Markov Model, which is now used to filter background noise from the V8 runtime in Node.js out of the general CPU usage signal, leaving just the main application's CPU usage.

Open Source & Product

Clinic.js Bubbleprof – NearForm Research
Jul 2018

Implemented the collection runtime and analysis backend of Clinic.js Bubbleprof, currently the most advanced tool for profiling and debugging asynchronous delays in Node.js.

Open Source & Product

First release of Clinic.js (Doctor) – NearForm Research
Jan 2018

Implemented the collection runtime, analysis backend, and frontend of Clinic.js Doctor. Clinic.js Doctor collects runtime usage data from the application runtime and uses machine learning and advanced non-parametric statistics to classify data into a recommendation for what tool to use next.

I was later involved in hiring and managing the team that now maintains it.

MSc. Thesis

Semi-supervised neural machine translation
Aug 2017

A semi-supervised neural machine translation model for small bilingual datasets. The model used the ByteNet model (Kalchbrenner et al.) together with a beam-search marginalization approach for semi-supervised learning.

Open Source

Node.js core - async_hooks
May 2017

I was a critical part of getting the async_hooks module implemented in the Node.js core runtime. This module allows users to monitor all asynchronous operations happening in the application.

Open Source

Official TensorFlow implementation of sparsemax
Feb 2017

Implemented the sparsemax operator in the TensorFlow core, as part of a course project. This involved Python, C++, and CUDA.
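For readers unfamiliar with the operator: sparsemax projects a vector of scores onto the probability simplex and, unlike softmax, can assign exactly zero probability to some entries. The following is a plain-Python sketch of the standard sort-based algorithm, written for illustration only; it is not the TensorFlow implementation, and the function name is an assumption.

```python
def sparsemax(z):
    """Euclidean projection of the scores z onto the probability simplex."""
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    support_size, support_sum = 0, 0.0
    for j, zj in enumerate(z_sorted, start=1):
        cumsum += zj
        # Entry j belongs to the support if it stays positive after thresholding.
        if 1 + j * zj > cumsum:
            support_size, support_sum = j, cumsum
    tau = (support_sum - 1) / support_size  # threshold shared by all entries
    return [max(zi - tau, 0.0) for zi in z]


print(sparsemax([1.0, 0.0, -1.0]))  # [1.0, 0.0, 0.0]
print(sparsemax([0.1, 0.6]))
```

The outputs always sum to one, and scores far below the largest one are truncated to exactly zero.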

Open Source

Dprof, asynchronous I/O profiling tool
Dec 2016

Implemented interactive profiling software for monitoring all asynchronous operations in a Node.js application. At the time, this used an internal version of async_hooks, and the tool was instrumental in debugging the async_hooks implementation in Node.js.


Benchmarking with statistics in Node.js – NodeConf EU
Nov 2016

After having introduced statistics into the Node.js open-source project for its benchmarking suite, I was invited to speak at NodeConf EU in Ireland.

The challenge was both to communicate how Welch's t-test works to people who often dislike mathematics, and to provide the psychological background for why statistics is necessary.
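As a brief illustration of the test itself, here is a sketch of Welch's t-test statistic and its Welch-Satterthwaite degrees of freedom, using only the Python standard library. The function name and the sample timings are made up for the example; this is not the Node.js benchmark tooling.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_test(a, b):
    """Return (t statistic, degrees of freedom) for two independent samples
    that are not assumed to have equal variances."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical benchmark timings (ms) before and after a change.
old = [10.2, 9.8, 10.5, 10.1, 9.9]
new = [9.1, 9.4, 8.9, 9.6, 9.0]
t, df = welch_t_test(old, new)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Turning the statistic into a p-value additionally requires the CDF of Student's t-distribution, which the standard library does not provide, so that step is left out here.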

Open Source

Node.js benchmark suite
Jun 2016

Complete refactor of the benchmark pipeline and tooling used in Node.js. This was done to add proper statistics to the micro-benchmarks used in Node.js. A big challenge was to communicate statistical concepts to programmers, particularly for a large-scale open-source project that gets new contributors very frequently.

BSc. Thesis

Story-level semantic clustering
Aug 2015

A comparison of paragraph2vec (a word2vec variant) and an LSTM encoder-decoder (Sutskever et al.), for generating semantic vectors that are precise enough to cluster documents according to the story.

The thesis also proposes a quasi-linear-time clustering algorithm, useful for dated documents such as news articles.

Open Source

Node.js core - cluster module
Jun 2012

I was the main implementer of the cluster module for Node.js. This allowed developers to run a server on multiple CPU cores. Because JavaScript is single-threaded, that was big news. Today this is less relevant because load balancers and containers have become the default scaling strategy.