Andreas Madsen


Independent Researcher
MSc Eng. – Machine Learning

I'm an independent researcher, interested in developing interpretable machine learning models, and in methodology and mathematics for understanding existing models in general.

I believe that without interpretability, machine learning in some areas is socially irresponsible. Unfortunately, I don't think there is enough research in this area, as most research revolves around beating the state-of-the-art. I want to change that, to do good.

I've published 1) at ICLR 2020, where I received a spotlight award; 2) in the SEDL workshop at NeurIPS 2019; and 3) in the Distill journal. I've been interviewed several times about my publications and work.

In the past, I have made major open-source contributions to JavaScript, such as implementing Clinic.js, which has become the de facto profiling tool and has won awards. I have also helped develop Node.js core modules and infrastructure.

I've written a blog post about my life as an Independent Researcher that went quite viral.



Importance of Textual Interpretability – LiveAI
Jul 2020

Invited talk at LiveAI on the importance of being able to explain natural language models. It covers legal perspectives, the social impact of machine learning, and my work on textual interpretability, from my Distill publication to my Python module.

Panel Discussion

NewInML workshop – ICML
Jul 2020

Took part in a panel discussion on how to navigate ML academia when you are new. The other panelists were Chelsea Finn (Stanford University), Shakir Mohamed (DeepMind), Tong Zhang (Hong Kong SciTech), Ashley Edwards (ML Collective), and Edward Raff (Booz Allen Hamilton).


Neural Arithmetic Units & Independent Researcher – TWIML AI Podcast
Jun 2020

Interview by TWIML on the process of developing my paper Neural Arithmetic Units. We discuss how to work with limited resources, the importance of collaboration, and the struggles of being an independent researcher by necessity.

Open Source

Textual Heatmap – pip package
Mar 2020

A Python library for creating the interactive textual heatmap visualization that I demonstrated in my Distill paper. The library works with Jupyter and Google Colab, making it easy for researchers to apply in their interpretability research.

Publication & Talk

Neural Arithmetic Units – ICLR 2020
Dec 2019

Proposes two new arithmetic units (addition and multiplication) that improve the state-of-the-art by 3x to 20x over existing units such as the "Neural Arithmetic Logic Unit" (NALU). The improvements were achieved by rigorous theoretical analysis. The new units allow for more interpretable models and potentially perfect extrapolation.

This paper received a spotlight award at ICLR, as it was among the top 5% of best-reviewed publications.

Open Source

lrcurve – pip package
Nov 2019

Creates a learning-curve plot for Jupyter/Colab notebooks that is updated in real time. This was first developed for a workshop at NodeConf EU; I later made it into its own pip package.


AI smartwatch badge for NodeConf EU - NearForm Research
Nov 2019

Developed the hand-gesture recognition machine learning model for the IoT smartwatch badge given out at NodeConf EU 2019. The model ran on a system-on-chip using TensorFlow Lite for Microcontrollers. This was done in collaboration with the TensorFlow team.


Measuring Arithmetic Extrapolation Performance – SEDL|NeurIPS 2019
Oct 2019

Proposes new evaluation criteria, with special confidence intervals, for extrapolation tasks. It uses these criteria in a reproduction study of the "Neural Arithmetic Logic Unit" (NALU), and shows that in some cases its performance is drastically worse than previously assumed.


Probability in TensorFlow.js – CopenhagenJS
Aug 2019

Talked about my TensorFlow.js implementation of the special functions, and especially about how to survive a really difficult programming project with lots of unknowns.

Open Source

TensorFlow.js Special Functions
Jul 2019

Implementation for TensorFlow.js of the special functions used in probability, calculus, differential equations, and more, such as the beta, gamma, zeta, and Bessel functions.


Visualizing and understanding RNNs – PracticalAI
Jun 2019

I was interviewed by PracticalAI about my Distill publication, which became highly acclaimed. I discuss the importance of interpretability and visualization, such as how we develop our intuition through interaction and the importance of testing that intuition.


Visualizing memorization in RNNs – Distill Journal
Mar 2019

Proposes a visualization method for qualitatively comparing the ability of different RNN architectures to memorize, and for understanding which parts of an input sentence drive a prediction, which is great for interpretability.

Distill is a peer-reviewed journal, chaired by Chris Olah from OpenAI, and other famous researchers.

Open Source & Product

Node.js Cephes library – NearForm Research
Sep 2018

By compiling the cephes library to WebAssembly, this module allows JavaScript developers to use mathematical special functions.

Cephes.js has become a backbone for many of the Machine Learning projects at NearForm Research.

Open Source & Product

Hidden Markov Model in TensorFlow.js – NearForm Research
Aug 2018

A TensorFlow.js implementation of a Hidden Markov Model, which is now used to filter the background noise of the V8 runtime in Node.js out of the general CPU usage signal, leaving just the main application's CPU usage.

Open Source & Product

Clinic.js Bubbleprof – NearForm Research
Jul 2018

Implemented the collection runtime and analysis backend of Clinic.js Bubbleprof, currently the most advanced tool for profiling and debugging asynchronous delays in Node.js.

Open Source & Product

First release of Clinic.js (Doctor) – NearForm Research
Jan 2018

Implemented the collection runtime, analysis backend, and frontend of Clinic.js Doctor. Clinic.js Doctor collects runtime usage data from the application runtime and uses machine learning and advanced non-parametric statistics to classify data into a recommendation for what tool to use next.

I was later involved in hiring and managing the team that now maintains it.

MSc. Thesis

Semi-supervised neural machine translation
Aug 2017

A semi-supervised neural machine translation model for small bilingual datasets. The model used the ByteNet model (Kalchbrenner et al.) together with a beam-search marginalization approach for semi-supervised learning.

Open Source

Node.js core - async_hooks
May 2017

I was a critical part of getting the async_hooks module implemented in the Node.js core runtime. This module allows users to monitor all asynchronous operations happening in the application.

Open Source

Official TensorFlow implementation of sparsemax
Feb 2017

Implemented the sparsemax operator in the TensorFlow core, as part of a course project. This involved Python, C++, and CUDA.
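
For readers unfamiliar with the operator, here is a minimal pure-Python sketch of sparsemax, following its standard published definition (sort the scores, find the support size, subtract a threshold). It is only an illustration for a single list of scores and is not the TensorFlow implementation; the function name `sparsemax` and the list-based interface are assumptions made for this example.

```python
def sparsemax(z):
    """Project a list of scores onto the probability simplex.

    Illustrative sketch only: handles one 1-D list of floats,
    unlike a batched, GPU-backed framework operator.
    """
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z_k in enumerate(z_sorted, start=1):
        cumsum += z_k
        # k belongs to the support while 1 + k * z_(k) > sum of the k largest scores.
        if 1 + k * z_k > cumsum:
            # Threshold for a support of size k; the last valid k wins.
            tau = (cumsum - 1) / k
    # Scores below the threshold become exactly zero.
    return [max(z_i - tau, 0.0) for z_i in z]


if __name__ == "__main__":
    print(sparsemax([1.0, 2.0, 3.0]))
    print(sparsemax([0.5, 0.5]))
```

Unlike softmax, the output can contain exact zeros, which is the property that makes the operator attractive for sparse attention and sparse classification.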

Open Source

Dprof, asynchronous I/O profiling tool
Dec 2016

Implemented interactive profiling software for monitoring all asynchronous operations in a Node.js application. This used, at the time, an internal version of async_hooks, and the tool was instrumental in debugging the async_hooks implementation in Node.js.


Benchmarking with statistics in Node.js – NodeConf EU
Nov 2016

After having introduced statistics into the Node.js open source project for their benchmarking suite, I was invited to speak at NodeConf EU in Ireland.

The challenge was both to communicate how Welch's t-test works to people who often dislike mathematics, and to provide the psychological background of why statistics is necessary.
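
As a rough illustration of the statistic itself, the sketch below computes Welch's t-statistic and the Welch-Satterthwaite degrees of freedom for two samples, using only the Python standard library. It is not the code used in the Node.js benchmarking suite; the function name `welch_t` and the timing values are made up for this example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test.

    `a` and `b` are sequences of measurements with at least
    two values each and a non-zero combined variance.
    """
    # Squared standard error of each sample mean.
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


if __name__ == "__main__":
    # Hypothetical benchmark timings (ms) before and after a change.
    old = [10.2, 10.8, 9.9, 10.5, 10.1]
    new = [9.6, 9.9, 9.4, 9.8, 9.7]
    t, df = welch_t(old, new)
    print(f"t = {t:.3f}, df = {df:.2f}")
```

The point of the test in a benchmarking context is that a difference in means only counts as an improvement when it is large relative to the run-to-run noise of both samples.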

BSc. Thesis

Story-level semantic clustering
Aug 2015

A comparison of paragraph2vec (a word2vec variant) and an LSTM encoder-decoder (Sutskever et al.), for generating semantic vectors that are precise enough to cluster documents according to the story.

The thesis also proposes a quasi-linear-time clustering algorithm, useful for dated documents such as news articles.