Textual Interpretability

Making Machine Learning
more socially responsible,
one step at a time.


"We have developed an amazing machine learning model, for predicting if our customers will be able to pay back their loan" – Bank* Representative

*: You probably shouldn't trust anything said.

"One small issue: our customer-facing advisors don't understand or trust the model and sometimes go against it." – Bank Spokesperson

"Oh, we just did some visual discretizations and told them to trust it." – Bank Representative

Needless to say, they missed the point.

GDPR, Article 13

§2: "... the controller shall ... provide the data subject with the ... information necessary to ensure fair and transparent processing".

§2.f: such as "the existence of automated decision-making ... and, at least in those cases, meaningful information about the logic involved ...".


Nested LSTM

You can't say anything about memorization from a single hidden unit.

But then how?

Textual Saliency

"Intuition is developed from a
feedback loop."

"Interactive visualization provides a feedback loop."

Confirmation Bias

"Confirmation bias is when we interpret or favor information that confirms our personal beliefs."

Transform your intuition into a quantifiable hypothesis!

Computer Vision

tl;dr: most methods are not better than a random baseline.

Why were so many papers published, all claiming improved methods?

Confirmation bias!
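
One way to turn "this explanation looks right" into a quantifiable hypothesis is to compare it against a random baseline. The sketch below is a minimal illustration for text, not the benchmark behind the computer vision result above. Everything in it is an assumption made for the example: the toy linear model (`WEIGHTS`, `score`), the occlusion-based importance measure, and the helper names. The hypothesis it tests is: removing the k tokens an explanation ranks highest changes the model output more than removing k random tokens.

```python
import random

# Hypothetical toy "model": a linear sentiment score over token weights.
# A real model would replace this function; the weights are made up.
WEIGHTS = {"good": 1.5, "great": 2.0, "bad": -1.5, "not": -2.0,
           "very": 0.3, "movie": 0.1, "the": 0.0, "was": 0.0}


def score(tokens):
    """Model output for a list of tokens (unknown tokens count as 0)."""
    return sum(WEIGHTS.get(t, 0.0) for t in tokens)


def importance_by_occlusion(tokens):
    """Importance of position i = |change in score when token i is removed|."""
    base = score(tokens)
    return [abs(base - score(tokens[:i] + tokens[i + 1:]))
            for i in range(len(tokens))]


def change_after_removal(tokens, ranking, k):
    """Absolute change in score after removing the k top-ranked positions."""
    removed = set(ranking[:k])
    kept = [t for i, t in enumerate(tokens) if i not in removed]
    return abs(score(tokens) - score(kept))


def compare_to_random(tokens, k, trials=1000, seed=0):
    """Return (change using the explanation, mean change using random rankings)."""
    rng = random.Random(seed)
    importance = importance_by_occlusion(tokens)
    method_ranking = sorted(range(len(tokens)), key=lambda i: -importance[i])
    method_change = change_after_removal(tokens, method_ranking, k)

    random_changes = []
    for _ in range(trials):
        ranking = list(range(len(tokens)))
        rng.shuffle(ranking)  # random baseline: arbitrary removal order
        random_changes.append(change_after_removal(tokens, ranking, k))
    return method_change, sum(random_changes) / trials


if __name__ == "__main__":
    tokens = "the movie was very good".split()
    method_change, random_mean = compare_to_random(tokens, k=1)
    print("explanation:", method_change, "random baseline:", random_mean)
```

On a toy linear model the explanation beats the baseline almost by construction, so the numbers here prove nothing. The point is the procedure: the claim is stated up front and can fail, which is what guards against confirmation bias.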


  • Information density?
  • Noise expressions?
  • Discreteness?

Tools for research


Use these tools to explain models,
but be hyper-aware of confirmation bias.

Always transform your intuition into
a quantifiable hypothesis.