LIT: Language Interpretability Tool for Explaining NLP Models

The Language Interpretability Tool (LIT) is a browser-based UI and toolkit for model interpretability. It is an open-source platform for visualizing and understanding NLP models, developed by the Google Research team.

LIT supports a range of model types, including regression, classification, seq2seq, language modelling, and structured prediction.

It is framework-agnostic and compatible with TensorFlow, PyTorch, and more.

Components of LIT are portable, and can easily be used in a Jupyter notebook or standalone script.

Example LIT structure for classification model

LIT can be installed with pip:

pip install lit-nlp

For further installation steps and a demo run guide, including how to set up and run LIT on a local server, refer to the LIT Setup Guide (

How to use LIT to analyze different types of models?

General layout: LIT consists of a Python backend and a TypeScript frontend.

Main UI components:

Global settings: for selecting multiple models, selecting a dataset, and configuring the layout.

Main menu bar: for changing colors, selecting slices, and comparing datasets.

Top module section: for data selection.

Bottom module section: for local explanation, interpretability, and new datapoint generation.

Data Table / Datapoint Selection: LIT displays a loaded dataset and its model results across the set of selected models. Users can dive into detailed results by selecting datapoints from the dataset.


Datapoint Editor: The datapoint editor lets you edit the currently selected datapoint, or create a new one with the “Make new datapoint” button. Any edit to an existing datapoint must be saved as a new datapoint before it can be explored, which keeps datapoints immutable and simpler to work with.
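
The "edit becomes a new datapoint" idea can be pictured with a small Python sketch. This is not LIT's own code: the Datapoint class and its field names are hypothetical and only illustrate why immutable records make before/after comparison easy.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Datapoint:
    """A hypothetical immutable record; the field names are illustrative only."""
    sentence: str
    label: str


original = Datapoint(sentence="The movie was great", label="positive")

# An "edit" never mutates the original; it produces a new datapoint instead.
edited = replace(original, sentence="The movie was terrible")

# Both versions stay available, so their predictions can be compared later.
dataset = [original, edited]
```

Trying to assign to original.sentence directly would raise an error, which is the property the editor relies on.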

Slice Editor: You can save one or more selected datapoints as a slice in the Slice Editor for comparison with other models or datapoints. A slice works like a bookmark that you can revisit later to compare results.

LIT provides different types of modules for model explanation.

Modules are displayed automatically when they are applicable to the current model and dataset; for example, the module shown below applies to classification results only.

Embedding Projector: The embedding projector shows all datapoints by their embeddings, projected down to 3 dimensions. This is useful for exploring and understanding clusters of datapoints.

LIT provides 2 types of embedding projection:

UMAP: Like t-SNE, UMAP is a dimensionality reduction technique designed for visualizing complex data in low dimensions (2D or 3D). As the number of datapoints increases, UMAP becomes more time-efficient than t-SNE.

PCA: Used here to reduce the dimensionality of the embeddings. In short, it is a feature extraction technique: it combines the original variables into new ones (principal components) and then drops the least important components while retaining most of the variance in the data.
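
To make the PCA step concrete, here is a minimal pure-Python sketch that projects vectors onto their top 3 principal components using power iteration with deflation. It is an illustration under stated assumptions, not the projector LIT ships: the function names and the toy 4-dimensional "embeddings" are made up for this example.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(matrix, iterations=500, seed=0):
    """Approximate the leading eigenvector of a symmetric matrix by power iteration."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            return [0.0] * d  # no variance left in any direction
        v = [x / norm for x in w]
    return v


def pca_project(rows, k=3):
    """Project rows onto their first k principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        v = top_component(cov)
        # The Rayleigh quotient gives the variance captured along v.
        cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        eigenvalue = sum(v[i] * cov_v[i] for i in range(d))
        components.append(v)
        # Deflation: remove the captured direction before finding the next one.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(r[j] * c[j] for j in range(d)) for c in components]
            for r in centered]


# Toy 4-dimensional "embeddings" for ten datapoints (made-up values).
rows = [[float(i), 2.0 * i, 0.5 * (i % 3), 0.1 * (i % 2)] for i in range(10)]
projected = pca_project(rows, k=3)  # ten points, each with 3 coordinates
```

The first output coordinate carries the most variance and each later one carries less, which is what "dropping the least important components" amounts to.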

Note: The color of the embedding points can be changed from the main menu bar, using either the predicted class, the original class, or labels.


Performance Tab

Metrics Table: Shows measures such as accuracy (for classifiers), error (for regression tasks), and BLEU score (for translation tasks) for the selected model.

The Metrics module shows model metrics not just across the entire dataset, but also for the current selection of datapoints (selected from the data table or from a slice).
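
As a rough illustration of "metrics on the whole dataset versus metrics on a selection", the sketch below computes accuracy both ways. The labels, predictions, and selected indices are made-up values, and the accuracy helper is written for this example rather than taken from LIT.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the gold labels."""
    if not labels:
        return float("nan")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical gold labels and model predictions for six datapoints.
labels = ["pos", "neg", "pos", "neg", "pos", "neg"]
predictions = ["pos", "neg", "neg", "neg", "pos", "pos"]

# Indices of the datapoints currently selected (for example, a saved slice).
selection = [0, 1, 4]

overall = accuracy(labels, predictions)
on_selection = accuracy([labels[i] for i in selection],
                        [predictions[i] for i in selection])

print(f"all data: {overall:.2f}, selection: {on_selection:.2f}")
```

Seeing the two numbers side by side is what makes it possible to spot slices where the model does better or worse than average.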

Confusion Matrix: Shown for classification models only, the confusion matrix covers all data in the dataset (or the currently selected datapoints).
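
For readers who have not built one by hand, a confusion matrix is just a count of (gold label, predicted label) pairs. The sketch below uses made-up labels and predictions and is independent of LIT.

```python
from collections import Counter


def confusion_matrix(labels, predictions):
    """Count how often each (gold label, predicted label) pair occurs."""
    return Counter(zip(labels, predictions))


# Hypothetical gold labels and model predictions for six datapoints.
labels = ["pos", "neg", "pos", "neg", "pos", "neg"]
predictions = ["pos", "neg", "neg", "neg", "pos", "pos"]

matrix = confusion_matrix(labels, predictions)
classes = sorted(set(labels) | set(predictions))

# Rows are gold labels, columns are predicted labels.
print("gold/pred", *classes)
for gold in classes:
    print(gold, *[matrix[(gold, pred)] for pred in classes])
```

Off-diagonal cells, such as gold "pos" predicted as "neg", are the errors worth selecting and inspecting.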

Predictions Tab

Classification Results: Shows the model’s results on the selected datapoint.

Scalars: Shows the overall predicted results and the probabilities for the selected datapoint.

Binary classification probability with threshold setting
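
The effect of moving the threshold can be sketched in a few lines. The scores are made up, and treating a score equal to the threshold as positive is an assumption of this example, not a statement about how LIT handles ties.

```python
def classify(positive_probability, threshold=0.5):
    """Label a datapoint positive when its score reaches the threshold."""
    return "positive" if positive_probability >= threshold else "negative"


# Hypothetical positive-class probabilities for four datapoints.
scores = [0.15, 0.48, 0.62, 0.91]

default_labels = [classify(s) for s in scores]
strict_labels = [classify(s, threshold=0.7) for s in scores]

print(default_labels)
print(strict_labels)
```

Raising the threshold from 0.5 to 0.7 flips the datapoint scored 0.62 from positive to negative while leaving the others unchanged.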

Salience Maps: Salience maps show the influence of different parts of the input features on a model’s prediction for the primary selection.

The background of each text piece is colored by the salience of that piece on the prediction, and hovering over any piece displays the exact value calculated for it.

Saliency map for positive and negative sentiments

Grad norm Saliency: Processes the token gradients and tokens returned by a model and produces a score for each token.

Grad dot Saliency: Requires a token embeddings input and the corresponding output, as well as a label field (Target) that pins the gradient target to the same class as the input and corresponding output.
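
The two gradient-based scores above can be sketched as follows, assuming the model has already returned one embedding vector and one gradient vector per token. The numbers are made up, the function names are hypothetical, and normalizing the norms so that they sum to 1 is an assumption of this sketch.

```python
import math


def grad_norm_saliency(token_grads):
    """L2 norm of each token's gradient vector, normalized to sum to 1."""
    norms = [math.sqrt(sum(g * g for g in grad)) for grad in token_grads]
    total = sum(norms)
    return [n / total if total else 0.0 for n in norms]


def grad_dot_input_saliency(token_embs, token_grads):
    """Dot product of each token's embedding with its gradient (signed score)."""
    return [sum(e * g for e, g in zip(emb, grad))
            for emb, grad in zip(token_embs, token_grads)]


tokens = ["the", "movie", "was", "great"]
# Hypothetical 2-dimensional embeddings and gradients, one row per token.
token_embs = [[0.1, 0.0], [0.5, 0.5], [0.0, 0.1], [1.0, 2.0]]
token_grads = [[0.0, 0.0], [3.0, 4.0], [0.0, 0.0], [6.0, 8.0]]

norm_scores = grad_norm_saliency(token_grads)
dot_scores = grad_dot_input_saliency(token_embs, token_grads)

for token, n, d in zip(tokens, norm_scores, dot_scores):
    print(f"{token:>6}  grad-norm={n:.2f}  grad-dot-input={d:.2f}")
```

The norm-based score is always non-negative, while the dot-product score is signed, so it can show whether a token pushes the prediction toward or away from the target class.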

Integrated Gradients: The feature attribution values are calculated by taking the integral of the gradients along a straight path from a baseline to the input being analyzed. It requires the same inputs as Grad dot Saliency.
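
The path integral can be approximated numerically. The sketch below does this for a tiny logistic model whose gradient is known in closed form; the model, its weights, and the all-zeros baseline are assumptions chosen for illustration, and a real NLP model would obtain its gradients from its framework's autodiff instead.

```python
import math


def model(x, weights, bias):
    """A toy logistic model: sigmoid(w . x + b)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x, weights, bias):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Midpoint Riemann-sum approximation of the straight-line path integral."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = gradient(point, weights, bias)
        totals = [t + g for t, g in zip(totals, grad)]
    # Scale the averaged gradient by the distance travelled in each feature.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


weights, bias = [2.0, -1.0, 0.5], 0.1
x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, weights, bias)
output_change = model(x, weights, bias) - model(baseline, weights, bias)

print(attributions, sum(attributions), output_change)
```

A useful sanity check is the completeness property from the Integrated Gradients paper: the attributions should sum, up to approximation error, to the difference between the model output at the input and at the baseline.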

LIME Saliency: Scores each word by how much it influenced the prediction, positively or negatively.

For a more detailed explanation of the mathematics behind the saliency calculation, refer to Axiomatic Attribution for Deep Networks (

See also: Lime

Attention: Displays an attention visualization for each layer and head. It shows which tokens are attended to between layers of a model.


Counterfactuals: New datapoints can be generated through manual edits or automatically from existing data, to dynamically create and evaluate new examples.


Scrambler: Scrambles the words in a text feature randomly.
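
A scrambler of this kind can be sketched in a few lines. This is not LIT's implementation: the scramble function and its seed parameter are hypothetical, and splitting on whitespace is a simplifying assumption.

```python
import random


def scramble(text, seed=None):
    """Return the words of `text` in a random order."""
    words = text.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)


original = "the movie was surprisingly good"
scrambled = scramble(original, seed=13)  # a fixed seed makes the result repeatable

print(scrambled)
```

If the model's prediction barely changes on the scrambled text, that suggests it relies on which words are present more than on their order.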

Word Replacer: Provides a text box to define a comma-separated set of replacements to perform (such as “great -> terrible, hi -> hello”).

The word replacer also supports multiple targets per word with the “|” separator. For example, “great -> terrible | bad” will produce two outputs, one where “great” is replaced with “terrible” and one where it is replaced with “bad”.
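
The replacement syntax described above can be parsed and applied roughly as follows. This is a sketch, not LIT's generator: the function names are hypothetical, and replacing every whole-word occurrence in a single output is an assumption of this example.

```python
import re


def parse_rules(spec):
    """Parse 'a -> b | c, d -> e' into {'a': ['b', 'c'], 'd': ['e']}."""
    rules = {}
    for rule in spec.split(","):
        if "->" not in rule:
            continue
        source, targets = rule.split("->", 1)
        rules[source.strip()] = [t.strip() for t in targets.split("|") if t.strip()]
    return rules


def generate(text, spec):
    """Return one new text per (source, target) pair whose source word occurs."""
    outputs = []
    for source, targets in parse_rules(spec).items():
        pattern = re.compile(r"\b" + re.escape(source) + r"\b")
        if not pattern.search(text):
            continue
        for target in targets:
            # A function replacement avoids treating backslashes in target specially.
            outputs.append(pattern.sub(lambda match, t=target: t, text))
    return outputs


results = generate("the food was great", "great -> terrible | bad, hi -> hello")
print(results)
```

Here the rule for "hi" produces nothing because the word does not occur in the text, while the rule for "great" yields one output per target.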

Counterfactual Explanation: This is similar to a salience map, but the influence here is calculated by comparing the model results on this datapoint with the results on the rest of the selected datapoints.


Note: The model interpretability components above are specific to classification problems.


- Easy to use: Users can switch between visualizations and analyses to test hypotheses and validate them on a datapoint.

- Dynamic data generation: New datapoints can be added using counterfactual generators or the datapoint editor, and their effect on the model can be visualized side by side.

- Multi-model comparison: Two similar models, or two datapoints, can be visualized simultaneously. Loading more than one model in the global settings controls lets LIT compare multiple models.

- Explain model behavior: LIT helps answer questions such as: What happens if I change things like textual style or verb tense in my text? How will that change the prediction? What kinds of examples does my model perform poorly on? Is linguistic knowledge learned or ignored?

- Configurable: The documentation gives a brief overview of how to run LIT with your own models and datasets. LIT is easy to extend with new interpretability components, generators, and more, on both the frontend and the backend. For more details, see the LIT documentation in the code repository.


PAIR-code/lit: The Language Interpretability Tool: Interactively analyze NLP models for model understanding in an extensible and framework agnostic interface. (

LIT — Demos (
