A catalog of several million tasks Pythia can do.
We’re sharing datasets that we hope will be useful for language model interpretability.
- Token-bigram and token-trigram prediction: a dataset of n-gram statistics from The Pile (Gao et al. 2020) including tables of one and two token prompts with their most likely completions. One of the simplest “tasks” for a language model is bigram completion.
  - for example, during training, 99.8% of the time the model sees " telome", the correct next token is "res". (A sketch of how such counts can be tallied follows this list.)
- First token deletion: a dataset constructed by differencing the outputs of Pythia-2.8B (Biderman et al. 2023) between four and five token prompts. This method highlights tokens that are extremely predictive in context.
  - for example, when prompted with ", or common table", the model predicts " expression" (CTE) with probability 0.37. But if we prompt with " chloride, or common table", then the model predicts " salt" with probability 0.99. (A sketch of this differencing appears at the end of this section.)
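The bigram and trigram tables are, at heart, conditional-frequency counts over tokenized Pile text. The sketch below is a minimal illustration of that counting, assuming the corpus has already been tokenized into token-id sequences; the function and variable names are ours, not part of the released datasets or their build pipeline.

```python
from collections import Counter

def bigram_statistics(token_sequences):
    """Tally (prefix, suffix) token pairs and report, for each prefix,
    the most likely suffix and its empirical probability p(suffix | prefix).

    `token_sequences` is any iterable of token-id lists, e.g. the documents
    of The Pile after tokenization. Names here are illustrative only.
    """
    pair_counts = Counter()
    prefix_counts = Counter()
    for tokens in token_sequences:
        for prefix, suffix in zip(tokens, tokens[1:]):
            pair_counts[(prefix, suffix)] += 1
            prefix_counts[prefix] += 1

    # Keep the most probable suffix for each prefix token.
    best = {}
    for (prefix, suffix), n in pair_counts.items():
        p = n / prefix_counts[prefix]
        if prefix not in best or p > best[prefix][1]:
            best[prefix] = (suffix, p)
    return best

# Toy usage; with the full Pile, the entry for " telome" would report a
# completion probability around 0.998 for "res", as in the example above.
stats = bigram_statistics([[10, 11, 10, 11, 10, 12]])
print(stats)  # {10: (11, 0.666...), 11: (10, 1.0)}
```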
The data
In the following sections we give details on the construction and statistics of these datasets. Before continuing, we share some interactive data previews:
- Deletion: the first 25000 rows of pile_scan_4.
- Bigrams: the entirety of pile_top_bigrams, which contains bigrams with suffix probability greater than 50%.
- Trigrams: the first 25000 rows of pile_top_trigrams, which contains trigrams with suffix probability greater than 50% and count greater than 1000.
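As a rough illustration of how the trigram preview relates to a full trigram table, here is a short pandas sketch; the local file name and the column names are placeholders we chose for illustration, not the published schema.

```python
import pandas as pd

# Hypothetical full trigram-count table; "all_trigrams.parquet" and the
# column names "p_suffix" and "count" are placeholders, not the real schema.
trigrams = pd.read_parquet("all_trigrams.parquet")

# pile_top_trigrams keeps trigrams whose suffix probability exceeds 50%
# and which occur more than 1000 times.
top_trigrams = trigrams[(trigrams["p_suffix"] > 0.5) & (trigrams["count"] > 1000)]

# The interactive preview shows the first 25000 rows of that subset.
preview = top_trigrams.head(25000)
print(len(preview))
```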
The columns of the table below:
- text: the two prompts provided. The additional token of backwards context is surrounded by square brackets. The example in the introduction would be written "[_chloride],_or_common_table".
- token_short: the most likely next token predicted by Pythia-2.8B for the four token prompt.
- token_long: the most likely next token predicted by Pythia-2.8B for the five token prompt.
- p_short: the probability Pythia-2.8B assigns to token_short.
- p_long: the probability Pythia-2.8B assigns to token_long.
- JS: the Jensen-Shannon divergence between the model’s output distributions for the four and five token prompts.
Note:
- in the table, spaces are replaced with underscores for clarity.
- there are offensive tokens in the dataset. We have not removed them.
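To make the differencing concrete, here is a hedged sketch of scoring a single prompt pair with Pythia-2.8B via the Hugging Face transformers library. The helper functions, variable names, and toy prompt handling are ours; the released pile_scan_4 table was built by running this kind of comparison at scale, not with this script.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Public Pythia-2.8B checkpoint; everything else below is illustrative.
model_name = "EleutherAI/pythia-2.8b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def next_token_distribution(prompt: str) -> torch.Tensor:
    """Return the model's softmax distribution over the next token."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    return torch.softmax(logits, dim=-1)

def js_divergence(p: torch.Tensor, q: torch.Tensor) -> float:
    """Jensen-Shannon divergence: 0.5*KL(P||M) + 0.5*KL(Q||M), M = (P+Q)/2."""
    m = 0.5 * (p + q)
    def kl(a, b):
        return float(torch.sum(a * (torch.log(a + 1e-12) - torch.log(b + 1e-12))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# The five-token prompt and its four-token suffix from the example above.
p_long = next_token_distribution(" chloride, or common table")
p_short = next_token_distribution(", or common table")

for name, dist in [("short", p_short), ("long", p_long)]:
    prob, token_id = dist.max(dim=-1)
    print(name, repr(tokenizer.decode(token_id)), float(prob))
print("JS:", js_divergence(p_short, p_long))
```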