This repository explores the behaviour of in-context learning, the mechanism by which GPT-style models adjust their predictions based on additional data provided in the context. Several papers compare in-context learning to gradient descent (von Oswald et al., 2022a): each transformer layer corresponds to one gradient-descent step performed implicitly inside the model. This behaviour emerges when the transformer is trained in a meta-learning fashion, by optimizing over a distribution of regression datasets.
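As a rough, self-contained sketch of that meta-training setup (the function name and sizes below are illustrative, not the repository's actual code): every training step samples a fresh linear-regression task, so the transformer can only succeed by learning to regress from its context.

```python
import numpy as np

def sample_regression_task(n_points=10, dim=4, rng=None):
    """Sample one in-context regression dataset y = x @ w with a task-specific w."""
    rng = rng if rng is not None else np.random.default_rng()
    w = rng.normal(size=dim)               # ground-truth weights for this task only
    x = rng.normal(size=(n_points, dim))   # in-context inputs
    y = x @ w                              # in-context targets
    return x, y, w

# Meta-training skeleton: each step sees a freshly sampled task, so the model
# cannot memorise a single weight vector and must infer it from the context.
rng = np.random.default_rng(0)
for step in range(3):
    x_ctx, y_ctx, w = sample_regression_task(rng=rng)
    # ... feed (x_ctx, y_ctx) plus a query point to the transformer here ...
```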
In `in-context learning mechanism.ipynb`
I reimplement a simple version of the transformer from von Oswald et al., 2022a, to explore its similarities to gradient descent and the structure of the different projections (Olsson et al., 2022b).
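For intuition, here is a minimal sketch (not the notebook's implementation) of the construction at the core of that comparison: a single softmax-free self-attention layer over tokens e_j = (x_j, y_j) whose hand-set projections make the output on a query token equal the prediction after one gradient-descent step from zero weights.

```python
import numpy as np

def linear_attention_gd_step(tokens, lr=0.1):
    """One softmax-free self-attention pass over tokens e_j = (x_j, y_j).

    Keys and queries read the x part of each token, values read the y part,
    and the result is added back into the y slot; for a query token whose
    y slot is 0, the output equals one gradient-descent prediction.
    """
    n = tokens.shape[0]
    x, y = tokens[:, :-1], tokens[:, -1]
    scores = x @ x.T                       # unnormalised dot-product attention
    out = tokens.copy()
    out[:, -1] += lr / n * scores @ y      # value aggregation = GD correction
    return out

# Compare against an explicit gradient-descent step w1 = (lr / n) * X^T y from w0 = 0.
rng = np.random.default_rng(0)
x_ctx = rng.normal(size=(8, 4))
y_ctx = x_ctx @ rng.normal(size=4)
x_q = rng.normal(size=4)
tokens = np.vstack([np.column_stack([x_ctx, y_ctx]), np.append(x_q, 0.0)])
lr = 0.1
attn_pred = linear_attention_gd_step(tokens, lr=lr)[-1, -1]
w1 = lr / len(tokens) * tokens[:, :-1].T @ tokens[:, -1]
print(np.isclose(attn_pred, x_q @ w1))     # True: the two predictions coincide
```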
In `general-purpose in-context learner.ipynb`
I continue the exploration of in-context learning to see how it can be used more explicitly for meta-learning (Kirsch et al., 2022). The interesting element of this method is that, by augmenting the training set and meta-learning over the resulting task distribution (see the sketch below), the model generalizes to out-of-distribution datasets in the few-shot learning setting. I am currently finishing up the latest experiments.
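As a rough illustration of that kind of augmentation (a hedged sketch; the exact transformations used in the notebook and in Kirsch et al. may differ): random input projections and label permutations turn one base dataset into a whole distribution of tasks, so the learner has to adapt from context rather than memorise a fixed mapping.

```python
import numpy as np

def augment_task(x, y, num_classes, rng):
    """Derive a new task from an existing dataset via a random linear projection
    of the inputs and a random permutation of the class labels."""
    proj = rng.normal(size=(x.shape[1], x.shape[1])) / np.sqrt(x.shape[1])
    label_perm = rng.permutation(num_classes)
    return x @ proj, label_perm[y]

# Usage: turn one base dataset into an effectively unbounded task distribution.
rng = np.random.default_rng(0)
x_base = rng.normal(size=(32, 16))        # placeholder features
y_base = rng.integers(0, 5, size=32)      # placeholder integer labels
x_task, y_task = augment_task(x_base, y_base, num_classes=5, rng=rng)
```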
In recent work, Minegishi et al. (2025) extend the analysis of small-scale transformer models for meta in-context learning and hypothesize that the induction head alone cannot explain all of the (meta) in-context learning that happens within LLMs. There is also work exploring in-context learning in larger transformers; the most prominent example is the Titans architecture (Behrouz et al., 2024).
Install the packages using the `requirements.txt` file:
```bash
# using conda
conda create --name icl python=3.11
conda activate icl
# Install the dependencies listed in requirements.txt
pip install -r requirements.txt
# Install the package for meta_icl imports
pip install -e .
# Or run a notebook directly
```
Kirsch, L., Harrison, J., Sohl-Dickstein, J., & Metz, L. (2022). General-Purpose In-Context Learning by Meta-Learning Transformers (arXiv:2212.04458). arXiv. http://arxiv.org/abs/2212.04458
Han, S., Song, J., Gore, J., & Agrawal, P. (2024). Emergence of Abstractions: Concept Encoding and Decoding Mechanism for In-Context Learning in Transformers (arXiv:2412.12276). arXiv. https://doi.org/10.48550/arXiv.2412.12276
von Oswald, J., Niklasson, E., Randazzo, E., Sacramento, J., Mordvintsev, A., Zhmoginov, A., & Vladymyrov, M. (2022a). Transformers Learn In-Context by Gradient Descent (arXiv:2212.07677). arXiv. https://arxiv.org/abs/2212.07677v2
Olsson, C., Elhage, N., Nanda, N., et al. (2022b). In-context Learning and Induction Heads. Transformer Circuits Thread. https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html
Minegishi, G., Furuta, H., Taniguchi, S., Iwasawa, Y., & Matsuo, Y. (2025). In-Context Meta Learning Induces Multi-Phase Circuit Emergence. OpenReview. https://openreview.net/forum?id=LNMfzv8TNb
Behrouz, A., Zhong, P., & Mirrokni, V. (2024). Titans: Learning to Memorize at Test Time (arXiv:2501.00663). arXiv. https://doi.org/10.48550/arXiv.2501.00663