This repository contains some of the matrices described in:

* Alexander Yom Din, Taelin Karidi, Leshem Choshen, Mor Geva. 2023. Jump to Conclusions: Short-Cutting Transformers With Linear Transformations. ([arXiv:2303.09435](https://arxiv.org/abs/2303.09435))

Please cite the paper as:

```bibtex
@article{din2023jump,
  title={Jump to Conclusions: Short-Cutting Transformers With Linear Transformations},
  author={Yom Din, Alexander and Karidi, Taelin and Choshen, Leshem and Geva, Mor},
  journal={arXiv preprint arXiv:2303.09435},
  year={2023}
}
```

For example, the file `gpt2-medium/wikipedia/6_9.pickle` contains the matrix trained on the Wikipedia dataset to transform 6th-layer hidden representations of tokens into 9th-layer hidden representations for the Hugging Face transformers `gpt2-medium` model. One loads and multiplies as follows:

```python
import pickle

import torch

file_name = 'gpt2-medium/wikipedia/6_9.pickle'

# Load the trained transformation matrix.
with open(file_name, 'rb') as f:
    mat = pickle.load(f)

# The matrix is a square 2-D torch tensor.
assert isinstance(mat, torch.Tensor)
assert len(mat.shape) == 2
assert mat.shape[0] == mat.shape[1]

# Apply it to a (random) hidden-representation vector.
v = torch.rand(mat.shape[1])
w = mat @ v
assert w.shape == v.shape
```

Some more information is available at [https://github.com/sashayd/mat](https://github.com/sashayd/mat).
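
As an illustration of how such a matrix might be applied to actual model activations, here is a minimal sketch, not taken from the paper's code. It assumes that the layer indices in the file name correspond to indices into the `hidden_states` tuple returned by `transformers` (with `hidden_states[0]` being the embedding output); the input sentence and the cosine-similarity comparison are purely illustrative.

```python
import pickle

import torch
from transformers import AutoModel, AutoTokenizer

# Assumed path; any of the pickled matrices in this repository would do.
file_name = 'gpt2-medium/wikipedia/6_9.pickle'
with open(file_name, 'rb') as f:
    mat = pickle.load(f)

tokenizer = AutoTokenizer.from_pretrained('gpt2-medium')
model = AutoModel.from_pretrained('gpt2-medium')
model.eval()

inputs = tokenizer('Jumping to conclusions can be useful.', return_tensors='pt')
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Assumption: hidden_states[i] is the output of block i,
# with hidden_states[0] being the embedding-layer output.
h6 = out.hidden_states[6]  # shape: (batch, seq_len, hidden_dim)
h9 = out.hidden_states[9]

# `mat @ v` for a single vector v generalizes to `h @ mat.T`
# for a batch of hidden states h.
h9_pred = h6 @ mat.T.to(h6.dtype)

# Compare the linear short-cut prediction with the true layer-9 states.
print(torch.nn.functional.cosine_similarity(h9_pred, h9, dim=-1))
```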