Model description

SiT (Self-supervised vIsion Transformer) is a vision transformer pretrained with a self-supervised objective that combines masked image modeling and contrastive learning. This checkpoint was pretrained on ImageNet-1K.
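To make the combination of the two objectives concrete, here is a dependency-free sketch in plain Python. It is not the authors' implementation: the function names, the L1 reconstruction term, the InfoNCE form of the contrastive term, the temperature of 0.2 and the weighting factor `alpha` are all illustrative assumptions.

```python
import math
import random


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a list of booleans, True where a patch is hidden from the model."""
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def masked_l1_loss(pred, target, mask):
    """Mean absolute error over the masked patches only (reconstruction term)."""
    errors = [
        abs(p - t)
        for pred_patch, target_patch, is_masked in zip(pred, target, mask)
        if is_masked
        for p, t in zip(pred_patch, target_patch)
    ]
    return sum(errors) / len(errors) if errors else 0.0


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(view_a, view_b, temperature=0.2):
    """InfoNCE: embedding i of view_a should match embedding i of view_b."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidates in view_b.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


def combined_loss(pred, target, mask, view_a, view_b, alpha=1.0):
    """Reconstruction term plus a weighted contrastive term (alpha is hypothetical)."""
    return masked_l1_loss(pred, target, mask) + alpha * info_nce_loss(view_a, view_b)
```

In this sketch, `pred` and `target` are lists of patches (each a list of pixel values), and `view_a` and `view_b` hold one embedding per image for two augmented views of the same batch.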

Model Sources

Model Card Authors

Sara Atito, Muhammad Awais, Josef Kittler

How to use

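# modeling_sit is a standalone module, not part of transformers; it must be on the Python path
from modeling_sit import ViTSiTForPreTraining

# Load the pretrained weights from the Hugging Face Hub
model = ViTSiTForPreTraining.from_pretrained("erow/SiT")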

BibTeX entry and citation info

@inproceedings{atito2023sit,
  title={SiT is all you need},
  author={Atito, Sara and Awais, Muhammad and Nandam, Srinivasa and Kittler, Josef},
  booktitle={2023 IEEE International Conference on Image Processing (ICIP)},
  pages={2125--2129},
  year={2023},
  organization={IEEE}
}

Safetensors

Model size: 41.8M params
Tensor type: F32
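The parameter count and tensor type reported above give a rough estimate of how much memory the weights occupy. The helper below is a hypothetical illustration, not part of any library; it ignores file headers and metadata, and the parameter count is the rounded figure from this card.

```python
# Bytes per element for common tensor types; F32 is the type reported in this card.
BYTES_PER_ELEMENT = {"F32": 4, "F16": 2, "BF16": 2}


def weights_size_bytes(num_params, tensor_type="F32"):
    """Approximate size of the raw weights, ignoring headers and metadata."""
    return num_params * BYTES_PER_ELEMENT[tensor_type]


num_params = 41_800_000  # 41.8M params, rounded as reported
size = weights_size_bytes(num_params)
print(f"{size / 1e6:.1f} MB ({size / 2**20:.1f} MiB)")
```

At 4 bytes per parameter this comes to roughly 167 MB for the F32 weights.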

Paper for erow/SiT

SiT: Self-supervised vIsion Transformer (arXiv:2104.03602, published Apr 8, 2021)