Yysrc/SSV2-Pretrained


Yysrc committed on Dec 3, 2025

Commit bf4e17a · verified · 1 parent: 0fa2c72

Create README.md

Files changed (1)

README.md +32 -0

README.md ADDED

	@@ -0,0 +1,32 @@

+---
+license: apache-2.0
+pipeline_tag: robotics
+library_name: transformers
+---
+# Mantis
+> This is the official checkpoint of **Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight**.
+- **Paper:** https://arxiv.org/pdf/2511.16175
+- **Code:** https://github.com/zhijie-group/Mantis
+### 🔥 Highlights
+- **Disentangled Visual Foresight** augments action learning without overburdening the backbone.
+- **Progressive Training** preserves the understanding capabilities of the backbone.
+- **Adaptive Temporal Ensemble** reduces inference cost while maintaining stable control.
+### How to use
+This checkpoint is the Mantis model pretrained on the [Something-Something V2 (SSV2) dataset](https://www.qualcomm.com/developer/software/something-something-v-2-dataset). For detailed usage, please refer to [our repository](https://github.com/zhijie-group/Mantis).
+### 📝 Citation
+If you find our code or models useful in your work, please cite [our paper](https://arxiv.org/pdf/2511.16175):
+```bibtex
+@article{yang2025mantis,
+  title={Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight},
+  author={Yang, Yi and Li, Xueqi and Chen, Yiyang and Song, Jin and Wang, Yihan and Xiao, Zipeng and Su, Jiadi and Qiaoben, You and Liu, Pengfei and Deng, Zhijie},
+  journal={arXiv preprint arXiv:2511.16175},
+  year={2025}
+}
+```
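
The "How to use" section of the README defers to the Mantis repository for actual usage, so the snippet below covers only the step before that: getting the checkpoint files onto disk. It is a minimal sketch, assuming the `huggingface_hub` package is installed and that the repository id is `Yysrc/SSV2-Pretrained`, as read off the page header. The helper name `download_checkpoint` and the default target directory are hypothetical and not part of the README. Loading the weights and running inference are not shown, since the README does not describe them.

```python
from pathlib import Path

# Assumption: repository id taken from the page header (owner "Yysrc", repo "SSV2-Pretrained").
REPO_ID = "Yysrc/SSV2-Pretrained"


def download_checkpoint(local_dir: str = "checkpoints/mantis-ssv2") -> Path:
    """Download all files of the checkpoint repository and return the local folder.

    The function name and the default directory are illustrative only.
    """
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # snapshot_download fetches the whole repository and returns the folder path as a string.
    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)
```

Calling `download_checkpoint()` requires network access and returns the folder that holds the downloaded files. From there, the instructions in the Mantis repository apply.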