arxiv:2510.07654

Once Is Enough: Lightweight DiT-Based Video Virtual Try-On via One-Time Garment Appearance Injection

Published on Oct 9, 2025

Authors:

Abstract

A novel video virtual try-on method called OIE uses first-frame clothing replacement with content control and pose/mask guidance for efficient temporal video synthesis.

AI-generated summary

Video virtual try-on aims to replace the clothing of a person in a video with a target garment. Current dual-branch architectures have achieved significant success in diffusion models based on the U-Net; however, adapting them to diffusion models built upon the Diffusion Transformer remains challenging. Initially, introducing latent space features from the garment reference branch requires adding or modifying the backbone network, leading to a large number of trainable parameters. Subsequently, the latent space features of garments lack inherent temporal characteristics and thus require additional learning. To address these challenges, we propose a novel approach, OIE (Once is Enough), a virtual try-on strategy based on first-frame clothing replacement: specifically, we employ an image-based clothing transfer model to replace the clothing in the initial frame, and then, under the content control of the edited first frame, utilize pose and mask information to guide the temporal prior of the video generation model in synthesizing the remaining frames sequentially. Experiments show that our method achieves superior parameter efficiency and computational efficiency while still maintaining leading performance under these constraints.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2510.07654 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2510.07654 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2510.07654 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.