TSP3D: Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding
This repo contains the models for the paper Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding. Code is available at: https://github.com/GWxuan/TSP3D
Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding
Wenxuan Guo*, Xiuwei Xu*, Ziwei Wang, Jianjiang Feng†, Jie Zhou, Jiwen Lu
* Equal contribution † Corresponding author
In this work, we propose an efficient multi-level convolution architecture for 3D visual grounding. TSP3D achieves superior performance compared to previous approaches in both inference speed and accuracy.
Main Results
We provide the checkpoints for quick reproduction of the results reported in the paper.
| Benchmark | Pipeline | Acc@0.25 | Acc@0.5 | Inference Speed (FPS) | Downloads |
|---|---|---|---|---|---|
| ScanRefer | Single-stage | 56.45 | 46.71 | 12.43 | model |

| Benchmark | Pipeline | Acc@0.25 | Acc@0.5 | Downloads |
|---|---|---|---|---|
| Nr3d | Single-stage | 48.7 | 37.0 | model |
| Sr3d | Single-stage | 57.1 | 44.1 | model |

[Figure: Comparison of 3DVG methods on the ScanRefer dataset]
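The accuracy columns in the results report the percentage of predictions whose 3D IoU with the ground-truth box reaches a threshold (0.25 or 0.5). The snippet below is a minimal sketch of that metric, not the evaluation code from the TSP3D repository. It assumes axis-aligned boxes given as `(xmin, ymin, zmin, xmax, ymax, zmax)` and counts a hit when IoU is at least the threshold; the function names `box_iou_3d` and `accuracy_at_iou` are hypothetical.

```python
from typing import Sequence

# Assumed box layout: (xmin, ymin, zmin, xmax, ymax, zmax), axis-aligned.
Box = Sequence[float]


def box_iou_3d(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned 3D boxes."""
    inter = 1.0
    for axis in range(3):
        lo = max(a[axis], b[axis])
        hi = min(a[axis + 3], b[axis + 3])
        if hi <= lo:
            # No overlap along this axis, so the boxes do not intersect.
            return 0.0
        inter *= hi - lo
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def accuracy_at_iou(preds: Sequence[Box], gts: Sequence[Box], threshold: float) -> float:
    """Percentage of predictions whose IoU with the paired ground truth is >= threshold."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not preds:
        return 0.0
    hits = sum(box_iou_3d(p, g) >= threshold for p, g in zip(preds, gts))
    return 100.0 * hits / len(preds)


if __name__ == "__main__":
    # Toy data for illustration only; these are not results from the paper.
    preds = [(0, 0, 0, 2, 2, 2), (0, 0, 0, 1, 1, 1), (0, 0, 0, 1, 1, 1)]
    gts = [(1, 0, 0, 3, 2, 2), (0, 0, 0, 1, 1, 1), (5, 5, 5, 6, 6, 6)]
    print(f"Acc@0.25: {accuracy_at_iou(preds, gts, 0.25):.2f}")
    print(f"Acc@0.5:  {accuracy_at_iou(preds, gts, 0.5):.2f}")
```

A stricter threshold can only lower the score, which is why the Acc@0.5 column is always at or below Acc@0.25 for the same model.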