SII Text 2 Image

university

https://inst-it.github.io/

Row11n

updated a model 6 months ago

SII-T2I/util_models

Updated Jun 29, 2025

Row11n

published a model 6 months ago

SII-T2I/util_models

Updated Jun 29, 2025

wjpoom

authored a paper 9 months ago

CoMP: Continual Multimodal Pre-training for Vision Foundation Models

Paper • 2503.18931 • Published Mar 24, 2025 • 30

Row11n

authored 3 papers 9 months ago

Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning

Paper • 2412.03565 • Published Dec 4, 2024 • 10

Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection

Paper • 2412.17800 • Published Dec 23, 2024

CoMP: Continual Multimodal Pre-training for Vision Foundation Models

Paper • 2503.18931 • Published Mar 24, 2025 • 30

wjpoom

authored 2 papers about 1 year ago

Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning

Paper • 2412.03565 • Published Dec 4, 2024 • 10

Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding

Paper • 2312.00081 • Published Nov 30, 2023 • 2