Hugging Face's logo Hugging Face
  • Models
  • Datasets
  • Spaces
  • Docs
  • Enterprise
  • Pricing

  • Log In
  • Sign Up
Konstantinos Kakkavas's picture
6 9

Konstantinos Kakkavas

kkakkavas
·
  • kkakkavas

AI & ML interests

- NLP - CV - docVQA

Organizations

Captonomy team's profile picture

upvoted 3 papers 12 months ago

Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments

Paper • 2501.10893 • Published Jan 18, 2025 • 26

Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks

Paper • 2501.11733 • Published Jan 20, 2025 • 28

UI-TARS: Pioneering Automated GUI Interaction with Native Agents

Paper • 2501.12326 • Published Jan 21, 2025 • 64
upvoted 2 papers about 1 year ago

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1, 2025 • 109

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published Jan 7, 2025 • 52
upvoted a collection over 1 year ago

Table Transformer

Collection
The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images. • 5 items • Updated May 1, 2025 • 26
Company
TOS Privacy About Careers
Website
Models Datasets Spaces Pricing Docs