SOTAVerified

Descriptive

Papers

Showing 1–25 of 1477 papers

Title | Status | Hype
Visually Descriptive Language Model for Vector Graphics Reasoning | Code | 9
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Code | 7
AudioGen: Textually Guided Audio Generation | Code | 6
Fundamental Components of Deep Learning: A category-theoretic approach | Code | 5
ReMEmbR: Building and Reasoning Over Long-Horizon Spatio-Temporal Memory for Robot Navigation | Code | 3
Descriptive Image Quality Assessment in the Wild | Code | 3
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation | Code | 3
Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey | Code | 3
Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation | Code | 3
Fine-Tuning Language Models from Human Preferences | Code | 3
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data | Code | 3
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning | Code | 2
ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model | Code | 2
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning | Code | 2
MedCalc-Bench: Evaluating Large Language Models for Medical Calculations | Code | 2
K-LITE: Learning Transferable Visual Models with External Knowledge | Code | 2
GRiT: A Generative Region-to-text Transformer for Object Understanding | Code | 2
Language-driven Semantic Segmentation | Code | 2
Fine-grained Image Captioning with CLIP Reward | Code | 2
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression | Code | 2
An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control | Code | 2
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification | Code | 2
FlashSloth : Lightning Multimodal Large Language Models via Embedded Visual Compression | Code | 2
Customization Assistant for Text-to-image Generation | Code | 2
Composed Image Retrieval for Remote Sensing | Code | 2

No leaderboard results yet.