Descriptive

Papers

Showing 1–25 of 1477 papers

Title	Date	Tasks	Status	Hype
Visually Descriptive Language Model for Vector Graphics Reasoning	Apr 9, 2024	Descriptive, Language Modeling	Code Available	9
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy	Mar 21, 2024	Contrastive Learning, Descriptive	Code Available	7
AudioGen: Textually Guided Audio Generation	Sep 30, 2022	Audio Generation, Descriptive	Code Available	6
Fundamental Components of Deep Learning: A category-theoretic approach	Mar 13, 2024	Deep Learning, Descriptive	Code Available	5
ReMEmbR: Building and Reasoning Over Long-Horizon Spatio-Temporal Memory for Robot Navigation	Sep 20, 2024	Descriptive, Question Answering	Code Available	3
Descriptive Image Quality Assessment in the Wild	May 29, 2024	Descriptive, Image Quality Assessment	Code Available	3
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation	Jun 2, 2025	4k, Descriptive	Code Available	3
Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey	Dec 3, 2024	Change Detection, Descriptive	Code Available	3
Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation	Apr 15, 2024	Contrastive Learning, Descriptive	Code Available	3
Fine-Tuning Language Models from Human Preferences	Sep 18, 2019	Descriptive, Language Modelling	Code Available	3
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data	Feb 2, 2024	Contrastive Learning, Descriptive	Code Available	3
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning	Nov 21, 2022	3D Classification, 3D Object Detection	Code Available	2
ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model	Jun 11, 2025	cross-modal alignment, Descriptive	Code Available	2
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning	Mar 28, 2025	Descriptive, Image Quality Assessment	Code Available	2
MedCalc-Bench: Evaluating Large Language Models for Medical Calculations	Jun 17, 2024	Descriptive, Medical Diagnosis	Code Available	2
K-LITE: Learning Transferable Visual Models with External Knowledge	Apr 20, 2022	Benchmarking, Descriptive	Code Available	2
GRiT: A Generative Region-to-text Transformer for Object Understanding	Dec 1, 2022	Decoder, Dense Captioning	Code Available	2
Language-driven Semantic Segmentation	Jan 10, 2022	Descriptive, Few-Shot Semantic Segmentation	Code Available	2
Fine-grained Image Captioning with CLIP Reward	May 26, 2022	Caption Generation, Descriptive	Code Available	2
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression	Dec 5, 2024	Descriptive, Visual Question Answering	Code Available	2
An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control	Mar 7, 2024	Descriptive	Code Available	2
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification	Jul 4, 2024	Descriptive, Diversity	Code Available	2
FlashSloth : Lightning Multimodal Large Language Models via Embedded Visual Compression	Jan 1, 2025	Descriptive	Code Available	2
Customization Assistant for Text-to-image Generation	Dec 5, 2023	Descriptive, Image Generation	Code Available	2
Composed Image Retrieval for Remote Sensing	May 24, 2024	Composed Image Retrieval (CoIR), Descriptive	Code Available	2

No leaderboard results yet.