SOTAVerified

Descriptive

Papers

Showing 101-150 of 1477 papers

Title | Status | Hype
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training | Code | 1
Mixture of Low-rank Experts for Transferable AI-Generated Image Detection | Code | 1
A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding | Code | 1
Deep Graph Matching under Quadratic Constraint | Code | 1
A Bi-directional Transformer for Musical Chord Recognition | Code | 1
Modeling the Complexity and Descriptive Adequacy of Construction Grammars | Code | 1
Deep Implicit Statistical Shape Models for 3D Medical Image Delineation | Code | 1
Deep learning based geometric registration for medical images: How accurate can we get without visual features? | Code | 1
A Foundation Language-Image Model of the Retina (FLAIR): Encoding Expert Knowledge in Text Supervision | Code | 1
MultiFace: A Generic Training Mechanism for Boosting Face Recognition Performance | Code | 1
JAMMIN-GPT: Text-based Improvisation using LLMs in Ableton Live | Code | 1
Hybrid Symbolic-Numeric Library for Power System Modeling and Analysis | Code | 1
Descriptive and Predictive Analysis of Euroleague Basketball Games and the Wisdom of Basketball Crowds | Code | 1
Describe Anything Model for Visual Question Answering on Text-rich Images | Code | 1
Human-like Controllable Image Captioning with Verb-specific Semantic Roles | Code | 1
Text-Guided Neural Image Inpainting | Code | 1
HYDRA: A multimodal deep learning framework for malware classification | Code | 1
CiteTracker: Correlating Image and Text for Visual Tracking | Code | 1
A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation | Code | 1
ANNdotNET -- deep learning tool on .NET Platform | Code | 1
DEER: Descriptive Knowledge Graph for Explaining Entity Relationships | Code | 1
Dual-Level Collaborative Transformer for Image Captioning | Code | 1
OpenCQA: Open-ended Question Answering with Charts | Code | 1
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition | Code | 1
Enhancing Monocular 3D Scene Completion with Diffusion Model | Code | 1
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Code | 1
Comprehensive Information Integration Modeling Framework for Video Titling | Code | 1
IDAS: Intent Discovery with Abstractive Summarization | Code | 1
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge | Code | 1
Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation | Code | 1
Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models | Code | 1
A Linear Time and Space Local Point Cloud Geometry Encoder via Vectorized Kernel Mixture (VecKM) | Code | 1
Causal Modeling of Twitter Activity During COVID-19 | Code | 1
GraphXAIN: Narratives to Explain Graph Neural Networks | Code | 1
HDCC: A Hyperdimensional Computing compiler for classification on embedded systems and high-performance computing | Code | 1
GOAL: Global-local Object Alignment Learning | Code | 1
Generating Parametric BRDFs from Natural Language Descriptions | Code | 1
Graph Backdoor | Code | 1
What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights | Code | 1
Natural scene reconstruction from fMRI signals using generative latent diffusion | Code | 1
Bias Loss for Mobile Neural Networks | Code | 1
Can Knowledge Graphs Simplify Text? | Code | 1
Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search | Code | 1
GraphLIME: Local Interpretable Model Explanations for Graph Neural Networks | Code | 1
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books | Code | 1
High-Fidelity 3D Face Generation from Natural Language Descriptions | Code | 1
Aligning LLM Agents by Learning Latent Preference from User Edits | Code | 1
A Sparse and Locally Coherent Morphable Face Model for Dense Semantic Correspondence Across Heterogeneous 3D Faces | Code | 1
GL-RG: Global-Local Representation Granularity for Video Captioning | Code | 1
FontCLIP: A Semantic Typography Visual-Language Model for Multilingual Font Applications | Code | 1

No leaderboard results yet.