SOTAVerified

Attribute

Papers

Showing 201-250 of 5387 papers

Title | Status | Hype
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning | Code | 1
Zero-shot Generalist Graph Anomaly Detection with Unified Neighborhood Prompts | Code | 1
Tree of Attributes Prompt Learning for Vision-Language Models | Code | 1
When Graph meets Multimodal: Benchmarking on Multimodal Attributed Graphs Learning | Code | 1
Entering Real Social World! Benchmarking the Social Intelligence of Large Language Models from a First-person Perspective | Code | 1
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency | Code | 1
MARs: Multi-view Attention Regularizations for Patch-based Feature Recognition of Space Terrain | Code | 1
Image Watermarks are Removable Using Controllable Regeneration from Clean Noise | Code | 1
Towards Fairness and Privacy: A Novel Data Pre-processing Optimization Framework for Non-binary Protected Attributes | Code | 1
CliMB: An AI-enabled Partner for Clinical Predictive Modeling | Code | 1
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function | Code | 1
Domain Consistency Representation Learning for Lifelong Person Re-Identification | Code | 1
ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Code | 1
Finetuning CLIP to Reason about Pairwise Differences | Code | 1
AnyMatch -- Efficient Zero-Shot Entity Matching with a Small Language Model | Code | 1
MARS: Matching Attribute-aware Representations for Text-based Sequential Recommendation | Code | 1
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios | Code | 1
Unlocking Attributes' Contribution to Successful Camouflage: A Combined Textual and Visual Analysis Strategy | Code | 1
Toward Enhancing Vehicle Color Recognition in Adverse Conditions: A Dataset and Benchmark | Code | 1
Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting | Code | 1
Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models | Code | 1
Layerwise Recurrent Router for Mixture-of-Experts | Code | 1
What Ails Generative Structure-based Drug Design: Expressivity is Too Little or Too Much? | Code | 1
Diffusion Guided Language Modeling | Code | 1
Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis | Code | 1
Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs | Code | 1
Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation | Code | 1
Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval | Code | 1
LADDER: Language Driven Slice Discovery and Error Rectification | Code | 1
Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection | Code | 1
Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing | Code | 1
MLLM-CompBench: A Comparative Reasoning Benchmark for Multimodal LLMs | Code | 1
TimeInf: Time Series Data Contribution via Influence Functions | Code | 1
A Benchmark for Gaussian Splatting Compression and Quality Assessment Study | Code | 1
Length-Aware Motion Synthesis via Latent Diffusion | Code | 1
Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen | Code | 1
CiteME: Can Language Models Accurately Cite Scientific Claims? | Code | 1
MARS: Paying more attention to visual attributes for text-based person search | Code | 1
Learning Action and Reasoning-Centric Image Editing from Videos and Simulations | Code | 1
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation | Code | 1
Towards Learning Abductive Reasoning using VSA Distributed Representations | Code | 1
TinyStyler: Efficient Few-Shot Text Style Transfer with Authorship Embeddings | Code | 1
Odyssey 2024 - Speech Emotion Recognition Challenge: Dataset, Baseline Framework, and Results | Code | 1
AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation | Code | 1
RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding | Code | 1
Composing Object Relations and Attributes for Image-Text Matching | Code | 1
When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives | Code | 1
Synthesizing Efficient Data with Diffusion Models for Person Re-Identification Pre-Training | Code | 1
CMamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting | Code | 1
Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning | Code | 1
Page 5 of 108

No leaderboard results yet.