SOTAVerified

Attribute

Papers

Showing 5175 of 5387 papers

Title | Status | Hype
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning | Code | 2
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling | Code | 2
Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation | Code | 2
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs | Code | 2
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning | Code | 2
Objaverse++: Curated 3D Object Dataset with Quality Annotations | Code | 2
OpenFACADES: An Open Framework for Architectural Caption and Attribute Data Enrichment via Street View Imagery | Code | 2
Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation | Code | 2
Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models | Code | 2
Is CLIP ideal? No. Can we fix it? Yes! | Code | 2
Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving | Code | 2
Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models | Code | 2
EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents | Code | 2
MagicFace: High-Fidelity Facial Expression Editing with Action-Unit Control | Code | 2
DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution | Code | 2
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations | Code | 2
QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos | Code | 2
DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting | Code | 2
ResCLIP: Residual Attention for Training-free Dense Vision-language Inference | Code | 2
Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings | Code | 2
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis | Code | 2
On the Role of Attention Heads in Large Language Model Safety | Code | 2
TRESTLE: A Model of Concept Formation in Structured Domains | Code | 2
LLMEmb: Large Language Model Can Be a Good Embedding Generator for Sequential Recommendation | Code | 2
PerCo (SD): Open Perceptual Compression | Code | 2

No leaderboard results yet.