SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7101-7150 of 661,570 papers

Title | Status | Hype
SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation | Code | 2
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model | Code | 2
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards | Code | 2
LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch | Code | 2
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding | Code | 2
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation | Code | 2
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding | Code | 2
UniDrive: Towards Universal Driving Perception Across Camera Configurations | Code | 2
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models | Code | 2
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation | Code | 2
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs | Code | 2
On the Role of Attention Heads in Large Language Model Safety | Code | 2
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction | Code | 2
Local Off-Grid Weather Forecasting with Multi-Modal Earth Observation Data | Code | 2
CATCH: Channel-Aware multivariate Time Series Anomaly Detection via Frequency Patching | Code | 2
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective | Code | 2
Explanation-Preserving Augmentation for Semi-Supervised Graph Representation Learning | Code | 2
JudgeBench: A Benchmark for Evaluating LLM-based Judges | Code | 2
LoD-Loc: Aerial Visual Localization using LoD 3D Map with Neural Wireframe Alignment | Code | 2
Evaluating Morphological Compositional Generalization in Large Language Models | Code | 2
A Prompt-Based Knowledge Graph Foundation Model for Universal In-Context Reasoning | Code | 2
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | Code | 2
GS^3: Efficient Relighting with Triple Gaussian Splatting | Code | 2
WeatherDG: LLM-assisted Diffusion Model for Procedural Weather Generation in Domain-Generalized Semantic Segmentation | Code | 2
Improving Long-Text Alignment for Text-to-Image Diffusion Models | Code | 2
VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI | Code | 2
nvTorchCam: An Open-source Library for Camera-Agnostic Differentiable Geometric Vision | Code | 2
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement | Code | 2
Process Reward Model with Q-Value Rankings | Code | 2
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation | Code | 2
Open World Object Detection: A Survey | Code | 2
It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design | Code | 2
Contrastive learning of cell state dynamics in response to perturbations | Code | 2
MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding | Code | 2
Multiview Scene Graph | Code | 2
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models | Code | 2
When Attention Sink Emerges in Language Models: An Empirical View | Code | 2
A Consistency-Aware Spot-Guided Transformer for Versatile and Hierarchical Point Cloud Registration | Code | 2
Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models | Code | 2
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes | Code | 2
GraphCLIP: Enhancing Transferability in Graph Foundation Models for Text-Attributed Graphs | Code | 2
A Scalable Communication Protocol for Networks of Large Language Models | Code | 2
Locality Alignment Improves Vision-Language Models | Code | 2
High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity | Code | 2
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts | Code | 2
Adaptive Probabilistic ODE Solvers Without Adaptive Memory Requirements | Code | 2
LVD-2M: A Long-take Video Dataset with Temporally Dense Captions | Code | 2
Free Video-LLM: Prompt-guided Visual Perception for Efficient Training-free Video LLMs | Code | 2
Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models | Code | 2
Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads | Code | 2
Page 143 of 13,232