SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5501-5550 of 661570 papers

Title | Status | Hype
ForesightNav: Learning Scene Imagination for Efficient Exploration | Code | 2
Text-based Animatable 3D Avatars with Morphable Model Alignment | Code | 2
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks | Code | 2
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents | Code | 2
DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding | Code | 2
MARFT: Multi-Agent Reinforcement Fine-Tuning | Code | 2
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs | Code | 2
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning | Code | 2
FlowReasoner: Reinforcing Query-Level Meta-Agents | Code | 2
Learning Adaptive Parallel Reasoning with Language Models | Code | 2
Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey | Code | 2
Vision6D: 3D-to-2D Interactive Visualization and Annotation Tool for 6D Pose Estimation | Code | 2
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction | Code | 2
Seurat: From Moving Points to Depth | Code | 2
Generative Auto-Bidding with Value-Guided Explorations | Code | 2
NTIRE 2025 Challenge on Image Super-Resolution (×4): Methods and Results | Code | 2
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning | Code | 2
SG-Reg: Generalizable and Efficient Scene Graph Registration | Code | 2
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation | Code | 2
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners | Code | 2
CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey | Code | 2
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale | Code | 2
LangCoop: Collaborative Driving with Language | Code | 2
EyecareGPT: Boosting Comprehensive Ophthalmology Understanding with Tailored Dataset, Benchmark and Model | Code | 2
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation | Code | 2
Taccel: Scaling Up Vision-based Tactile Robotics via High-performance GPU Simulation | Code | 2
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks | Code | 2
Digital Twin Generation from Visual Data: A Survey | Code | 2
An All-Atom Generative Model for Designing Protein Complexes | Code | 2
Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning | Code | 2
TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials | Code | 2
NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results | Code | 2
Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling | Code | 2
Real-time High-fidelity Gaussian Human Avatars with Position-based Interpolation of Spatially Distributed MLPs | Code | 2
Enhancing Person-to-Person Virtual Try-On with Multi-Garment Virtual Try-Off | Code | 2
Sleep-time Compute: Beyond Inference Scaling at Test-time | Code | 2
Representation Learning for Tabular Data: A Comprehensive Survey | Code | 2
MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices | Code | 2
Logits DeConfusion with CLIP for Few-Shot Learning | Code | 2
Autoregressive Distillation of Diffusion Transformers | Code | 2
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models | Code | 2
TransST: Transfer Learning Embedded Spatial Factor Modeling of Spatial Transcriptomics Data | Code | 2
Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution | Code | 2
Multi-scale convolutional transformer network for motor imagery brain-computer interface | Code | 2
HypoBench: Towards Systematic and Principled Benchmarking for Hypothesis Generation | Code | 2
3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians | Code | 2
An Efficient and Mixed Heterogeneous Model for Image Restoration | Code | 2
MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning | Code | 2
A Survey of Personalization: From RAG to Agent | Code | 2
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding | Code | 2
Page 111 of 13232