SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 12201–12250 of 474278 papers

Title | Status | Hype
How secure is AI-generated Code: A Large-Scale Comparison of Large Language Models | Code | 2
InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions | Code | 2
Where a Strong Backbone Meets Strong Features -- ActionFormer for Ego4D Moment Queries Challenge | Code | 2
Reconstructing People, Places, and Cameras | Code | 2
HazyDet: Open-source Benchmark for Drone-view Object Detection with Depth-cues in Hazy Scenes | Code | 2
Deep Unrestricted Document Image Rectification | Code | 2
Compositional Flows for 3D Molecule and Synthesis Pathway Co-design | Code | 2
One Thousand and One Pairs: A "novel" challenge for long-context language models | Code | 2
Detecting CSV File Dialects by Table Uniformity Measurement and Data Type Inference | Code | 2
Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response | Code | 2
EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance | Code | 2
GenLoco: Generalized Locomotion Controllers for Quadrupedal Robots | Code | 2
Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks | Code | 2
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective | Code | 2
Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved with Text | Code | 2
ProbPose: A Probabilistic Approach to 2D Human Pose Estimation | Code | 2
CARD: Classification and Regression Diffusion Models | Code | 2
DeTPP: Leveraging Object Detection for Robust Long-Horizon Event Prediction | Code | 2
TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning | Code | 2
OrthoPlanes: A Novel Representation for Better 3D-Awareness of GANs | Code | 2
I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts | Code | 2
Box2Mask: Box-supervised Instance Segmentation via Level-set Evolution | Code | 2
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents | Code | 2
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control | Code | 2
OcclusionFusion: Occlusion-aware Motion Estimation for Real-time Dynamic 3D Reconstruction | Code | 2
Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset | Code | 2
Routoo: Learning to Route to Large Language Models Effectively | Code | 2
Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization | Code | 2
Objects as Points | Code | 2
Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval | Code | 2
CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities | Code | 2
Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation | Code | 2
Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models | Code | 2
Dataset Regeneration for Sequential Recommendation | Code | 2
CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models | Code | 2
M^2SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation | Code | 2
VQF: Highly Accurate IMU Orientation Estimation with Bias Estimation and Magnetic Disturbance Rejection | Code | 2
REAL-Colon: A dataset for developing real-world AI applications in colonoscopy | Code | 2
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization | Code | 2
Context-Aware Video Instance Segmentation | Code | 2
Benchmarking Graph Neural Networks | Code | 2
PyMAF-X: Towards Well-aligned Full-body Model Regression from Monocular Images | Code | 2
Path-RAG: Knowledge-Guided Key Region Retrieval for Open-ended Pathology Visual Question Answering | Code | 2
MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control | Code | 2
DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models | Code | 2
PerCo (SD): Open Perceptual Compression | Code | 2
Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture | Code | 2
MINERVA: Evaluating Complex Video Reasoning | Code | 2
RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning | Code | 2
Universal Guidance for Diffusion Models | Code | 2
Page 245 of 9486