SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 10551-10575 of 177340 papers

Title | Status | Hype
Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation | Code | 2
CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Code | 2
Uncertainty Modelling and Robust Observer Synthesis using the Koopman Operator | Code | 2
VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation | Code | 2
Multimodal RewardBench: Holistic Evaluation of Reward Models for Vision Language Models | Code | 2
Autoregressive Action Sequence Learning for Robotic Manipulation | Code | 2
ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality | Code | 2
GNSS/GPS Spoofing and Jamming Identification Using Machine Learning and Deep Learning | Code | 2
Towards Vision-Language Geo-Foundation Model: A Survey | Code | 2
PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model | Code | 2
Deep Learning for Cross-Domain Data Fusion in Urban Computing: Taxonomy, Advances, and Outlook | Code | 2
DRO: A Python Library for Distributionally Robust Optimization in Machine Learning | Code | 2
Model-Preserving Adaptive Rounding | Code | 2
Learning Trajectory-Aware Transformer for Video Super-Resolution | Code | 2
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Code | 2
MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers | Code | 2
Reasoning to Attend: Try to Understand How <SEG> Token Works | Code | 2
PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision | Code | 2
Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs | Code | 2
TableBank: A Benchmark Dataset for Table Detection and Recognition | Code | 2
No Language Left Behind: Scaling Human-Centered Machine Translation | Code | 2
EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora | Code | 2
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development | Code | 2
AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing | Code | 2
FlipAttack: Jailbreak LLMs via Flipping | Code | 2
Page 423 of 7094