SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 11701–11750 of 474,278 papers

Title | Status | Hype
SPD Learning for Covariance-Based Neuroimaging Analysis: Perspectives, Methods, and Challenges | Code | 2
X-Ray: A Sequential 3D Representation For Generation | Code | 2
BridgeData V2: A Dataset for Robot Learning at Scale | Code | 2
Controlled Text Generation via Language Model Arithmetic | Code | 2
Diff-BGM: A Diffusion Model for Video Background Music Generation | Code | 2
Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent | Code | 2
DVMSR: Distillated Vision Mamba for Efficient Super-Resolution | Code | 2
Open-Set Domain Adaptation for Semantic Segmentation | Code | 2
EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection | Code | 2
Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review | Code | 2
GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction | Code | 2
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing | Code | 2
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Code | 2
PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking | Code | 2
ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning | Code | 2
Decoding speech perception from non-invasive brain recordings | Code | 2
Towards Explanation for Unsupervised Graph-Level Representation Learning | Code | 2
Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning | Code | 2
Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach | Code | 2
Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs | Code | 2
Rank1: Test-Time Compute for Reranking in Information Retrieval | Code | 2
Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes | Code | 2
FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling | Code | 2
SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks | Code | 2
InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds | Code | 2
SelfGNN: Self-Supervised Graph Neural Networks for Sequential Recommendation | Code | 2
Large Language Models to Enhance Bayesian Optimization | Code | 2
Fourier Priors-Guided Diffusion for Zero-Shot Joint Low-Light Enhancement and Deblurring | Code | 2
A Survey on Federated Fine-tuning of Large Language Models | Code | 2
Large-Scale Data Selection for Instruction Tuning | Code | 2
AIN: The Arabic INclusive Large Multimodal Model | Code | 2
MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action | Code | 2
Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition | Code | 2
Scaling Data-Constrained Language Models | Code | 2
Residual Denoising Diffusion Models | Code | 2
Empower Structure-Based Molecule Optimization with Gradient Guided Bayesian Flow Networks | Code | 2
One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models | Code | 2
SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System | Code | 2
Exploring What Why and How: A Multifaceted Benchmark for Causation Understanding of Video Anomaly | Code | 2
MMA: Multi-Modal Adapter for Vision-Language Models | Code | 2
Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring | Code | 2
Learn From Zoom: Decoupled Supervised Contrastive Learning For WCE Image Classification | Code | 2
Clifford-Steerable Convolutional Neural Networks | Code | 2
Optimizing Anytime Reasoning via Budget Relative Policy Optimization | Code | 2
UXsim: An open source macroscopic and mesoscopic traffic simulator in Python -- a technical overview | Code | 2
Elysium: Exploring Object-level Perception in Videos via MLLM | Code | 2
LGS: A Light-weight 4D Gaussian Splatting for Efficient Surgical Scene Reconstruction | Code | 2
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models | Code | 2
Think Twice Before You Act: Enhancing Agent Behavioral Safety with Thought Correction | Code | 2
Deep Learning and Foundation Models for Weather Prediction: A Survey | Code | 2
Page 235 of 9486