SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 8026–8050 of 474278 papers

Title | Status | Hype
MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding | Code | 2
SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention | Code | 2
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents | Code | 2
RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation | Code | 2
PartCraft: Crafting Creative Objects by Parts | Code | 2
Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection | Code | 2
Discovering symbolic expressions with parallelized tree search | Code | 2
SH17: A Dataset for Human Safety and Personal Protective Equipment Detection in Manufacturing Industry | Code | 2
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation | Code | 2
Associative Recurrent Memory Transformer | Code | 2
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models | Code | 2
Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units | Code | 2
AnySR: Realizing Image Super-Resolution as Any-Scale, Any-Resource | Code | 2
RPN: Reconciled Polynomial Network Towards Unifying PGMs, Kernel SVMs, MLP and KAN | Code | 2
Isomorphic Pruning for Vision Models | Code | 2
Benchmarking Complex Instruction-Following with Multiple Constraints Composition | Code | 2
Mixture of A Million Experts | Code | 2
Occupancy as Set of Points | Code | 2
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification | Code | 2
Unraveling Molecular Structure: A Multimodal Spectroscopic Dataset for Chemistry | Code | 2
TongGu: Mastering Classical Chinese Understanding with Knowledge-Grounded Large Language Models | Code | 2
VoxAct-B: Voxel-Based Acting and Stabilizing Policy for Bimanual Manipulation | Code | 2
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild | Code | 2
MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis | Code | 2
Craftium: An Extensible Framework for Creating Reinforcement Learning Environments | Code | 2
Page 322 of 18972