SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 8201-8225 of 177,340 papers

Title | Status | Hype
Bias and Unfairness in Information Retrieval Systems: New Challenges in the LLM Era | Code | 2
Generating Benchmarks for Factuality Evaluation of Language Models | Code | 2
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? | Code | 2
LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning | Code | 2
WavJourney: Compositional Audio Creation with Large Language Models | Code | 2
OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction | Code | 2
Temporal Action Detection with Structured Segment Networks | Code | 2
Flash normalization: fast RMSNorm for LLMs | Code | 2
Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures | Code | 2
PivotNet: Vectorized Pivot Learning for End-to-end HD Map Construction | Code | 2
An Open-Source American Sign Language Fingerspell Recognition and Semantic Pose Retrieval Interface | Code | 2
Neptune: The Long Orbit to Benchmarking Long Video Understanding | Code | 2
Unified Vision-Language Pre-Training for Image Captioning and VQA | Code | 2
Simulation to Scaled City: Zero-Shot Policy Transfer for Traffic Control via Autonomous Vehicles | Code | 2
SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning | Code | 2
MatMamba: A Matryoshka State Space Model | Code | 2
High-dimensional Convolutional Networks for Geometric Pattern Recognition | Code | 2
Boosting Neural Representations for Videos with a Conditional Decoder | Code | 2
A Pilot Study for Chinese SQL Semantic Parsing | Code | 2
Differentiable Convex Optimization Layers | Code | 2
Thought Cloning: Learning to Think while Acting by Imitating Human Thinking | Code | 2
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models | Code | 2
Transferability of Adversarial Examples to Attack Cloud-based Image Classifier Service | Code | 2
A Little Fog for a Large Turn | Code | 2
Torch-Struct: Deep Structured Prediction Library | Code | 2