SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 10476-10500 of 177340 papers

Title | Status | Hype
Dynamic Spatial Propagation Network for Depth Completion | Code | 2
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling | Code | 2
Perception Test: A Diagnostic Benchmark for Multimodal Video Models | Code | 2
RITA: a Study on Scaling Up Generative Protein Sequence Models | Code | 2
Multi-target stain normalization for histology slides | Code | 2
MedS^3: Towards Medical Small Language Models with Self-Evolved Slow Thinking | Code | 2
Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline | Code | 2
ChaCha for Online AutoML | Code | 2
Graph-based Topology Reasoning for Driving Scenes | Code | 2
SegFix: Model-Agnostic Boundary Refinement for Segmentation | Code | 2
Geo-Neus: Geometry-Consistent Neural Implicit Surfaces Learning for Multi-view Reconstruction | Code | 2
TrafficGPT: An LLM Approach for Open-Set Encrypted Traffic Classification | Code | 2
Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models | Code | 2
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models | Code | 2
Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data | Code | 2
Probability density estimation for sets of large graphs with respect to spectral information using stochastic block models | Code | 2
MTAD: Tools and Benchmarks for Multivariate Time Series Anomaly Detection | Code | 2
One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts | Code | 2
OctGPT: Octree-based Multiscale Autoregressive Models for 3D Shape Generation | Code | 2
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models | Code | 2
LongVLM: Efficient Long Video Understanding via Large Language Models | Code | 2
Geometry-Informed Neural Networks | Code | 2
MOROCCO: Model Resource Comparison Framework | Code | 2
DeepCache: Accelerating Diffusion Models for Free | Code | 2
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph | Code | 2