SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7376-7400 of 177340 papers

Title | Status | Hype
Medical Hallucinations in Foundation Models and Their Impact on Healthcare | Code | 2
Towards Robust Multi-tab Website Fingerprinting | Code | 2
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning | Code | 2
Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification | Code | 2
Accurate RNA 3D structure prediction using a language model-based deep learning approach | Code | 2
Guidance with Spherical Gaussian Constraint for Conditional Diffusion | Code | 2
TACO: Topics in Algorithmic COde generation dataset | Code | 2
Data-Driven Parametrization of Molecular Mechanics Force Fields for Expansive Chemical Space Coverage | Code | 2
Steerable Scene Generation with Post Training and Inference-Time Search | Code | 2
VFIMamba: Video Frame Interpolation with State Space Models | Code | 2
CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance | Code | 2
PyGRF: An improved Python Geographical Random Forest model and case studies in public health and natural disasters | Code | 2
geomstats: a Python Package for Riemannian Geometry in Machine Learning | Code | 2
XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies | Code | 2
PAM: Prompting Audio-Language Models for Audio Quality Assessment | Code | 2
Geometry-Complete Diffusion for 3D Molecule Generation and Optimization | Code | 2
GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation | Code | 2
Libra: Building Decoupled Vision System on Large Language Models | Code | 2
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering | Code | 2
EFFOcc: A Minimal Baseline for EFficient Fusion-based 3D Occupancy Network | Code | 2
Towards Open Vocabulary Learning: A Survey | Code | 2
SuperFusion: Multilevel LiDAR-Camera Fusion for Long-Range HD Map Generation | Code | 2
BBTv2: Towards a Gradient-Free Future with Large Language Models | Code | 2
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models | Code | 2
QFFT, Question-Free Fine-Tuning for Adaptive Reasoning | Code | 2