SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4751-4800 of 661,570 papers

Title | Status | Hype
Is Value Learning Really the Main Bottleneck in Offline RL? | Code | 3
DANA: Domain-Aware Neurosymbolic Agents for Consistency and Accuracy | Code | 3
Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields | Code | 3
MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM | Code | 3
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 | Code | 3
DPLM-2: A Multimodal Diffusion Protein Language Model | Code | 3
Automated Formulaic Alpha Generation for Quantitative Investing using Evolutionary Algorithms | Code | 3
The False Promise of Imitating Proprietary LLMs | Code | 3
Visual Geometry Grounded Deep Structure From Motion | Code | 3
A Foundation Model for the Earth System | Code | 3
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | Code | 3
Human-level play in the game of Diplomacy by combining language models with strategic reasoning | Code | 3
Improving Text Embeddings with Large Language Models | Code | 3
Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes | Code | 3
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models | Code | 3
RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control | Code | 3
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models | Code | 3
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation | Code | 3
Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders | Code | 3
DataDecide: How to Predict Best Pretraining Data with Small Experiments | Code | 3
The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry | Code | 3
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection | Code | 3
Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs | Code | 3
DRCT: Saving Image Super-resolution away from Information Bottleneck | Code | 3
TopoX: A Suite of Python Packages for Machine Learning on Topological Domains | Code | 3
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia | Code | 3
Emu3: Next-Token Prediction is All You Need | Code | 3
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving | Code | 3
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation | | 2
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data | | 2
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images | | 2
Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models | | 2
Phi-4-reasoning-vision-15B Technical Report | | 2
Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator | | 2
NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation | | 2
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding | | 2
From Word to World: Can Large Language Models be Implicit Text-based World Models? | | 2
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling | | 2
Physical Simulator In-the-Loop Video Generation | | 2
ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation | | 2
Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight | | 2
Endless Terminals: Scaling RL Environments for Terminal Agents | | 2
Experiential Reinforcement Learning | | 2
PyVision-RL: Forging Open Agentic Vision Models via RL | | 2
TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics | | 2
AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents | | 2
Should We Still Pretrain Encoders with Masked Language Modeling? | | 2
Streaming Autoregressive Video Generation via Diagonal Distillation | | 2
Accelerating Streaming Video Large Language Models via Hierarchical Token Compression | | 2
LLM2Vec-Gen: Generative Embeddings from Large Language Models | | 2