SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers; 248,104 code links; 4,818 tasks

Papers

Showing 1251–1275 of 659,983 papers

Title | Status | Hype
On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models | | 4
Closing the Loop: Universal Repository Representation with RPG-Encoder | | 4
MOVA: Towards Scalable and Synchronized Video-Audio Generation | | 4
Cautious Weight Decay | | 4
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society | Code | 4
Expressive Whole-Body 3D Gaussian Avatar | Code | 4
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition | Code | 4
SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Code | 4
RewardBench 2: Advancing Reward Model Evaluation | Code | 4
VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Code | 4
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation | Code | 4
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Code | 4
SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models | Code | 4
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL | Code | 4
UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation | Code | 4
Unified Reward Model for Multimodal Understanding and Generation | Code | 4
TorchRL: A data-driven decision-making library for PyTorch | Code | 4
What Makes Good In-Context Examples for GPT-3? | Code | 4
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models | Code | 4
AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones | Code | 4
TOFU: A Task of Fictitious Unlearning for LLMs | Code | 4
Sundial: A Family of Highly Capable Time Series Foundation Models | Code | 4
FP8 Formats for Deep Learning | Code | 4
Gaussian Splatting SLAM | Code | 4
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Code | 4
Page 51 of 26400