SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1401–1425 of 659,983 papers

Title | Status | Hype
Atom of Thoughts for Markov LLM Test-Time Scaling | Code | 4
A-MEM: Agentic Memory for LLM Agents | Code | 4
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention | Code | 4
SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers | Code | 4
KernelBench: Can LLMs Write Efficient GPU Kernels? | Code | 4
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models | Code | 4
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion | Code | 4
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society | Code | 4
Enhance-A-Video: Better Generated Video for Free | Code | 4
Training Sparse Mixture Of Experts Text Embedding Models | Code | 4
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction | Code | 4
Steel-LLM:From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM | Code | 4
Accelerating Data Processing and Benchmarking of AI Models for Pathology | Code | 4
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Code | 4
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach | Code | 4
Latent Swap Joint Diffusion for 2D Long-Form Latent Generation | Code | 4
Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound | Code | 4
Self-Supervised Prompt Optimization | Code | 4
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis | Code | 4
Identify Critical KV Cache in LLM Inference from an Output Perturbation Perspective | Code | 4
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation | Code | 4
Sundial: A Family of Highly Capable Time Series Foundation Models | Code | 4
LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models | Code | 4
Transcoders Beat Sparse Autoencoders for Interpretability | Code | 4
Molecular-driven Foundation Model for Oncologic Pathology | Code | 4
Page 57 of 26,400