SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 651-700 of 659,983 papers

Title | Status | Hype
EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery | | 5
OpenTSLM: Time-Series Language Models for Reasoning over Multivariate Medical Text- and Time-Series Data | | 5
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE | | 5
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length | | 5
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery | | 5
FireRed-Image-Edit-1.0 Technical Report | | 5
SAMTok: Representing Any Mask with Two Words | | 5
CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning | | 5
World Action Models are Zero-shot Policies | | 5
Helios: Real Real-Time Long Video Generation Model | | 5
Rethinking the Design of Reinforcement Learning-Based Deep Research Agents | | 5
Kimi K2.5: Visual Agentic Intelligence | | 5
Training Large Language Models to Reason in a Continuous Latent Space | Code | 5
YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception | Code | 5
YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications | Code | 5
FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification | Code | 5
OminiControl2: Efficient Conditioning for Diffusion Transformers | Code | 5
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B | Code | 5
Semantic Operators: A Declarative Model for Rich, AI-based Data Processing | Code | 5
OMG-Seg: Is One Model Good Enough For All Segmentation? | Code | 5
Ferret: Refer and Ground Anything Anywhere at Any Granularity | Code | 5
TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting | Code | 5
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI | Code | 5
SoftHGNN: Soft Hypergraph Neural Networks for General Visual Recognition | Code | 5
Masked Completion via Structured Diffusion with White-Box Transformers | Code | 5
Inpaint Anything: Segment Anything Meets Image Inpainting | Code | 5
Extreme Compression of Large Language Models via Additive Quantization | Code | 5
Structural Generalization in Autonomous Cyber Incident Response with Message-Passing Neural Networks and Reinforcement Learning | Code | 5
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning | Code | 5
CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling | Code | 5
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities | Code | 5
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation | Code | 5
MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model | Code | 5
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search | Code | 5
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models | Code | 5
Arbitrary-steps Image Super-resolution via Diffusion Inversion | Code | 5
SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks | Code | 5
SymbolicAI: A framework for logic-based approaches combining generative models and solvers | Code | 5
That Chip Has Sailed: A Critique of Unfounded Skepticism Around AI for Chip Design | Code | 5
GAPartManip: A Large-scale Part-centric Dataset for Material-Agnostic Articulated Object Manipulation | Code | 5
Very Low Complexity Speech Synthesis Using Framewise Autoregressive GAN (FARGAN) with Pitch Prediction | Code | 5
A quantum semantic framework for natural language processing | Code | 5
Single-seed generation of Brownian paths and integrals for adaptive and high order SDE solvers | Code | 5
The Path To Autonomous Cyber Defense | Code | 5
CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians | Code | 5
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions | Code | 5
Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond | Code | 5
Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values | Code | 5
Magic Clothing: Controllable Garment-Driven Image Synthesis | Code | 5
MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation | Code | 5
Page 14 of 13,200