SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1401-1450 of 659,983 papers

Title | Status | Hype
Atom of Thoughts for Markov LLM Test-Time Scaling | Code | 4
A-MEM: Agentic Memory for LLM Agents | Code | 4
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention | Code | 4
SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers | Code | 4
KernelBench: Can LLMs Write Efficient GPU Kernels? | Code | 4
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models | Code | 4
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion | Code | 4
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society | Code | 4
Enhance-A-Video: Better Generated Video for Free | Code | 4
Training Sparse Mixture Of Experts Text Embedding Models | Code | 4
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction | Code | 4
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Code | 4
Accelerating Data Processing and Benchmarking of AI Models for Pathology | Code | 4
Steel-LLM:From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM | Code | 4
Self-Supervised Prompt Optimization | Code | 4
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach | Code | 4
Latent Swap Joint Diffusion for 2D Long-Form Latent Generation | Code | 4
Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound | Code | 4
Identify Critical KV Cache in LLM Inference from an Output Perturbation Perspective | Code | 4
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis | Code | 4
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation | Code | 4
Sundial: A Family of Highly Capable Time Series Foundation Models | Code | 4
Transcoders Beat Sparse Autoencoders for Interpretability | Code | 4
LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models | Code | 4
Molecular-driven Foundation Model for Oncologic Pathology | Code | 4
A foundation model for human-AI collaboration in medical literature mining | Code | 4
Diffusion-Based Planning for Autonomous Driving with Flexible Guidance | Code | 4
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step | Code | 4
TabularARGN: A Flexible and Efficient Auto-Regressive Framework for Generating High-Fidelity Synthetic Data | Code | 4
Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models | Code | 4
A New Formulation of Lipschitz Constrained With Functional Gradient Learning for GANs | Code | 4
Generating Structured Outputs from Language Models: Benchmark and Studies | Code | 4
DiffuEraser: A Diffusion Model for Video Inpainting | Code | 4
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Code | 4
MonSter: Marry Monodepth to Stereo Unleashes Power | Code | 4
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models | Code | 4
ReARTeR: Retrieval-Augmented Reasoning with Trustworthy Process Rewarding | Code | 4
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding | Code | 4
3DGS-to-PC: Convert a 3D Gaussian Splatting Scene into a Dense Point Cloud or Mesh | Code | 4
EdgeTAM: On-Device Track Anything Model | Code | 4
Motion-X++: A Large-Scale Multimodal 3D Whole-body Human Motion Dataset | Code | 4
The GAN is dead; long live the GAN! A Modern GAN Baseline | Code | 4
RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark | Code | 4
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control | Code | 4
Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Code | 4
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token | Code | 4
TransPixeler: Advancing Text-to-Video Generation with Transparency | Code | 4
A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges | Code | 4
GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models | Code | 4
SVFR: A Unified Framework for Generalized Video Face Restoration | Code | 4
Page 29 of 13,200