SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7501-7525 of 177,340 papers

Title | Status | Hype
UniVST: A Unified Framework for Training-free Localized Video Style Transfer | Code | 2
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis | Code | 2
Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks | Code | 2
SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models | Code | 2
Offline Reinforcement Learning for LLM Multi-Step Reasoning | Code | 2
Image Restoration with Mean-Reverting Stochastic Differential Equations | Code | 2
Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach | Code | 2
MixFormer: End-to-End Tracking with Iterative Mixed Attention | Code | 2
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval | Code | 2
AdaMixer: A Fast-Converging Query-Based Object Detector | Code | 2
Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation | Code | 2
CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement | Code | 2
Mechanistic understanding and validation of large AI models with SemanticLens | Code | 2
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models | Code | 2
SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation | Code | 2
ThingTalk: An Extensible, Executable Representation Language for Task-Oriented Dialogues | Code | 2
N-BVH: Neural ray queries with bounding volume hierarchies | Code | 2
Understanding The Robustness in Vision Transformers | Code | 2
TaleCrafter: Interactive Story Visualization with Multiple Characters | Code | 2
Extreme Video Compression with Pre-trained Diffusion Models | Code | 2
Tiny Object Tracking: A Large-scale Dataset and A Baseline | Code | 2
Progressive-Hint Prompting Improves Reasoning in Large Language Models | Code | 2
Scaling the leading accuracy of deep equivariant models to biomolecular simulations of realistic size | Code | 2
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs | Code | 2
Hungry Hungry Hippos: Towards Language Modeling with State Space Models | Code | 2