SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 6401-6450 of 177340 papers

Title | Status | Hype
Efficient Online Reinforcement Learning with Offline Data | Code | 2
Language Models are Multilingual Chain-of-Thought Reasoners | Code | 2
UniVST: A Unified Framework for Training-free Localized Video Style Transfer | Code | 2
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis | Code | 2
Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks | Code | 2
SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models | Code | 2
Offline Reinforcement Learning for LLM Multi-Step Reasoning | Code | 2
Image Restoration with Mean-Reverting Stochastic Differential Equations | Code | 2
Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach | Code | 2
MixFormer: End-to-End Tracking with Iterative Mixed Attention | Code | 2
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval | Code | 2
AdaMixer: A Fast-Converging Query-Based Object Detector | Code | 2
Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation | Code | 2
CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement | Code | 2
Mechanistic understanding and validation of large AI models with SemanticLens | Code | 2
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models | Code | 2
SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation | Code | 2
ThingTalk: An Extensible, Executable Representation Language for Task-Oriented Dialogues | Code | 2
N-BVH: Neural ray queries with bounding volume hierarchies | Code | 2
Understanding The Robustness in Vision Transformers | Code | 2
TaleCrafter: Interactive Story Visualization with Multiple Characters | Code | 2
Extreme Video Compression with Pre-trained Diffusion Models | Code | 2
Tiny Object Tracking: A Large-scale Dataset and A Baseline | Code | 2
Progressive-Hint Prompting Improves Reasoning in Large Language Models | Code | 2
Scaling the leading accuracy of deep equivariant models to biomolecular simulations of realistic size | Code | 2
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs | Code | 2
Hungry Hungry Hippos: Towards Language Modeling with State Space Models | Code | 2
Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction | Code | 2
Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints | Code | 2
Learning Deep Time-index Models for Time Series Forecasting | Code | 2
VEXIR2Vec: An Architecture-Neutral Embedding Framework for Binary Similarity | Code | 2
A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint | Code | 2
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation | Code | 2
Queryable Prototype Multiple Instance Learning with Vision-Language Models for Incremental Whole Slide Image Classification | Code | 2
HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining | Code | 2
Teola: Towards End-to-End Optimization of LLM-based Applications | Code | 2
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages | Code | 2
Orientation-Independent Chinese Text Recognition in Scene Images | Code | 2
Human Motion Diffusion as a Generative Prior | Code | 2
Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks | Code | 2
ResT V2: Simpler, Faster and Stronger | Code | 2
Beyond Generalization: A Survey of Out-Of-Distribution Adaptation on Graphs | Code | 2
TCTrack: Temporal Contexts for Aerial Tracking | Code | 2
HAHA: Highly Articulated Gaussian Human Avatars with Textured Mesh Prior | Code | 2
Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing | Code | 2
Robust Dynamic Facial Expression Recognition | Code | 2
PUGS: Zero-shot Physical Understanding with Gaussian Splatting | Code | 2
MAGE: A Multi-Agent Engine for Automated RTL Code Generation | Code | 2
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image | Code | 2
OneLLM: One Framework to Align All Modalities with Language | Code | 2