SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7151-7200 of 661570 papers

Title | Status | Hype
Adaptive Probabilistic ODE Solvers Without Adaptive Memory Requirements | Code | 2
Free Video-LLM: Prompt-guided Visual Perception for Efficient Training-free Video LLMs | Code | 2
TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control | Code | 2
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes | Code | 2
Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models | Code | 2
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free | Code | 2
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts | Code | 2
A Scalable Communication Protocol for Networks of Large Language Models | Code | 2
Learning to Optimize for Mixed-Integer Non-linear Programming with Feasibility Guarantees | Code | 2
Queryable Prototype Multiple Instance Learning with Vision-Language Models for Incremental Whole Slide Image Classification | Code | 2
TRESTLE: A Model of Concept Formation in Structured Domains | Code | 2
Text4Seg: Reimagining Image Segmentation as Text Generation | Code | 2
Large Scale Longitudinal Experiments: Estimation and Inference | Code | 2
Bayesian Enhancement Models for One-to-Many Mapping in Image Enhancement | Code | 2
LLM-Based Multi-Agent Systems are Scalable Graph Generative Models | Code | 2
Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy | Code | 2
Learning Pattern-Specific Experts for Time Series Forecasting Under Patch-level Distribution Shift | Code | 2
LibEER: A Comprehensive Benchmark and Algorithm Library for EEG-based Emotion Recognition | Code | 2
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning | Code | 2
Reconstructive Visual Instruction Tuning | Code | 2
ESVO2: Direct Visual-Inertial Odometry with Stereo Event Cameras | Code | 2
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation | Code | 2
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System | Code | 2
Look Gauss, No Pose: Novel View Synthesis using Gaussian Splatting without Accurate Pose Initialization | Code | 2
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization | Code | 2
pyhgf: A neural network library for predictive coding | Code | 2
On the State of NLP Approaches to Modeling Depression in Social Media: A Post-COVID-19 Outlook | Code | 2
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Code | 2
JAILJUDGE: A Comprehensive Jailbreak Judge Benchmark with Multi-Agent Enhanced Explanation Evaluation Framework | Code | 2
Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation | Code | 2
radarODE-MTL: A Multi-Task Learning Framework with Eccentric Gradient Alignment for Robust Radar-Based ECG Reconstruction | Code | 2
DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory | Code | 2
Deconstructing equivariant representations in molecular systems | Code | 2
IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera | Code | 2
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions | Code | 2
Poison-splat: Computation Cost Attack on 3D Gaussian Splatting | Code | 2
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models | Code | 2
MotionGS: Exploring Explicit Motion Guidance for Deformable 3D Gaussian Splatting | Code | 2
Heating Up Quasi-Monte Carlo Graph Random Features: A Diffusion Kernel Perspective | Code | 2
PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection | Code | 2
Progressive Autoregressive Video Diffusion Models | Code | 2
TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text | Code | 2
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code | Code | 2
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling | Code | 2
Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | Code | 2
Reversible Decoupling Network for Single Image Reflection Removal | Code | 2
Doob's Lagrangian: A Sample-Efficient Variational Approach to Transition Path Sampling | Code | 2
Benchmarking Agentic Workflow Generation | Code | 2
COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act | Code | 2
VoxelPrompt: A Vision-Language Agent for Grounded Medical Image Analysis | Code | 2
Page 144 of 13232