SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2501-2550 of 659,983 papers

Title | Status | Hype
Retrieval-augmented generation in multilingual settings | Code | 3
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective | Code | 3
A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights | Code | 3
LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models | Code | 3
An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases | Code | 3
Learning Dynamics of LLM Finetuning | Code | 3
Reinforcement Learning Meets Visual Odometry | Code | 3
Comgra: A Tool for Analyzing and Debugging Neural Networks | Code | 3
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models | Code | 3
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents | Code | 3
SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners | Code | 3
VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters | Code | 3
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching | Code | 3
SpatialBot: Precise Spatial Understanding with Vision Language Models | Code | 3
Colorful Diffuse Intrinsic Image Decomposition in the Wild | Code | 3
Generative Modeling of Molecular Dynamics Trajectories | Code | 3
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios | Code | 3
Multi-Level Speaker Representation for Target Speaker Extraction | Code | 3
PDL: A Declarative Prompt Programming Language | Code | 3
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders | Code | 3
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training | Code | 3
OSDFace: One-Step Diffusion Model for Face Restoration | Code | 3
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos | Code | 3
Time Travel is Cheating: Going Live with DeepFund for Real-Time Fund Investment Benchmarking | Code | 3
Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications | Code | 3
Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization | Code | 3
Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance | Code | 3
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up | Code | 3
UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility | Code | 3
LLMs can see and hear without any training | Code | 3
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs | Code | 3
PETR: Position Embedding Transformation for Multi-View 3D Object Detection | Code | 3
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models | Code | 3
Improved Denoising Diffusion Probabilistic Models | Code | 3
Pareto Front Approximation for Multi-Objective Session-Based Recommender Systems | Code | 3
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving | Code | 3
Stonefish: Supporting Machine Learning Research in Marine Robotics | Code | 3
Soundwave: Less is More for Speech-Text Alignment in LLMs | Code | 3
Slamming: Training a Speech Language Model on One GPU in a Day | Code | 3
AlphaAgent: LLM-Driven Alpha Mining with Regularized Exploration to Counteract Alpha Decay | Code | 3
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs | Code | 3
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction | Code | 3
CrossOver: 3D Scene Cross-Modal Alignment | Code | 3
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble | Code | 3
BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction | Code | 3
GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving | Code | 3
Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering | Code | 3
Falcon: A Remote Sensing Vision-Language Foundation Model | Code | 3
A Survey on Latent Reasoning | Code | 3
Vision-Speech Models: Teaching Speech Models to Converse about Images | Code | 3
Page 51 of 13,200