SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7476-7500 of 177340 papers

Title | Status | Hype
DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation | Code | 2
AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention | Code | 2
Equivariant 3D-Conditional Diffusion Models for Molecular Linker Design | Code | 2
NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging | Code | 2
InFoBench: Evaluating Instruction Following Ability in Large Language Models | Code | 2
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities | Code | 2
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion | Code | 2
Improving Text-guided Object Inpainting with Semantic Pre-inpainting | Code | 2
INQUIRE: A Natural World Text-to-Image Retrieval Benchmark | Code | 2
Adaptive Personalized Federated Learning | Code | 2
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code | Code | 2
A real-time dynamic obstacle tracking and mapping system for UAV navigation and collision avoidance with an RGB-D camera | Code | 2
Predictive Dynamic Fusion | Code | 2
MetaFormer: A Unified Meta Framework for Fine-Grained Recognition | Code | 2
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework | Code | 2
Deep Patch Visual Odometry | Code | 2
PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling | Code | 2
MemEngine: A Unified and Modular Library for Developing Advanced Memory of LLM-based Agents | Code | 2
LingoQA: Visual Question Answering for Autonomous Driving | Code | 2
MemLong: Memory-Augmented Retrieval for Long Text Modeling | Code | 2
A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well? | Code | 2
AdaFisher: Adaptive Second Order Optimization via Fisher Information | Code | 2
Graph4Rec: A Universal Toolkit with Graph Neural Networks for Recommender Systems | Code | 2
Efficient Online Reinforcement Learning with Offline Data | Code | 2
Language Models are Multilingual Chain-of-Thought Reasoners | Code | 2
Page 300 of 7094