SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 16051-16100 of 474,278 papers

Title | Status | Hype
Must Read: A Systematic Survey of Computational Persuasion | Code | 1
Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model | Code | 1
ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration | Code | 1
Asynchronous Multi-Object Tracking with an Event Camera | Code | 1
Overflow Prevention Enhances Long-Context Recurrent LLMs | Code | 1
Finite-Sample-Based Reachability for Safe Control with Gaussian Process Dynamics | Code | 1
Guiding Data Collection via Factored Scaling Curves | Code | 1
Towards Actionable Pedagogical Feedback: A Multi-Perspective Analysis of Mathematics Teaching and Tutoring Dialogue | Code | 1
Measuring General Intelligence with Generated Games | Code | 1
Kalman Filter Enhanced GRPO for Reinforcement Learning-Based Language Model Reasoning | Code | 1
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models | Code | 1
FLUXSynID: A Framework for Identity-Controlled Synthetic Face Generation with Document and Live Images | Code | 1
Symbolic Regression with Multimodal Large Language Models and Kolmogorov Arnold Networks | Code | 1
Chronocept: Instilling a Sense of Time in Machines | Code | 1
Codifying Character Logic in Role-Playing | Code | 1
Neural Brain: A Neuroscience-inspired Framework for Embodied Agents | Code | 1
DocVXQA: Context-Aware Visual Explanations for Document Question Answering | Code | 1
Can LLM-based Financial Investing Strategies Outperform the Market in Long Run? | Code | 1
Non-Stationary Time Series Forecasting Based on Fourier Analysis and Cross Attention Mechanism | Code | 1
Unsupervised Learning for Class Distribution Mismatch | Code | 1
Semantic-Guided Diffusion Model for Single-Step Image Super-Resolution | Code | 1
MELLM: Exploring LLM-Powered Micro-Expression Understanding Enhanced by Subtle Motion Perception | Code | 1
Learning Soft Sparse Shapes for Efficient Time-Series Classification | Code | 1
BioProBench: Comprehensive Dataset and Benchmark in Biological Protocol Understanding and Reasoning | Code | 1
Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models | Code | 1
Multimodal Fake News Detection: MFND Dataset and Shallow-Deep Multitask Learning | Code | 1
JaxRobotarium: Training and Deploying Multi-Robot Policies in 10 Minutes | Code | 1
Quadrupedal Robot Skateboard Mounting via Reverse Curriculum Learning | Code | 1
Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification | Code | 1
FNBench: Benchmarking Robust Federated Learning against Noisy Labels | Code | 1
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models | Code | 1
Edge-Enabled VIO with Long-Tracked Features for High-Accuracy Low-Altitude IoT Navigation | Code | 1
M3CAD: Towards Generic Cooperative Autonomous Driving Benchmark | Code | 1
Causal Prompt Calibration Guided Segment Anything Model for Open-Vocabulary Multi-Entity Segmentation | Code | 1
MacRAG: Compress, Slice, and Scale-up for Multi-Scale Adaptive Context RAG | Code | 1
SmartPilot: A Multiagent CoPilot for Adaptive and Intelligent Manufacturing | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design | Code | 1
FastDup: a scalable duplicate marking tool using speculation-and-test mechanism | Code | 1
RefRef: A Synthetic Dataset and Benchmark for Reconstructing Refractive and Reflective Objects | Code | 1
PYRREGULAR: A Unified Framework for Irregular Time Series, with Classification Benchmarks | Code | 1
A Survey on Bridging VLMs and Synthetic Data | Code | 1
Fast Differentiable Modal Simulation of Non-linear Strings, Membranes, and Plates | Code | 1
Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition | Code | 1
Physics-informed Temporal Difference Metric Learning for Robot Motion Planning | Code | 1
Cost-Effective, Low Latency Vector Search with Azure Cosmos DB | Code | 1
DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer | Code | 1
MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks | Code | 1
LAPSO: A Unified Optimization View for Learning-Augmented Power System Operations | Code | 1
Building-Guided Pseudo-Label Learning for Cross-Modal Building Damage Mapping | Code | 1