SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7426–7450 of 474278 papers

Title | Status | Hype
VADB: A Large-Scale Video Aesthetic Database with Professional and Multi-Dimensional Annotations | Code | 0
LPLC: A Dataset for License Plate Legibility Classification | Code | 0
Rethinking Text-to-SQL: Dynamic Multi-turn SQL Interaction for Real-world Database Exploration | Code | 0
Torch-Uncertainty: A Deep Learning Framework for Uncertainty Quantification | Code | 0
Beyond Elicitation: Provision-based Prompt Optimization for Knowledge-Intensive Tasks | Code | 0
Panda: Test-Time Adaptation with Negative Data Augmentation | Code | 0
SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation | Code | 0
DESS: DeBERTa Enhanced Syntactic-Semantic Aspect Sentiment Triplet Extraction | Code | 0
SSR: Socratic Self-Refine for Large Language Model Reasoning | Code | 0
Towards Personalized Treatment Plan: Geometrical Model-Agnostic Approach to Counterfactual Explanations | Code | 0
Bias-Restrained Prefix Representation Finetuning for Mathematical Reasoning | Code | 0
PISanitizer: Preventing Prompt Injection to Long-Context LLMs via Prompt Sanitization | Code | 0
IDOL: Meeting Diverse Distribution Shifts with Prior Physics for Tropical Cyclone Multi-Task Estimation | Code | 0
VisualMimic: Visual Humanoid Loco-Manipulation via Motion Tracking and Generation | | 0
The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning | | 0
PressTrack-HMR: Pressure-Based Top-Down Multi-Person Global Human Mesh Recovery | Code | 0
WaterMod: Modular Token-Rank Partitioning for Probability-Balanced LLM Watermarking | Code | 0
LISA: A Layer-wise Integration and Suppression Approach for Hallucination Mitigation in Multimodal Large Language Models | Code | 0
FHIR-AgentBench: Benchmarking LLM Agents for Realistic Interoperable EHR Question Answering | Code | 0
Compensating Distribution Drifts in Class-incremental Learning of Pre-trained Vision Transformers | Code | 0
MultiTab: A Scalable Foundation for Multitask Learning on Tabular Data | Code | 0
fastbmRAG: A Fast Graph-Based RAG Framework for Efficient Processing of Large-Scale Biomedical Literature | Code | 0
From Static Structures to Ensembles: Studying and Harnessing Protein Structure Tokenization | Code | 0
When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion? | Code | 0
MatchAttention: Matching the Relative Positions for High-Resolution Cross-View Matching | Code | 0
Page 298 of 18972