SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 8001-8025 of 474278 papers

Title | Status | Hype
Buffer layers for Test-Time Adaptation | Code | 0
StructLayoutFormer: Conditional Structured Layout Generation via Structure Serialization and Disentanglement | Code | 0
Agent Skills Enable a New Class of Realistic and Trivially Simple Prompt Injections | Code | 0
SA^2Net: Scale-Adaptive Structure-Affinity Transformation for Spine Segmentation from Ultrasound Volume Projection Imaging | Code | 0
Curly Flow Matching for Learning Non-gradient Field Dynamics | Code | 0
Accurate Target Privacy Preserving Federated Learning Balancing Fairness and Utility | Code | 0
Towards Realistic Earth-Observation Constellation Scheduling: Benchmark and Methodology | Code | 0
TEXT2DB: Integration-Aware Information Extraction with Large Language Model Agents | Code | 0
ALMGuard: Safety Shortcuts and Where to Find Them as Guardrails for Audio-Language Models | Code | 0
SAFE: A Novel Approach to AI Weather Evaluation through Stratified Assessments of Forecasts over Earth | Code | 0
Angular Steering: Behavior Control via Rotation in Activation Space | Code | 0
MisSynth: Improving MISSCI Logical Fallacies Classification with Synthetic Data | Code | 0
UnifiedFL: A Dynamic Unified Learning Framework for Equitable Federation | Code | 0
Simulating and Experimenting with Social Media Mobilization Using LLM Agents | Code | 0
Emu3.5: Native Multimodal Models are World Learners | Code | 0
BRIQA: Balanced Reweighting in Image Quality Assessment of Pediatric Brain MRI | Code | 0
BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains | | 0
GMTRouter: Personalized LLM Router over Multi-turn User Interactions | Code | 0
MaGNet: A Mamba Dual-Hypergraph Network for Stock Prediction via Temporal-Causal and Global Relational Learning | Code | 0
H3M-SSMoEs: Hypergraph-based Multimodal Learning with LLM Reasoning and Style-Structured Mixture of Experts | Code | 0
Seeing Clearly and Deeply: An RGBD Imaging Approach with a Bio-inspired Monocentric Design | Code | 0
Modular Linear Tokenization (MLT) | Code | 0
Precise In-Parameter Concept Erasure in Large Language Models | | 0
OpenGuardrails: A Configurable, Unified, and Scalable Guardrails Platform for Large Language Models | | 0
NoisyGRPO: Incentivizing Multimodal CoT Reasoning via Noise Injection and Bayesian Estimation | | 0