SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 19651–19700 of 474278 papers

| Title | Status | Hype |
| --- | --- | --- |
| Model-Preserving Adaptive Rounding | Code | 2 |
| Pseudo Multi-Source Domain Generalization: Bridging the Gap Between Single and Multi-Source Domain Generalization | Code | 0 |
| Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models | Code | 0 |
| Bounded-Abstention Pairwise Learning to Rank | | 0 |
| PhysicsNeRF: Physics-Guided 3D Reconstruction from Sparse Views | Code | 0 |
| ScEdit: Script-based Assessment of Knowledge Editing | Code | 0 |
| Efficiently Access Diffusion Fisher: Within the Outer Product Span Space | Code | 0 |
| DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors | Code | 0 |
| Document-Level Text Generation with Minimum Bayes Risk Decoding using Optimal Transport | Code | 0 |
| Gibbs randomness-compression proposition: An efficient deep learning | Code | 0 |
| Diverse Prototypical Ensembles Improve Robustness to Subpopulation Shift | Code | 0 |
| Efficient Parameter Estimation for Bayesian Network Classifiers using Hierarchical Linear Smoothing | Code | 0 |
| Model Immunization from a Condition Number Perspective | Code | 1 |
| LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers | | 0 |
| Data-efficient Meta-models for Evaluation of Context-based Questions and Answers in LLMs | | 0 |
| UrbanCraft: Urban View Extrapolation via Hierarchical Sem-Geometric Priors | | 0 |
| Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs | | 0 |
| ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions | Code | 0 |
| Augment or Not? A Comparative Study of Pure and Augmented Large Language Model Recommenders | Code | 0 |
| Autoformalization in the Era of Large Language Models: A Survey | Code | 5 |
| Hyperbolic-PDE GNN: Spectral Graph Neural Networks in the Perspective of A System of Hyperbolic Partial Differential Equations | Code | 0 |
| Discriminative Policy Optimization for Token-Level Reward Models | Code | 0 |
| Automatic classification of stop realisation with wav2vec2.0 | Code | 0 |
| Translation in the Wild | | 0 |
| CLDTracker: A Comprehensive Language Description for Visual Tracking | Code | 0 |
| Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought | | 0 |
| UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions | Code | 0 |
| To Trust Or Not To Trust Your Vision-Language Model's Prediction | Code | 1 |
| VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning | Code | 2 |
| Diffusion Guidance Is a Controllable Policy Improvement Operator | Code | 2 |
| Directed Graph Grammars for Sequence-based Learning | Code | 1 |
| Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint | Code | 1 |
| Label-Guided In-Context Learning for Named Entity Recognition | Code | 1 |
| MathArena: Evaluating LLMs on Uncontaminated Math Competitions | Code | 3 |
| CrossLinear: Plug-and-Play Cross-Correlation Embedding for Time Series Forecasting with Exogenous Variables | Code | 1 |
| FreRA: A Frequency-Refined Augmentation for Contrastive Learning on Time Series Classification | Code | 1 |
| AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora | Code | 4 |
| Foundation Molecular Grammar: Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages | Code | 1 |
| Graph Random Walk with Feature-Label Space Alignment: A Multi-Label Feature Selection Method | Code | 0 |
| Boosting Domain Incremental Learning: Selecting the Optimal Parameters is All You Need | Code | 0 |
| Infinite-Instruct: Synthesizing Scaling Code instruction Data with Bidirectional Synthesis and Static Verification | Code | 0 |
| URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration | Code | 1 |
| Distributed Federated Learning for Vehicular Network Security: Anomaly Detection Benefits and Multi-Domain Attack Threats | | 0 |
| Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering | Code | 1 |
| Zero-to-Hero: Zero-Shot Initialization Empowering Reference-Based Video Appearance Editing | Code | 1 |
| Towards Privacy-Preserving Fine-Grained Visual Classification via Hierarchical Learning from Label Proportions | | 0 |
| Position Dependent Prediction Combination For Intra-Frame Video Coding | | 0 |
| Neural Interpretable PDEs: Harmonizing Fourier Insights with Attention for Scalable and Interpretable Physics Discovery | Code | 1 |
| Adversarial Semantic and Label Perturbation Attack for Pedestrian Attribute Recognition | | 0 |
| CURVE: CLIP-Utilized Reinforcement Learning for Visual Image Enhancement via Simple Image Processing | | 0 |