SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 11251-11300 of 661570 papers

| Title | Status | Hype |
| --- | --- | --- |
| Vibe Code Bench: Evaluating AI Models on End-to-End Web Application Development | | 0 |
| Using Vision + Language Models to Predict Item Difficulty | | 0 |
| Category-Level Object Shape and Pose Estimation in Less Than a Millisecond | Code | 0 |
| LadderSym: A Multimodal Interleaved Transformer for Music Practice Error Detection | Code | 0 |
| LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance | Code | 0 |
| PTOPOFL: Privacy-Preserving Personalised Federated Learning via Persistent Homology | Code | 0 |
| BeyondBench: Contamination-Resistant Evaluation of Reasoning in Language Models | Code | 0 |
| Beyond Prefixes: Graph-as-Memory Cross-Attention for Knowledge Graph Completion with Large Language Models | Code | 0 |
| EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset | Code | 0 |
| Still Fresh? Evaluating Temporal Drift in Retrieval Benchmarks | Code | 0 |
| Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning | Code | 0 |
| On Imbalanced Regression with Hoeffding Trees | Code | 0 |
| Parallel Token Prediction for Language Models | Code | 0 |
| AbAffinity: A Large Language Model for Predicting Antibody Binding Affinity against SARS-CoV-2 | Code | 0 |
| Do We Need All the Synthetic Data? Targeted Image Augmentation via Diffusion Models | Code | 0 |
| NeuCLIP: Efficient Large-Scale CLIP Training with Neural Normalizer Optimization | Code | 0 |
| Generalization of RLVR Using Causal Reasoning as a Testbed | Code | 0 |
| DeNuC: Decoupling Nuclei Detection and Classification in Histopathology | Code | 0 |
| MOO: A Multi-view Oriented Observations Dataset for Viewpoint Analysis in Cattle Re-Identification | Code | 0 |
| Hold-One-Shot-Out (HOSO) for Validation-Free Few-Shot CLIP Adapters | Code | 0 |
| TabStruct: Measuring Structural Fidelity of Tabular Data | Code | 0 |
| Optimizing Language Models for Crosslingual Knowledge Consistency | Code | 0 |
| Improving Multi-View Reconstruction via Texture-Guided Gaussian-Mesh Joint Optimization | Code | 0 |
| EmbodiedSplat: Online Feed-Forward Semantic 3DGS for Open-Vocabulary 3D Scene Understanding | | 2 |
| RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies | | 2 |
| V_1: Unifying Generation and Self-Verification for Parallel Reasoners | | 1 |
| Helios: Real Real-Time Long Video Generation Model | | 5 |
| CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video | | 2 |
| Discovering mathematical concepts through a multi-agent system | | 0 |
| SELDON: Supernova Explosions Learned by Deep ODE Networks | | 0 |
| Code Fingerprints: Disentangled Attribution of LLM-Generated Code | Code | 0 |
| Scriboora: Rethinking Human Pose Forecasting | | 0 |
| A Review of Reward Functions for Reinforcement Learning in the context of Autonomous Driving | | 0 |
| Implicit Bias of Per-sample Adam on Separable Data: Departure from the Full-batch Regime | | 0 |
| Reducing hyperparameter sensitivity in measurement-feedback based Ising machines | | 0 |
| NeuroProlog: Multi-Task Fine-Tuning for Neurosymbolic Mathematical Reasoning via the Cocktail Effect | | 0 |
| Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling | | 0 |
| Context Biasing for Pronunciation-Orthography Mismatch in Automatic Speech Recognition | | 0 |
| CaRe-BN: Precise Moving Statistics for Stabilizing Spiking Neural Networks in Reinforcement Learning | Code | 0 |
| Invariance-Based Dynamic Regret Minimization | | 0 |
| FastWave: Optimized Diffusion Model for Audio Super-Resolution | | 0 |
| Towards Generalized Multimodal Homography Estimation | | 0 |
| Learning in Markov Decision Processes with Exogenous Dynamics | | 0 |
| On the Limits of Sparse Autoencoders: A Theoretical Framework and Reweighted Remedy | | 0 |
| Enhancing Feature Fusion of U-like Networks with Dynamic Skip Connections | | 0 |
| InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions | | 0 |
| Machine Pareidolia: Protecting Facial Image with Emotional Editing | | 0 |
| Structure-aware Prompt Adaptation from Seen to Unseen for Open-Vocabulary Compositional Zero-Shot Learning | | 0 |
| Generalized non-exponential Gaussian splatting | | 0 |
| Index-Preserving Lightweight Token Pruning for Efficient Document Understanding in Vision-Language Models | | 0 |