SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7201–7250 of 661,570 papers

Title | Status | Hype
Mechanistic Indicators of Steering Effectiveness in Large Language Models | - | 0
Model-Dowser: Data-Free Importance Probing to Mitigate Catastrophic Forgetting in Multimodal Large Language Models | - | 0
NBAvatar: Neural Billboards Avatars with Realistic Hand-Face Interaction | - | 0
Embed-RL: Reinforcement Learning for Reasoning-Driven Multimodal Embeddings | - | 1
PsihoRo: Depression and Anxiety Romanian Text Corpus | - | 0
IDSelect: A RL-Based Cost-Aware Selection Agent for Video-based Multi-Modal Person Recognition | - | 0
Reasoning Boosts Opinion Alignment in LLMs | - | 0
Scaling Machine Learning Interatomic Potentials with Mixtures of Experts | - | 0
Limited Reasoning Space: The cage of long-horizon reasoning in LLMs | - | 0
De novo molecular structure elucidation from mass spectra via flow matching | - | 0
SIMSPINE: A Biomechanics-Aware Simulation Framework for 3D Spine Motion Annotation and Benchmarking | - | 0
ECHOSAT: Estimating Canopy Height Over Space And Time | Code | 0
Extending Sequence Length is Not All You Need: Effective Integration of Multimodal Signals for Gene Expression Prediction | - | 0
Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction | - | 0
Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning | - | 0
GeoDiff4D: Geometry-Aware Diffusion for 4D Head Avatar Reconstruction | - | 0
Subliminal Signals in Preference Labels | - | 0
FastLightGen: Fast and Light Video Generation with Fewer Steps and Parameters | - | 0
LaST-VLA: Thinking in Latent Spatio-Temporal Space for Vision-Language-Action in Autonomous Driving | - | 0
Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks | - | 0
Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought | - | 0
From Toil to Thought: Designing for Strategic Exploration and Responsible AI in Systematic Literature Reviews | - | 0
Cross-Context Review: Improving LLM Output Quality by Separating Production and Review Sessions | - | 0
Structure-Aware Set Transformers: Temporal and Variable-Type Attention Biases for Asynchronous Clinical Time Series | - | 0
Conditional Unbalanced Optimal Transport Maps: An Outlier-Robust Framework for Conditional Generative Modeling | - | 0
SGG-R^3: From Next-Token Prediction to End-to-End Unbiased Scene Graph Generation | - | 0
Micro-Diffusion Compression - Binary Tree Tweedie Denoising for Online Probability Estimation | - | 0
Deep Tabular Research via Continual Experience-Driven Execution | - | 0
A Variational Latent Equilibrium for Learning in Neuronal Circuits | - | 0
EXPLORE-Bench: Egocentric Scene Prediction with Long-Horizon Reasoning | - | 0
AraModernBERT: Transtokenized Initialization and Long-Context Encoder Modeling for Arabic | - | 0
Beyond the Prompt in Large Language Models: Comprehension, In-Context Learning, and Chain-of-Thought | - | 0
Leveraging Wikidata for Geographically Informed Sociocultural Bias Dataset Creation: Application to Latin America | - | 0
The Epistemic Support-Point Filter: Jaynesian Maximum Entropy Meets Popperian Falsification | - | 0
Rethinking the Harmonic Loss via Non-Euclidean Distance Layers | - | 0
StyleGallery: Training-free and Semantic-aware Personalized Style Transfer from Arbitrary Image References | - | 0
Unlearning the Unpromptable: Prompt-free Instance Unlearning in Diffusion Models | - | 0
Beam-Plasma Collective Oscillations in Intense Charged-Particle Beams: Dielectric Response Theory, Langmuir Wave Dispersion, and Unsupervised Detection via Prometheus | - | 0
Less is More: Decoder-Free Masked Modeling for Efficient Skeleton Representation Learning | - | 0
Contract And Conquer: How to Provably Compute Adversarial Examples for a Black-Box Model? | - | 0
On the Reliability of Cue Conflict and Beyond | - | 0
Try, Check and Retry: A Divide-and-Conquer Framework for Boosting Long-context Tool-Calling Performance of LLMs | - | 0
Speak or Stay Silent: Context-Aware Turn-Taking in Multi-Party Dialogue | - | 0
Mango-GS: Enhancing Spatio-Temporal Consistency in Dynamic Scenes Reconstruction using Multi-Frame Node-Guided 4D Gaussian Splatting | - | 0
A Multi-Label Temporal Convolutional Framework for Transcription Factor Binding Characterization | - | 0
Deactivating Refusal Triggers: Understanding and Mitigating Overrefusal in Safety Alignment | - | 0
High-Precision 6DOF Pose Estimation via Global Phase Retrieval in Fringe Projection Profilometry for 3D Mapping | - | 0
Agentic AI for Embodied-enhanced Beam Prediction in Low-Altitude Economy Networks | - | 0
ARROW: Augmented Replay for RObust World models | - | 0
Harnessing Data Asymmetry: Manifold Learning in the Finsler World | - | 0
Page 145 of 13,232