SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 7751-7775 of 474278 papers

Title | Status | Hype
Beyond Softmax: Dual-Branch Sigmoid Architecture for Accurate Class Activation Maps | Code | 0
Jailbreaking in the Haystack | | 0
Reinforcement Learning Foundations for Deep Research Systems: A Survey | | 0
Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding | Code | 0
CLAX: Fast and Flexible Neural Click Models in JAX | Code | 0
DE3S: Dual-Enhanced Soft-Sparse-Shape Learning for Medical Early Time-Series Classification | Code | 0
Noise Injection: Improving Out-of-Distribution Generalization for Limited Size Datasets | Code | 0
Revisiting Multimodal Positional Encoding in Vision-Language Models | Code | 0
CoPRIS: Efficient and Stable Reinforcement Learning via Concurrency-Controlled Partial Rollout with Importance Sampling | Code | 0
Sketch-Augmented Features Improve Learning Long-Range Dependencies in Graph Neural Networks | Code | 0
TabDSR: Decompose, Sanitize, and Reason for Complex Numerical Reasoning in Tabular Data | | 0
Diffusion Language Models are Super Data Learners | | 0
Decentralized Aerial Manipulation of a Cable-Suspended Load using Multi-Agent Reinforcement Learning | | 0
Towards Fine-Grained Text-to-3D Quality Assessment: A Benchmark and A Two-Stage Rank-Learning Metric | | 0
Incorporating Quality of Life in Climate Adaptation Planning via Reinforcement Learning | Code | 0
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions | | 0
SyMuPe: Affective and Controllable Symbolic Music Performance | | 0
A Survey on Collaborating Small and Large Language Models for Performance, Cost-effectiveness, Cloud-edge Privacy, and Trustworthiness | Code | 0
PhysicsEval: Inference-Time Techniques to Improve the Reasoning Proficiency of Large Language Models on Physics Problems | Code | 0
CoTox: Chain-of-Thought-Based Molecular Toxicity Reasoning and Prediction | Code | 0
FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs | Code | 0
Scalable Evaluation and Neural Models for Compositional Generalization | Code | 0
From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation | Code | 0
Cross-Modal Alignment via Variational Copula Modelling | Code | 0
Climate Adaptation with Reinforcement Learning: Economic vs. Quality of Life Adaptation Pathways | Code | 0
Page 311 of 18972