SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 18501-18550 of 474278 papers

Title | Status | Hype
Unified Interference-Aware Water-Filling for QoS-Constrained Communication, Sensing, and JRC | | 0
Benchmarking Neural Speech Codec Intelligibility with SITool | | 0
SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model | Code | 1
High-gain MIMO Beamforming Antenna System for DSRC and mmwave 5G Integration in Autonomous Vehicles | | 0
PMNO: A novel physics guided multi-step neural operator predictor for partial differential equations | | 0
Life Sequence Transformer: Generative Modelling for Counterfactual Simulation | | 0
Stock Market Telepathy: Graph Neural Networks Predicting the Secret Conversations between MINT and G7 Countries | | 0
Pricing the Right to Renege in Search Markets: Evidence from Trucking | | 0
Effect of Insecurity on Agricultural Output in Benue State, Nigeria | | 0
A combined Machine Learning and Finite Element Modelling tool for the surgical planning of craniosynostosis correction | | 0
Sensor Fusion for Track Geometry Monitoring: Integrating On-Board Data and Degradation Models via Kalman Filtering | | 0
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models | Code | 1
iQUEST: An Iterative Question-Guided Framework for Knowledge Base Question Answering | | 0
Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning | Code | 0
LLMs as World Models: Data-Driven and Human-Centered Pre-Event Simulation for Disaster Impact Assessment | | 0
BehaviorBox: Automated Discovery of Fine-Grained Performance Differences Between Language Models | | 0
Why Gradients Rapidly Increase Near the End of Training | | 0
Q-ARDNS-Multi: A Multi-Agent Quantum Reinforcement Learning Framework with Meta-Cognitive Adaptation for Complex 3D Environments | | 0
KDRL: Post-Training Reasoning LLMs via Unified Knowledge Distillation and Reinforcement Learning | | 0
LongDWM: Cross-Granularity Distillation for Building a Long-Term Driving World Model | | 0
PointT2I: LLM-based text-to-image generation via keypoints | | 0
Self-Challenging Language Model Agents | | 0
Fodor and Pylyshyn's Legacy -- Still No Human-like Systematic Compositionality in Neural Networks | | 0
Temporal Variational Implicit Neural Representations | | 0
ExpertLongBench: Benchmarking Language Models on Expert-Level Long-Form Generation Tasks with Structured Checklists | | 0
WebChoreArena: Evaluating Web Browsing Agents on Realistic Tedious Web Tasks | | 0
SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes | | 0
Implicit Deformable Medical Image Registration with Learnable Kernels | | 0
Adversarial learning for nonparametric regression: Minimax rate and adaptive estimation | | 0
TSRating: Rating Quality of Diverse Time Series Data by Meta-learning from LLM Judgment | | 0
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding | Code | 4
OD3: Optimization-free Dataset Distillation for Object Detection | Code | 1
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost | Code | 1
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | Code | 2
Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency | Code | 2
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis | | 0
Optimization Strategies for Variational Quantum Algorithms in Noisy Landscapes | | 0
A 2-Stage Model for Vehicle Class and Orientation Detection with Photo-Realistic Image Generation | | 0
Stop Chasing the C-index: This Is How We Should Evaluate Our Survival Models | | 0
ReconXF: Graph Reconstruction Attack via Public Feature Explanations on Privatized Node Features and Labels | | 0
ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code | | 0
Unlocking Aha Moments via Reinforcement Learning: Advancing Collaborative Visual Comprehension and Generation | | 0
Large Language Models for EEG: A Comprehensive Survey and Taxonomy | | 0
Exploring the Potential of LLMs as Personalized Assistants: Dataset, Evaluation, and Analysis | Code | 1
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation | Code | 3
RewardBench 2: Advancing Reward Model Evaluation | Code | 4
OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation | Code | 5
Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment | Code | 0
Propaganda and Information Dissemination in the Russo-Ukrainian War: Natural Language Processing of Russian and Western Twitter Narratives | | 0
Balancing Beyond Discrete Categories: Continuous Demographic Labels for Fair Face Recognition | | 0