SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 8926–8950 of 474278 papers

Title | Status | Hype
Knowledge-Level Consistency Reinforcement Learning: Dual-Fact Alignment for Long-Form Factuality | | 0
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization | | 0
Pharmacist: Safety Alignment Data Curation for Large Language Models against Harmful Fine-tuning | Code | 0
BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data | | 0
RLFR: Extending Reinforcement Learning for LLMs with Flow Environment | | 0
X-VLA: Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model | | 0
Native Hybrid Attention for Efficient Sequence Modeling | Code | 0
Skill-Targeted Adaptive Training | Code | 0
Translution: Unifying Self-attention and Convolution for Adaptive and Relative Modeling | Code | 0
Cooperative Pseudo Labeling for Unsupervised Federated Classification | Code | 0
ViConEx-Med: Visual Concept Explainability via Multi-Concept Token Transformer for Medical Image Analysis | Code | 0
INR-Bench: A Unified Benchmark for Implicit Neural Representations in Multi-Domain Regression and Reconstruction | Code | 0
SGM: A Statistical Godel Machine for Risk-Controlled Recursive Self-Modification | Code | 0
Complementary and Contrastive Learning for Audio-Visual Segmentation | Code | 0
EpiCache: Episodic KV Cache Management for Long Conversational Question Answering | Code | 0
ADEPT: Continual Pretraining via Adaptive Expansion and Dynamic Decoupled Tuning | Code | 0
PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models | Code | 0
Blind Video Super-Resolution based on Implicit Kernels | Code | 0
EEG-MedRAG: Enhancing EEG-based Clinical Decision-Making via Hierarchical Hypergraph Retrieval-Augmented Generation | Code | 0
A Linguistics-Aware LLM Watermarking via Syntactic Predictability | Code | 0
Informed Routing in LLMs: Smarter Token-Level Computation for Faster Inference | Code | 0
Stable Video Infinity: Infinite-Length Video Generation with Error Recycling | | 0
Multimodal Policy Internalization for Conversational Agents | | 0
Haystack Engineering: Context Engineering for Heterogeneous and Agentic Long-Context Evaluation | | 0
Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints | | 0
Page 358 of 18972