SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 19351-19400 of 474278 papers

Title | Status | Hype
A Survey of Generative Categories and Techniques in Multimodal Large Language Models | - | 0
Composite Reward Design in PPO-Driven Adaptive Filtering | Code | 0
Zero-Shot Adaptation of Parameter-Efficient Fine-Tuning in Diffusion Models | - | 0
Contextual Integrity in LLMs via Reasoning and Reinforcement Learning | - | 0
Deep Learning-Based Breast Cancer Detection in Mammography: A Multi-Center Validation Study in Thai Population | - | 0
FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution | - | 0
Literature Review Of Multi-Agent Debate For Problem-Solving | - | 0
Reducing Latency in LLM-Based Natural Language Commands Processing for Robot Navigation | - | 0
Human sensory-musculoskeletal modeling and control of whole-body movements | - | 0
The End Of Universal Lifelong Identifiers: Identity Systems For The AI Era | - | 0
Leveraging Auxiliary Information in Text-to-Video Retrieval: A Review | - | 0
Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts | - | 0
Preemptive Hallucination Reduction: An Input-Level Approach for Multimodal Language Model | - | 0
Parameter-Free Bio-Inspired Channel Attention for Enhanced Cardiac MRI Reconstruction | - | 0
SafeCOMM: What about Safety Alignment in Fine-Tuned Telecom Large Language Models? | - | 0
Evaluating Prompt Engineering Techniques for Accuracy and Confidence Elicitation in Medical LLMs | - | 0
Comparative analysis of privacy-preserving open-source LLMs regarding extraction of diagnostic information from clinical CMR imaging reports | - | 0
Prompt Engineer: Analyzing Skill Requirements in the AI Job Market | - | 0
Hierarchical Bayesian Knowledge Tracing in Undergraduate Engineering Education | - | 0
Knowledge Graphs for Digitized Manuscripts in Jagiellonian Digital Library Application | - | 0
Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling | - | 0
DGIQA: Depth-guided Feature Attention and Refinement for Generalizable Image Quality Assessment | Code | 0
Optimizing Storytelling, Improving Audience Retention, and Reducing Waste in the Entertainment Industry | - | 0
Semantics-Guided Generative Image Compression | Code | 0
The Automated but Risky Game: Modeling Agent-to-Agent Negotiations and Transactions in Consumer Markets | Code | 1
Cora: Correspondence-aware image editing using few step diffusion | Code | 1
Leave it to the Specialist: Repair Sparse LLMs with Sparse Fine-Tuning via Sparsity Evolution | Code | 0
Representational Difference Explanations | Code | 0
Multi-Group Proportional Representation for Text-to-Image Models | - | 0
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models | - | 0
Enhancing LLM-Based Code Generation with Complexity Metrics: A Feedback-Driven Approach | - | 0
Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training | - | 0
Characterising the Inductive Biases of Neural Networks on Boolean Data | - | 0
LLM Agents Should Employ Security Principles | - | 0
Exploring Societal Concerns and Perceptions of AI: A Thematic Analysis through the Lens of Problem-Seeking | - | 0
TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine | - | 0
Large Language Model-Based Agents for Automated Research Reproducibility: An Exploratory Study in Alzheimer's Disease | - | 0
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback | - | 0
MSQA: Benchmarking LLMs on Graduate-Level Materials Science Reasoning and Knowledge | - | 0
Multi-RAG: A Multimodal Retrieval-Augmented Generation System for Adaptive Video Understanding | - | 0
mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation | - | 0
Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization | - | 0
DATD3: Depthwise Attention Twin Delayed Deep Deterministic Policy Gradient For Model Free Reinforcement Learning Under Output Feedback Control | - | 0
Revisiting Uncertainty Estimation and Calibration of Large Language Models | - | 0
Personalized Subgraph Federated Learning with Differentiable Auxiliary Projections | - | 0
Combining Deep Architectures for Information Gain estimation and Reinforcement Learning for multiagent field exploration | - | 0
Infi-Med: Low-Resource Medical MLLMs with Robust Reasoning Evaluation | - | 0
Noise-Robustness Through Noise: Asymmetric LoRA Adaption with Poisoning Expert | - | 0
MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection | - | 0
A Benchmark Dataset for Graph Regression with Homogeneous and Multi-Relational Variants | - | 0
Page 388 of 9486