SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 19251–19300 of 474278 papers

Title | Status | Hype
Detecting Airborne Objects with 5G NR Radars | | 0
ProofNet++: A Neuro-Symbolic System for Formal Proof Verification with Self-Correction | | 0
LLMs Are Globally Multilingual Yet Locally Monolingual: Exploring Knowledge Transfer via Language and Thought Theory | | 0
Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs | | 0
Provably Improving Generalization of Few-Shot Models with Synthetic Data | | 0
Searching Clinical Data Using Generative AI | | 0
Cartan Networks: Group theoretical Hyperbolic Deep Learning | Code | 0
Knockoff-Guided Compressive Sensing: A Statistical Machine Learning Framework for Support-Assured Signal Recovery | Code | 0
Circuit Stability Characterizes Language Model Generalization | Code | 0
ERU-KG: Efficient Reference-aligned Unsupervised Keyphrase Generation | Code | 0
HardTests: Synthesizing High-Quality Test Cases for LLM Coding | | 0
Pretraining Multi-Speaker Identification for Neural Speaker Diarization | | 0
A Flat Minima Perspective on Understanding Augmentations and Model Robustness | | 0
Category-aware EEG image generation based on wavelet transform and contrast semantic loss | Code | 0
MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs | Code | 0
Tag-Evol: Achieving Efficient Instruction Evolving via Tag Injection | Code | 0
Diffusion-Based Symbolic Regression | | 0
Binary Cumulative Encoding meets Time Series Forecasting | | 0
Weisfeiler and Leman Follow the Arrow of Time: Expressive Power of Message Passing in Temporal Event Graphs | | 0
Can Slow-thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting | Code | 1
Multilingual Gloss-free Sign Language Translation: Towards Building a Sign Language Foundation Model | Code | 0
SwiftEval: Developing a Language-Specific Benchmark for LLM-generated Code Evaluation | | 0
When Humans Growl and Birds Speak: High-Fidelity Voice Conversion from Human to Animal and Designed Sounds | | 0
A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings | Code | 1
Interpreting Large Text-to-Image Diffusion Models with Dictionary Learning | Code | 0
Mixpert: Mitigating Multimodal Learning Conflicts with Efficient Mixture-of-Vision-Experts | | 0
Beyond Multiple Choice: Evaluating Steering Vectors for Adaptive Free-Form Summarization | | 0
Feature Attribution from First Principles | Code | 0
Effects of Theory of Mind and Prosocial Beliefs on Steering Human-Aligned Behaviors of LLMs in Ultimatum Games | Code | 0
Diversify and Conquer: Open-set Disagreement for Robust Semi-supervised Learning with Outliers | Code | 0
K^2IE: Kernel Method-based Kernel Intensity Estimators for Inhomogeneous Poisson Processes | Code | 0
MMAFFBen: A Multilingual and Multimodal Affective Analysis Benchmark for Evaluating LLMs and VLMs | Code | 0
Don't Erase, Inform! Detecting and Contextualizing Harmful Language in Cultural Heritage Collections | Code | 0
Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models' Uncertainty? | Code | 0
CHIP: Chameleon Hash-based Irreversible Passport for Robust Deep Model Ownership Verification and Active Usage Control | Code | 0
Unsupervised Evolutionary Cell Type Matching via Entropy-Minimized Optimal Transport | Code | 0
Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning | Code | 0
Drop Dropout on Single-Epoch Language Model Pretraining | Code | 0
Biological Pathway Guided Gene Selection Through Collaborative Reinforcement Learning | Code | 0
The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models | Code | 0
Impact of Bottleneck Layers and Skip Connections on the Generalization of Linear Denoising Autoencoders | | 0
Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis | | 0
Random Rule Forest (RRF): Interpretable Ensembles of LLM-Generated Questions for Predicting Startup Success | | 0
BIMA: Bijective Maximum Likelihood Learning Approach to Hallucination Prediction and Mitigation in Large Vision-Language Models | | 0
On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks | | 0
Heterogeneous Graph Masked Contrastive Learning for Robust Recommendation | | 0
On the Scaling of Robustness and Effectiveness in Dense Retrieval | | 0
Anomaly Detection and Improvement of Clusters using Enhanced K-Means Algorithm | | 0
Voice Conversion Improves Cross-Domain Robustness for Spoken Arabic Dialect Identification | | 0
Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios | | 0