SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 19301–19350 of 474278 papers

Title | Status | Hype
MSDA: Combining Pseudo-labeling and Self-Supervision for Unsupervised Domain Adaptation in ASR | | 0
Dynamic Context-Aware Streaming Pretrained Language Model For Inverse Text Normalization | | 0
Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC | | 0
CodeV-R1: Reasoning-Enhanced Verilog Generation | | 0
Probing the Robustness Properties of Neural Speech Codecs | Code | 0
GridRoute: A Benchmark for LLM-Based Route Planning with Cardinal Movement in Grid Environments | Code | 0
Learning to Optimally Dispatch Power: Performance on a Nation-Wide Real-World Dataset | Code | 0
VietMix: A Naturally Occurring Vietnamese-English Code-Mixed Corpus with Iterative Augmentation for Machine Translation | | 0
Should I Share this Translation? Evaluating Quality Feedback for User Reliance on Machine Translation | Code | 0
EXP-Bench: Can AI Conduct AI Research Experiments? | Code | 3
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models | Code | 2
Efficient Neural and Numerical Methods for High-Quality Online Speech Spectrogram Inversion via Gradient Theorem | | 0
Multiple LLM Agents Debate for Equitable Cultural Alignment | Code | 0
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation | Code | 2
Bench4KE: Benchmarking Automated Competency Question Generation | Code | 1
Timing is Important: Risk-aware Fund Allocation based on Time-Series Forecasting | Code | 1
R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration | Code | 5
ScienceMeter: Tracking Scientific Knowledge Updates in Language Models | Code | 1
A Perception-Based L2 Speech Intelligibility Indicator: Leveraging a Rater's Shadowing and Sequence-to-sequence Voice Conversion | | 0
Towards Effective Code-Integrated Reasoning | Code | 1
BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization | Code | 1
Logits-Based Finetuning | Code | 2
TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence | Code | 1
ProxyThinker: Test-Time Guidance through Small Visual Reasoners | Code | 1
On Symmetric Losses for Robust Policy Optimization with Noisy Preferences | Code | 0
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models | Code | 5
Model Unlearning via Sparse Autoencoder Subspace Guided Projections | | 0
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors | Code | 0
SEAR: A Multimodal Dataset for Analyzing AR-LLM-Driven Social Engineering Behaviors | Code | 0
Large Language Models are Locally Linear Mappings | Code | 1
INSIGHT: A Survey of In-Network Systems for Intelligent, High-Efficiency AI and Topology Optimization | | 0
Fewer Hallucinations, More Verification: A Three-Stage LLM-Based Framework for ASR Error Correction | | 0
Harnessing Large Language Models for Scientific Novelty Detection | | 0
NexusSum: Hierarchical LLM Agents for Long-Form Narrative Summarization | | 0
Running Conventional Automatic Speech Recognition on Memristor Hardware: A Simulated Approach | | 0
From Invariant Representations to Invariant Data: Provable Robustness to Spurious Correlations via Noisy Counterfactual Matching | Code | 0
FABLE: A Novel Data-Flow Analysis Benchmark on Procedural Text for Large Language Model Evaluation | Code | 0
AutoChemSchematic AI: A Closed-Loop, Physics-Aware Agentic Framework for Auto-Generating Chemical Process and Instrumentation Diagrams | | 0
PRISM: A Framework for Producing Interpretable Political Bias Embeddings with Political-Aware Cross-Encoder | Code | 0
Seeing is Not Reasoning: MVPBench for Graph-based Evaluation of Multi-path Visual Physical CoT | Code | 1
Chameleon: A MatMul-Free Temporal Convolutional Network Accelerator for End-to-End Few-Shot and Continual Learning from Sequential Data | Code | 1
RT-X Net: RGB-Thermal cross attention network for Low-Light Image Enhancement | Code | 1
Non-collective Calibrating Strategy for Time Series Forecasting | Code | 0
Dc-EEMF: Pushing depth-of-field limit of photoacoustic microscopy via decision-level constrained learning | | 0
LLaMA-XR: A Novel Framework for Radiology Report Generation using LLaMA and QLoRA Fine Tuning | | 0
Why is it easier to predict the epidemic curve than to reconstruct the underlying contact network? | | 0
Super-temporal-resolution Photoacoustic Imaging with Dynamic Reconstruction through Implicit Neural Representation in Sparse-view | | 0
Machine Learning-Based Anomaly Detection of Correlated Sensor Data: An Integrated Principal Component Analysis-Autoencoder Approach | | 0
Toward Knowledge-Guided AI for Inverse Design in Manufacturing: A Perspective on Domain, Physics, and Human-AI Synergy | | 0
Ultrafast High-Flux Single-Photon LiDAR Simulator via Neural Mapping | | 0