SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 17851–17900 of 474278 papers

Title | Status | Hype
SASVi - Segment Any Surgical Video | Code | 1
HistoSmith: Single-Stage Histology Image-Label Generation via Conditional Latent Diffusion for Enhanced Cell Segmentation and Classification | Code | 1
InTAR: Inter-Task Auto-Reconfigurable Accelerator Design for High Data Volume Variation in DNNs | Code | 1
Bidirectional Diffusion Bridge Models | Code | 1
Enhanced Load Forecasting with GAT-LSTM: Leveraging Grid and Temporal Features | Code | 1
IHEval: Evaluating Language Models on Following the Instruction Hierarchy | Code | 1
Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions | Code | 1
HDT: Hierarchical Discrete Transformer for Multivariate Time Series Forecasting | Code | 1
LDC-MTL: Balancing Multi-Task Learning through Scalable Loss Discrepancy Control | Code | 1
Measuring Diversity in Synthetic Datasets | Code | 1
Hierarchical Learning-based Graph Partition for Large-scale Vehicle Routing Problems | Code | 1
Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Code | 1
Out-of-Distribution Detection on Graphs: A Survey | Code | 1
Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models | Code | 1
SelfElicit: Your Language Model Secretly Knows Where is the Relevant Evidence | Code | 1
Hi-End-MAE: Hierarchical encoder-driven masked autoencoders are stronger vision learners for medical image segmentation | Code | 1
From Brainwaves to Brain Scans: A Robust Neural Network for EEG-to-fMRI Synthesis | Code | 1
Direct Ascent Synthesis: Revealing Hidden Generative Capabilities in Discriminative Models | Code | 1
Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning | Code | 1
EventEgo3D++: 3D Human Motion Capture from a Head-Mounted Event Camera | Code | 1
Time2Lang: Bridging Time-Series Foundation Models and Large Language Models for Health Sensing Beyond Prompting | Code | 1
Explaining 3D Computed Tomography Classifiers with Counterfactuals | Code | 1
EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering | Code | 1
Graph RAG-Tool Fusion | Code | 1
DarwinLM: Evolutionary Structured Pruning of Large Language Models | Code | 1
TranSplat: Surface Embedding-guided 3D Gaussian Splatting for Transparent Object Manipulation | Code | 1
Towards Efficient and Multifaceted Computer-assisted Pronunciation Training Leveraging Hierarchical Selective State Space Model and Decoupled Cross-entropy Loss | Code | 1
Revisiting Non-Acyclic GFlowNets in Discrete Environments | Code | 1
Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering | Code | 1
MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification | Code | 1
Stay-Positive: A Case for Ignoring Real Image Features in Fake Image Detection | Code | 1
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models | Code | 1
Small Language Model Makes an Effective Long Text Extractor | Code | 1
Generative Modeling with Bayesian Sample Inference | Code | 1
EIQP: Execution-time-certified and Infeasibility-detecting QP Solver | Code | 1
Instance-dependent Early Stopping | Code | 1
PlaySlot: Learning Inverse Latent Dynamics for Controllable Object-Centric Video Prediction and Planning | Code | 1
Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples | Code | 1
Joint Modelling Histology and Molecular Markers for Cancer Classification | Code | 1
VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverberation and Blind RIR Identification | Code | 1
Space-Aware Instruction Tuning: Dataset and Benchmark for Guide Dog Robots Assisting the Visually Impaired | Code | 1
Integrating Physics and Data-Driven Approaches: An Explainable and Uncertainty-Aware Hybrid Model for Wind Turbine Power Prediction | Code | 1
Flow Matching for Collaborative Filtering | Code | 1
On Iterative Evaluation and Enhancement of Code Quality Using GPT-4o | Code | 1
MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces | Code | 1
Diffusion Suction Grasping with Large-Scale Parcel Dataset | Code | 1
JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata | Code | 1
MAAT: Mamba Adaptive Anomaly Transformer with association discrepancy for time series | Code | 1
Bag of Tricks for Inference-time Computation of LLM Reasoning | Code | 1
MiniF2F in Rocq: Automatic Translation Between Proof Assistants -- A Case Study | Code | 1