SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 19951-20000 of 474,278 papers

Title | Status | Hype
Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models | Code | 1
AlphaTrans: A Neuro-Symbolic Compositional Approach for Repository-Level Code Translation and Validation | Code | 1
Instruction-Tuning Llama-3-8B Excels in City-Scale Mobility Prediction | Code | 1
MLLA-UNet: Mamba-like Linear Attention in an Efficient U-Shape Model for Medical Image Segmentation | Code | 1
Pedestrian Trajectory Prediction with Missing Data: Datasets, Imputation, and Benchmarking | Code | 1
Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales? | Code | 1
Prospective Learning: Learning for a Dynamic Future | Code | 1
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages | Code | 1
Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning | Code | 1
A Walsh Hadamard Derived Linear Vector Symbolic Architecture | Code | 1
DataRec: A Python Library for Standardized and Reproducible Data Management in Recommender Systems | Code | 1
LGU-SLAM: Learnable Gaussian Uncertainty Matching with Deformable Correlation Sampling for Deep Visual SLAM | Code | 1
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models | Code | 1
DAVINCI: A Single-Stage Architecture for Constrained CAD Sketch Inference | Code | 1
Diceplot: A package for high dimensional categorical data visualization | Code | 1
EchoFM: Foundation Model for Generalizable Echocardiogram Analysis | Code | 1
FuseAnyPart: Diffusion-Driven Facial Parts Swapping via Multiple Reference Images | Code | 1
Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting | Code | 1
Can Models Help Us Create Better Models? Evaluating LLMs as Data Scientists | Code | 1
Comparative Analysis of Demonstration Selection Algorithms for LLM In-Context Learning | Code | 1
CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense | Code | 1
Effective and Efficient Adversarial Detection for Vision-Language Models via A Single Vector | Code | 1
DiaMond: Dementia Diagnosis with Multi-Modal Vision Transformers Using MRI and PET | Code | 1
SFDFusion: An Efficient Spatial-Frequency Domain Fusion Network for Infrared and Visible Image Fusion | Code | 1
When can classical neural networks represent quantum states? | Code | 1
Emotional RAG: Enhancing Role-Playing Agents through Emotional Retrieval | Code | 1
TPP-Gaze: Modelling Gaze Dynamics in Space and Time with Neural Temporal Point Processes | Code | 1
Simulation-Free Training of Neural ODEs on Paired Data | Code | 1
DiffLight: A Partial Rewards Conditioned Diffusion Model for Traffic Signal Control with Missing Data | Code | 1
DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity | Code | 1
bit2bit: 1-bit quanta video reconstruction via self-supervised photon prediction | Code | 1
Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback | Code | 1
Is Function Similarity Over-Engineered? Building a Benchmark | Code | 1
FlexTSF: A Universal Forecasting Model for Time Series with Variable Regularities | Code | 1
High-Fidelity Document Stain Removal via A Large-Scale Real-World Dataset and A Memory-Augmented Transformer | Code | 1
Survey of Cultural Awareness in Language Models: Text and Beyond | Code | 1
WaveRoRA: Wavelet Rotary Route Attention for Multivariate Time Series Forecasting | Code | 1
SCRREAM: SCan, Register, REnder And Map: A Framework for Annotating Accurate and Dense 3D Indoor Scenes with a Benchmark | Code | 1
Lightweight Frequency Masker for Cross-Domain Few-Shot Semantic Segmentation | Code | 1
Solving Epistemic Logic Programs using Generate-and-Test with Propagation | Code | 1
Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models | Code | 1
Embedding-based classifiers can detect prompt injection attacks | Code | 1
Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images | Code | 1
SimRec: Mitigating the Cold-Start Problem in Sequential Recommendation by Integrating Item Similarity | Code | 1
PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices | Code | 1
SAM-Swin: SAM-Driven Dual-Swin Transformers with Adaptive Lesion Enhancement for Laryngo-Pharyngeal Tumor Detection | Code | 1
f-PO: Generalizing Preference Optimization with f-divergence Minimization | Code | 1
Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention | Code | 1
An Efficient Approach to Generate Safe Drivable Space by LiDAR-Camera-HDmap Fusion | Code | 1
EconoJax: A Fast & Scalable Economic Simulation in Jax | Code | 1
Page 400 of 9486