SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 17151–17200 of 474278 papers

Title | Status | Hype
Terrier: A Deep Learning Repeat Classifier | Code | 1
Fair Federated Medical Image Classification Against Quality Shift via Inter-Client Progressive State Matching | Code | 1
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster | Code | 1
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering | Code | 1
Revisiting semi-supervised learning in the era of foundation models | Code | 1
Robust Multimodal Survival Prediction with the Latent Differentiation Conditional Variational AutoEncoder | Code | 1
RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling | Code | 1
Prompt to Restore, Restore to Prompt: Cyclic Prompting for Universal Adverse Weather Removal | Code | 1
AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks | Code | 1
AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents | Code | 1
CyberLLMInstruct: A New Dataset for Analysing Safety of Fine-Tuned LLMs Using Cyber Security Data | Code | 1
CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection | Code | 1
Towards Next-Generation Recommender Systems: A Benchmark for Personalized Recommendation Assistant with LLMs | Code | 1
Motion Blender Gaussian Splatting for Dynamic Scene Reconstruction | Code | 1
How Well Does Your Tabular Generator Learn the Structure of Tabular Data? | Code | 1
MP-HSIR: A Multi-Prompt Framework for Universal Hyperspectral Image Restoration | Code | 1
PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling | Code | 1
MOAT: Evaluating LMMs for Capability Integration and Instruction Grounding | Code | 1
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models | Code | 1
Regulatory DNA sequence Design with Reinforcement Learning | Code | 1
Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels | Code | 1
CoLMDriver: LLM-based Negotiation Benefits Cooperative Autonomous Driving | Code | 1
Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection | Code | 1
EgoBlind: Towards Egocentric Visual Assistance for the Blind | Code | 1
NullFace: Training-Free Localized Face Anonymization | Code | 1
X-Field: A Physically Grounded Representation for 3D X-ray Reconstruction | Code | 1
VFM-UDA++: Improving Network Architectures and Data Strategies for Unsupervised Domain Adaptive Semantic Segmentation | Code | 1
BiasEdit: Debiasing Stereotyped Language Models via Model Editing | Code | 1
Chemical reasoning in LLMs unlocks steerable synthesis planning and reaction mechanism elucidation | Code | 1
Rethinking Diffusion Model in High Dimension | Code | 1
PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability | Code | 1
Aligning Text to Image in Diffusion Models is Easier Than You Think | Code | 1
MegaSR: Mining Customized Semantics and Expressive Guidance for Image Super-Resolution | Code | 1
SAS: Segment Any 3D Scene with Integrated 2D Priors | Code | 1
^RFLAV: Rolling Flow matching for infinite Audio Video generation | Code | 1
Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention | Code | 1
Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing | Code | 1
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion | Code | 1
Can We Detect Failures Without Failure Data? Uncertainty-Aware Runtime Failure Detection for Imitation Learning Policies | Code | 1
Enhancing Large Language Models for Hardware Verification: A Novel SystemVerilog Assertion Dataset | Code | 1
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees | Code | 1
Towards Interpretable Protein Structure Prediction with Sparse Autoencoders | Code | 1
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments | Code | 1
Controlling Latent Diffusion Using Latent CLIP | Code | 1
Chain-of-Thought Reasoning In The Wild Is Not Always Faithful | Code | 1
CFNet: Optimizing Remote Sensing Change Detection through Content-Aware Enhancement | Code | 1
AG-VPReID: A Challenging Large-Scale Benchmark for Aerial-Ground Video-based Person Re-Identification | Code | 1
Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis | Code | 1
STEAD: Spatio-Temporal Efficient Anomaly Detection for Time and Compute Sensitive Applications | Code | 1
Source-free domain adaptation based on label reliability for cross-domain bearing fault diagnosis | Code | 1
Page 344 of 9486