SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 20701-20750 of 474278 papers

Title | Status | Hype
GreenLight-Gym: Reinforcement learning benchmark environment for control of greenhouse production systems | Code | 1
Large Scale MRI Collection and Segmentation of Cirrhotic Liver | Code | 1
Algorithmic Capabilities of Random Transformers | Code | 1
CogDevelop2K: Reversed Cognitive Development in Multimodal Large Language Models | Code | 1
MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration | Code | 1
Taylor Unswift: Secured Weight Release for Large Language Models via Taylor Expansion | Code | 1
Where are we in audio deepfake detection? A systematic analysis over generative and detection models | Code | 1
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF | Code | 1
Wrong-of-Thought: An Integrated Reasoning Framework with Multi-Perspective Verification and Wrong Information | Code | 1
Towards Secure Tuning: Mitigating Security Risks Arising from Benign Instruction Fine-Tuning | Code | 1
From Reading to Compressing: Exploring the Multi-document Reader for Prompt Compression | Code | 1
Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback | Code | 1
IceCloudNet: 3D reconstruction of cloud ice from Meteosat SEVIRI | Code | 1
Improving Temporal Link Prediction via Temporal Walk Matrix Projection | Code | 1
DB-SAM: Delving into High Quality Universal Medical Image Segmentation | Code | 1
IV-Mixed Sampler: Leveraging Image Diffusion Models for Enhanced Video Synthesis | Code | 1
AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text | Code | 1
LongGenBench: Long-context Generation Benchmark | Code | 1
Beyond Language: Applying MLX Transformers to Engineering Physics | Code | 1
Hyperbolic Fine-tuning for Large Language Models | Code | 1
Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning | Code | 1
Embrace rejection: Kernel matrix approximation by accelerated randomly pivoted Cholesky | Code | 1
ECHOPulse: ECG controlled echocardio-grams video generation | Code | 1
Autoregressive Moving-average Attention Mechanism for Time Series Forecasting | Code | 1
Entanglement-induced provable and robust quantum learning advantages | Code | 1
CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios | Code | 1
Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval | Code | 1
MDMP: Multi-modal Diffusion for supervised Motion Predictions with uncertainty | Code | 1
Geometric Representation Condition Improves Equivariant Molecule Generation | Code | 1
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding | Code | 1
Variational Language Concepts for Interpreting Foundation Language Models | Code | 1
Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport | Code | 1
Cayley Graph Propagation | Code | 1
Human-aligned Chess with a Bit of Search | Code | 1
Manikin-Recorded Cardiopulmonary Sounds Dataset Using Digital Stethoscope | Code | 1
Test-time Adaptation for Regression by Subspace Alignment | Code | 1
Aligning LLMs with Individual Preferences via Interaction | Code | 1
TrustEMG-Net: Using Representation-Masking Transformer with U-Net for Surface Electromyography Enhancement | Code | 1
Tadashi: Enabling AI-Based Automated Code Generation With Guaranteed Correctness | Code | 1
Gradient-based Jailbreak Images for Multimodal Fusion Models | Code | 1
Learning Code Preference via Synthetic Evolution | Code | 1
Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features | Code | 1
Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach | Code | 1
Predictive Coding for Decision Transformer | Code | 1
Can Watermarked LLMs be Identified by Users via Crafted Prompts? | Code | 1
Variational Bayes Gaussian Splatting | Code | 1
Diffusion State-Guided Projected Gradient for Inverse Problems | Code | 1
RFBoost: Understanding and Boosting Deep WiFi Sensing via Physical Data Augmentation | Code | 1
Mitigating Adversarial Perturbations for Deep Reinforcement Learning via Vector Quantization | Code | 1
Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models | Code | 1
Page 415 of 9486