SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9951-9975 of 474,278 papers

Title | Status | Hype
Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio | Code | 2
LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset | Code | 2
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference | Code | 2
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents | Code | 2
Generalized Portrait Quality Assessment | Code | 2
PC-NeRF: Parent-Child Neural Radiance Fields Using Sparse LiDAR Frames in Autonomous Driving Environments | Code | 2
YOLOv8-AM: YOLOv8 Based on Effective Attention Mechanisms for Pediatric Wrist Fracture Detection | Code | 2
Extreme Video Compression with Pre-trained Diffusion Models | Code | 2
Personalized Large Language Models | Code | 2
MultiMedEval: A Benchmark and a Toolkit for Evaluating Medical Vision-Language Models | Code | 2
Learning Emergent Gaits with Decentralized Phase Oscillators: on the role of Observations, Rewards, and Feedback | Code | 2
BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image Segmentation | Code | 2
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity | Code | 2
A Survey of Generative AI for de novo Drug Design: New Frontiers in Molecule and Protein Generation | Code | 2
RBF-PINN: Non-Fourier Positional Embedding in Physics-Informed Neural Networks | Code | 2
Learning Continuous 3D Words for Text-to-Image Generation | Code | 2
THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation | Code | 2
Transductive Active Learning: Theory and Applications | Code | 2
DNABERT-S: Pioneering Species Differentiation with Species-Aware DNA Embeddings | Code | 2
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast | Code | 2
Can LLMs Learn New Concepts Incrementally without Forgetting? | Code | 2
Higher Layers Need More LoRA Experts | Code | 2
ChatCell: Facilitating Single-Cell Analysis with Natural Language | Code | 2
LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents | Code | 2
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability | Code | 2