SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2,776–2,800 of 661,570 papers

Title | Status | Hype
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks | Code | 3
SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More | Code | 3
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning | Code | 3
StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting | Code | 3
GaMeS: Mesh-Based Adapting and Modification of Gaussian Splatting | Code | 3
REPLUG: Retrieval-Augmented Black-Box Language Models | Code | 3
Query-Based Adversarial Prompt Generation | Code | 3
GRAG: Graph Retrieval-Augmented Generation | Code | 3
Conformer: Convolution-augmented Transformer for Speech Recognition | Code | 3
Producing and Leveraging Online Map Uncertainty in Trajectory Prediction | Code | 3
Efficient Inference for Large Reasoning Models: A Survey | Code | 3
LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases | Code | 3
CycleNet: Enhancing Time Series Forecasting through Modeling Periodic Patterns | Code | 3
RF-Diffusion: Radio Signal Generation via Time-Frequency Diffusion | Code | 3
DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation | Code | 3
EXP-Bench: Can AI Conduct AI Research Experiments? | Code | 3
CompSLAM: Complementary Hierarchical Multi-Modal Localization and Mapping for Robot Autonomy in Underground Environments | Code | 3
Neural Ordinary Differential Equations | Code | 3
LEADS: Lightweight Embedded Assisted Driving System | Code | 3
Fine-Tuning Language Models with Just Forward Passes | Code | 3
USB: A Unified Semi-supervised Learning Benchmark for Classification | Code | 3
ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks | Code | 3
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning | Code | 3
The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities | Code | 3
The Role of Generative Systems in Historical Photography Management: A Case Study on Catalan Archives | Code | 3