SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 19801–19850 of 474278 papers

Title | Status | Hype
Proving Test Set Contamination in Black Box Language Models | Code | 1
You Only Need Two Detectors to Achieve Multi-Modal 3D Multi-Object Tracking | Code | 1
Selective Fairness in Recommendation via Prompts | Code | 1
A Versatile Multi-Agent Reinforcement Learning Benchmark for Inventory Management | Code | 1
ComStreamClust: a communicative multi-agent approach to text clustering in streaming data | Code | 1
A Multi-modal Garden Dataset and Hybrid 3D Dense Reconstruction Framework Based on Panoramic Stereo Images for a Trimming Robot | Code | 1
Singing Voice Synthesis Using Differentiable LPC and Glottal-Flow-Inspired Wavetables | Code | 1
An Autotuning-based Optimization Framework for Mixed-kernel SVM Classifications in Smart Pixel Datasets and Heterojunction Transistors | Code | 1
Generalized One-shot Domain Adaptation of Generative Adversarial Networks | Code | 1
Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection | Code | 1
Automated Clinical Coding: What, Why, and Where We Are? | Code | 1
Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning | Code | 1
Enhanced Short Text Modeling: Leveraging Large Language Models for Topic Refinement | Code | 1
MAAT: Mamba Adaptive Anomaly Transformer with association discrepancy for time series | Code | 1
Cloning Outfits from Real-World Images to 3D Characters for Generalizable Person Re-Identification | Code | 1
Patcher: Patch Transformers with Mixture of Experts for Precise Medical Image Segmentation | Code | 1
Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning | Code | 1
Block Coordinate Descent for Sparse NMF | Code | 1
Noise-powered Multi-modal Knowledge Graph Representation Framework | Code | 1
Interpretable Generative Models through Post-hoc Concept Bottlenecks | Code | 1
MedMNIST-C: Comprehensive benchmark and improved classifier robustness by simulating realistic image corruptions | Code | 1
Think Step by Step: Chain-of-Gesture Prompting for Error Detection in Robotic Surgical Videos | Code | 1
MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration | Code | 1
Explainable Time Series Anomaly Detection using Masked Latent Generative Modeling | Code | 1
DeepVARwT: Deep Learning for a VAR Model with Trend | Code | 1
Stripformer: Strip Transformer for Fast Image Deblurring | Code | 1
EDA: Evolving and Distinct Anchors for Multimodal Motion Prediction | Code | 1
Quantum approximate optimization via learning-based adaptive optimization | Code | 1
Benchmarking LLMs for Political Science: A United Nations Perspective | Code | 1
Continual Learning in Medical Imaging: A Survey and Practical Analysis | Code | 1
SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation | Code | 1
Are Deep Neural Networks SMARTer than Second Graders? | Code | 1
SAMCT: Segment Any CT Allowing Labor-Free Task-Indicator Prompts | Code | 1
Robust Point Cloud Registration Framework Based on Deep Graph Matching | Code | 1
Semi-MoreGAN: A New Semi-supervised Generative Adversarial Network for Mixture of Rain Removal | Code | 1
On Robust Prefix-Tuning for Text Classification | Code | 1
Object-Centric Slot Diffusion | Code | 1
Unveiling Transformers with LEGO: a synthetic reasoning task | Code | 1
Lifelong Learning on Evolving Graphs Under the Constraints of Imbalanced Classes and New Classes | Code | 1
DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering | Code | 1
SynthSeg: Segmentation of brain MRI scans of any contrast and resolution without retraining | Code | 1
A Graph-Based Modeling Framework for Tracing Hydrological Pollutant Transport in Surface Waters | Code | 1
Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation | Code | 1
Efficient Wasserstein Natural Gradients for Reinforcement Learning | Code | 1
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation | Code | 1
Approximating Two-Layer Feedforward Networks for Efficient Transformers | Code | 1
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment | Code | 1
Federated Foundation Models on Heterogeneous Time Series | Code | 1
Parallel AutoRegressive Models for Multi-Agent Combinatorial Optimization | Code | 1
Towards the Practical Utility of Federated Learning in the Medical Domain | Code | 1