SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 12701-12750 of 177340 papers

Title | Status | Hype
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models | Code | 2
Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey | Code | 2
On the representation and methodology for wide and short range head pose estimation | Code | 2
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents | Code | 2
Low Latency Point Cloud Rendering with Learned Splatting | Code | 2
LLM-based SPARQL Query Generation from Natural Language over Federated Knowledge Graphs | Code | 2
Understanding the Tricks of Deep Learning in Medical Image Segmentation: Challenges and Future Directions | Code | 2
Rethinking Test-time Likelihood: The Likelihood Path Principle and Its Application to OOD Detection | Code | 2
DaD: Distilled Reinforcement Learning for Diverse Keypoint Detection | Code | 2
ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields | Code | 2
SMT 2.0: A Surrogate Modeling Toolbox with a focus on Hierarchical and Mixed Variables Gaussian Processes | Code | 2
Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries | Code | 2
CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model | Code | 2
CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting Mitigation | Code | 2
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis | Code | 2
CT2Rep: Automated Radiology Report Generation for 3D Medical Imaging | Code | 2
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation | Code | 2
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning | Code | 2
Modelling Non-Smooth Signals with Complex Spectral Structure | Code | 2
RapFlow-TTS: Rapid and High-Fidelity Text-to-Speech with Improved Consistency Flow Matching | Code | 2
NeuralPLexer3: Accurate Biomolecular Complex Structure Prediction with Flow Models | Code | 2
Salient Object-Aware Background Generation using Text-Guided Diffusion Models | Code | 2
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback | Code | 2
Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models | Code | 2
The Super Weight in Large Language Models | Code | 2
ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids | Code | 2
Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation | Code | 2
When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition | Code | 2
Towards Generalizable Scene Change Detection | Code | 2
CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions | Code | 2
WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation | Code | 2
PointGPT: Auto-regressively Generative Pre-training from Point Clouds | Code | 2
Decentralization and Acceleration Enables Large-Scale Bundle Adjustment | Code | 2
Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness | Code | 2
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation | Code | 2
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems | Code | 2
SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising | Code | 2
Large Scale Longitudinal Experiments: Estimation and Inference | Code | 2
Image Referenced Sketch Colorization Based on Animation Creation Workflow | Code | 2
PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval | Code | 2
Learning Harmonized Representations for Speculative Sampling | Code | 2
MobileVLM: A Vision-Language Model for Better Intra- and Inter-UI Understanding | Code | 2
MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding | Code | 2
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models | Code | 2
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models | Code | 2
Mathematical Introduction to Deep Learning: Methods, Implementations, and Theory | Code | 2
Adaptive Probabilistic ODE Solvers Without Adaptive Memory Requirements | Code | 2
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts | Code | 2
Enhancing Vectorized Map Perception with Historical Rasterized Maps | Code | 2
RoboBERT: An End-to-end Multimodal Robotic Manipulation Model | Code | 2
Page 255 of 3547