SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 10376-10400 of 177340 papers

| Title | Status | Hype |
|---|---|---|
| Context is Key: A Benchmark for Forecasting with Essential Textual Information | Code | 2 |
| InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks | Code | 2 |
| A Survey on 3D Gaussian Splatting | Code | 2 |
| SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents | Code | 2 |
| Efficient Parallel Genetic Algorithm for Perturbed Substructure Optimization in Complex Network | Code | 2 |
| A Survey on Hardware Accelerators for Large Language Models | Code | 2 |
| PSAvatar: A Point-based Shape Model for Real-Time Head Avatar Animation with 3D Gaussian Splatting | Code | 2 |
| EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | Code | 2 |
| Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Code | 2 |
| Machine Unlearning of Pre-trained Large Language Models | Code | 2 |
| Segment Any Anomaly without Training via Hybrid Prompt Regularization | Code | 2 |
| Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation | Code | 2 |
| Advanced Millimeter-Wave Radar System for Real-Time Multiple-Human Tracking and Fall Detection | Code | 2 |
| DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training | Code | 2 |
| VideoSAGE: Video Summarization with Graph Representation Learning | Code | 2 |
| DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Services | Code | 2 |
| Free-T2M: Frequency Enhanced Text-to-Motion Diffusion Model With Consistency Loss | Code | 2 |
| Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation | Code | 2 |
| Magic-Boost: Boost 3D Generation with Multi-View Conditioned Diffusion | Code | 2 |
| Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation | Code | 2 |
| LaSagnA: Language-based Segmentation Assistant for Complex Queries | Code | 2 |
| Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training | Code | 2 |
| An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios | Code | 2 |
| Recipe for a General, Powerful, Scalable Graph Transformer | Code | 2 |
| WATT: Weight Average Test-Time Adaptation of CLIP | Code | 2 |
Page 416 of 7094