SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6551–6600 of 661,570 papers

Title | Status | Hype
SensorLLM: Human-Intuitive Alignment of Multivariate Sensor Data with LLMs for Activity Recognition | Code | 2
MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration | Code | 2
Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction | Code | 2
Human Pose as Compositional Tokens | Code | 2
Dense Distinct Query for End-to-End Object Detection | Code | 2
Deduplicating Training Data Makes Language Models Better | Code | 2
Approximate Convex Decomposition for 3D Meshes with Collision-Aware Concavity and Tree Search | Code | 2
Autonomous GIS: the next-generation AI-powered GIS | Code | 2
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | Code | 2
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data | Code | 2
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation | Code | 2
Graph Neural Network Surrogates to leverage Mechanistic Expert Knowledge towards Reliable and Immediate Pandemic Response | Code | 2
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis | Code | 2
LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching | Code | 2
ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning | Code | 2
PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation | Code | 2
SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model | Code | 2
Bracketing Image Restoration and Enhancement with High-Low Frequency Decomposition | Code | 2
LLM4EDA: Emerging Progress in Large Language Models for Electronic Design Automation | Code | 2
Overview of the PromptCBLUE Shared Task in CHIP2023 | Code | 2
DebugBench: Evaluating Debugging Capability of Large Language Models | Code | 2
SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning | Code | 2
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs | Code | 2
PMFSNet: Polarized Multi-scale Feature Self-attention Network For Lightweight Medical Image Segmentation | Code | 2
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion | Code | 2
VLFM: Vision-Language Frontier Maps for Zero-Shot Semantic Navigation | Code | 2
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft | Code | 2
An Efficient and Mixed Heterogeneous Model for Image Restoration | Code | 2
Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena | Code | 2
DreamLIP: Language-Image Pre-training with Long Captions | Code | 2
ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Code | 2
Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development | Code | 2
Unleashing the Power of Multi-Task Learning: A Comprehensive Survey Spanning Traditional, Deep, and Pretrained Foundation Model Eras | Code | 2
2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection | Code | 2
TeCH: Text-guided Reconstruction of Lifelike Clothed Humans | Code | 2
BMFM-RNA: An Open Framework for Building and Evaluating Transcriptomic Foundation Models | Code | 2
LeapVAD: A Leap in Autonomous Driving via Cognitive Perception and Dual-Process Thinking | Code | 2
Bottleneck Transformers for Visual Recognition | Code | 2
HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution | Code | 2
EHRMamba: Towards Generalizable and Scalable Foundation Models for Electronic Health Records | Code | 2
Multi-Modal Self-Supervised Learning for Recommendation | Code | 2
Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration | Code | 2
MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding | Code | 2
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation | Code | 2
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression | Code | 2
iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement | Code | 2
DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting | Code | 2
Slice-Consistent 3D Volumetric Brain CT-to-MRI Translation with 2D Brownian Bridge Diffusion Model | Code | 2
PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer | Code | 2
Exposing the Deception: Uncovering More Forgery Clues for Deepfake Detection | Code | 2