SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1376-1400 of 177339 papers

Title | Status | Hype
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection | Code | 4
MedSAM2: Segment Anything in 3D Medical Images and Videos | Code | 4
DepthFM: Fast Monocular Depth Estimation with Flow Matching | Code | 4
Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Code | 4
Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents | Code | 4
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge | Code | 4
JAX-Fluids 2.0: Towards HPC for Differentiable CFD of Compressible Two-phase Flows | Code | 4
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities | Code | 4
Link and code: Fast indexing with graphs and compact regression codes | Code | 4
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | Code | 4
Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content? | Code | 4
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs | Code | 4
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective | Code | 4
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising | Code | 4
LLaMA Pro: Progressive LLaMA with Block Expansion | Code | 4
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Code | 4
Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence | Code | 4
OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving | Code | 4
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO | Code | 4
SAMPart3D: Segment Any Part in 3D Objects | Code | 4
Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography | Code | 4
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey | Code | 4
RGBD GS-ICP SLAM | Code | 4
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models | Code | 4
Exploring the Capabilities of Large Multimodal Models on Dense Text | Code | 4