SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1401-1450 of 659,983 papers

Title | Status | Hype
Knowledge Fusion of Large Language Models | Code | 4
TALENT: A Tabular Analytics and Learning Toolbox | Code | 4
Osprey: Pixel Understanding with Visual Instruction Tuning | Code | 4
Let's Verify Step by Step | Code | 4
Agent-as-a-Judge: Evaluate Agents with Agents | Code | 4
TUMTraf V2X Cooperative Perception Dataset | Code | 4
Attention on the Sphere | Code | 4
Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise | Code | 4
GaussianFormer-2: Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction | Code | 4
Vision-Language Models for Vision Tasks: A Survey | Code | 4
A Survey on Visual Mamba | Code | 4
End-to-end Autonomous Driving: Challenges and Frontiers | Code | 4
TensoRF: Tensorial Radiance Fields | Code | 4
A Convergent Single-Loop Algorithm for Relaxation of Gromov-Wasserstein in Graph Data | Code | 4
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks | Code | 4
Generating Structured Outputs from Language Models: Benchmark and Studies | Code | 4
Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation | Code | 4
Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis | Code | 4
Timer-XL: Long-Context Transformers for Unified Time Series Forecasting | Code | 4
TRUE: Re-evaluating Factual Consistency Evaluation | Code | 4
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection | Code | 4
MedSAM2: Segment Anything in 3D Medical Images and Videos | Code | 4
DepthFM: Fast Monocular Depth Estimation with Flow Matching | Code | 4
Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Code | 4
Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents | Code | 4
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge | Code | 4
JAX-Fluids 2.0: Towards HPC for Differentiable CFD of Compressible Two-phase Flows | Code | 4
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities | Code | 4
Link and code: Fast indexing with graphs and compact regression codes | Code | 4
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | Code | 4
Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content? | Code | 4
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs | Code | 4
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective | Code | 4
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising | Code | 4
LLaMA Pro: Progressive LLaMA with Block Expansion | Code | 4
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Code | 4
Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence | Code | 4
OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving | Code | 4
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO | Code | 4
SAMPart3D: Segment Any Part in 3D Objects | Code | 4
Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography | Code | 4
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey | Code | 4
RGBD GS-ICP SLAM | Code | 4
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models | Code | 4
Exploring the Capabilities of Large Multimodal Models on Dense Text | Code | 4
CameraCtrl: Enabling Camera Control for Text-to-Video Generation | Code | 4
SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models | Code | 4
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers | Code | 4
Data quality dimensions for fair AI | Code | 4
AnyText: Multilingual Visual Text Generation And Editing | Code | 4