SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8101-8150 of 661,570 papers

Title | Status | Hype
Law of Vision Representation in MLLMs | Code | 2
CLUE: A Chinese Language Understanding Evaluation Benchmark | Code | 2
MASS: Multi-Agent Simulation Scaling for Portfolio Construction | Code | 2
Towards High-Resolution 3D Anomaly Detection: A Scalable Dataset and Real-Time Framework for Subtle Industrial Defects | Code | 2
Massive Activations in Large Language Models | Code | 2
Natural and Robust Walking using Reinforcement Learning without Demonstrations in High-Dimensional Musculoskeletal Models | Code | 2
Feature Pyramid Networks for Object Detection | Code | 2
Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Code | 2
The Russian Legislative Corpus | Code | 2
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction | Code | 2
FisherRF: Active View Selection and Uncertainty Quantification for Radiance Fields using Fisher Information | Code | 2
Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding | Code | 2
Generative Pretraining from Pixels | Code | 2
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent) | Code | 2
Invertible Diffusion Models for Compressed Sensing | Code | 2
HASSOD: Hierarchical Adaptive Self-Supervised Object Detection | Code | 2
OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction | Code | 2
Toward Unified Practices in Trajectory Prediction Research on Bird's-Eye-View Datasets | Code | 2
Closing the Gap Between Synthetic and Ground Truth Time Series Distributions via Neural Mapping | Code | 2
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality | Code | 2
DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution | Code | 2
An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | Code | 2
EMOv2: Pushing 5M Vision Model Frontier | Code | 2
PSP-HDRI+: A Synthetic Dataset Generator for Pre-Training of Human-Centric Computer Vision Models | Code | 2
OpenBox: A Python Toolkit for Generalized Black-box Optimization | Code | 2
When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute | Code | 2
ICML 2023 Topological Deep Learning Challenge : Design and Results | Code | 2
Longhorn: State Space Models are Amortized Online Learners | Code | 2
CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer | Code | 2
A mmWave Software-Defined Array Platform for Wireless Experimentation at 24-29.5 GHz | Code | 2
Empirical Asset Pricing with Large Language Model Agents | Code | 2
The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements | Code | 2
DCoM: Active Learning for All Learners | Code | 2
Foundation Models for Remote Sensing and Earth Observation: A Survey | Code | 2
PMC-LLaMA: Towards Building Open-source Language Models for Medicine | Code | 2
SWE-bench Goes Live! | Code | 2
Uncertainty-Informed Deep Learning Models Enable High-Confidence Predictions for Digital Histopathology | Code | 2
Accelerated Policy Learning with Parallel Differentiable Simulation | Code | 2
SimVP: Simpler yet Better Video Prediction | Code | 2
Rethinking Imitation-based Planner for Autonomous Driving | Code | 2
Contrastive Flow Matching | Code | 2
Conformal prediction interval for dynamic time-series | Code | 2
ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale | Code | 2
video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models | Code | 2
InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition | Code | 2
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models | Code | 2
Investigating image-based fallow weed detection performance on Raphanus sativus and Avena sativa at speeds up to 30 km h^-1 | Code | 2
Training Socially Aligned Language Models on Simulated Social Interactions | Code | 2
Stabilizing Transformer Training by Preventing Attention Entropy Collapse | Code | 2
End-to-End Vectorized HD-map Construction with Piecewise Bezier Curve | Code | 2