SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4051–4100 of 177,340 papers

Title | Status | Hype
Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search | Code | 3
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos | Code | 3
VISTA3D: Versatile Imaging SegmenTation and Annotation model for 3D Computed Tomography | Code | 3
Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey | Code | 3
STG-Mamba: Spatial-Temporal Graph Learning via Selective State Space Model | Code | 3
Rethinking Evaluation Metrics of Open-Vocabulary Segmentaion | Code | 3
Trajectory Consistency Distillation: Improved Latent Consistency Distillation by Semi-Linear Consistency Function with Trajectory Mapping | Code | 3
ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models | Code | 3
ACE2: Accurately learning subseasonal to decadal atmospheric variability and forced responses | Code | 3
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech | Code | 3
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents | Code | 3
Scaling Analysis of Interleaved Speech-Text Language Models | Code | 3
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities | Code | 3
GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III | Code | 3
Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models | Code | 3
EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark | Code | 3
A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks | Code | 3
PyThaiNLP: Thai Natural Language Processing in Python | Code | 3
FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes | Code | 3
A Survey of Large Language Models in Medicine: Progress, Application, and Challenge | Code | 3
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation | Code | 3
Rule Based Rewards for Language Model Safety | Code | 3
Hyper-parameter tuning for text guided image editing | Code | 3
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph | Code | 3
Efficient Large Language Models: A Survey | Code | 3
Navigating Eukaryotic Genome Annotation Pipelines: A Route Map to BRAKER, Galba, and TSEBRA | Code | 3
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World | Code | 3
SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series | Code | 3
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models | Code | 3
Arctic-Text2SQL-R1: Simple Rewards, Strong Reasoning in Text-to-SQL | Code | 3
DrivingForward: Feed-forward 3D Gaussian Splatting for Driving Scene Reconstruction from Flexible Surround-view Input | Code | 3
TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree | Code | 3
VidTwin: Video VAE with Decoupled Structure and Dynamics | Code | 3
Probabilistic Weather Forecasting with Hierarchical Graph Neural Networks | Code | 3
Dataset and Baseline System for Multi-lingual Extraction and Normalization of Temporal and Numerical Expressions | Code | 3
How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks? | Code | 3
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Code | 3
OctoPack: Instruction Tuning Code Large Language Models | Code | 3
Learning Smooth Humanoid Locomotion through Lipschitz-Constrained Policies | Code | 3
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape | Code | 3
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | Code | 3
On the use of deep learning for phase recovery | Code | 3
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models | Code | 3
NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer | Code | 3
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model | Code | 3
MAPIE: an open-source library for distribution-free uncertainty quantification | Code | 3
PhysX: Physical-Grounded 3D Asset Generation | Code | 3
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | Code | 3
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale | Code | 3
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages | Code | 3