SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1751–1800 of 659,983 papers

Title | Status | Hype
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment | Code | 4
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning | Code | 4
Self-Play Preference Optimization for Language Model Alignment | Code | 4
RAPIDFlow: Recurrent Adaptable Pyramids with Iterative Decoding for Efficient Optical Flow Estimation | Code | 4
Visual Mamba: A Survey and New Outlooks | Code | 4
A Survey on Diffusion Models for Time Series and Spatio-Temporal Data | Code | 4
Hallucination of Multimodal Large Language Models: A Survey | Code | 4
Mamba-FETrack: Frame-Event Tracking via State Space Model | Code | 4
MovieChat+: Question-aware Sparse Memory for Long Video Question Answering | Code | 4
PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning | Code | 4
Continual Learning of Large Language Models: A Comprehensive Survey | Code | 4
A Survey on Visual Mamba | Code | 4
Autonomous LLM-driven research from data to human-verifiable research papers | Code | 4
FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent | Code | 4
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation | Code | 4
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models | Code | 4
StyleBooth: Image Style Editing with Multimodal Instruction | Code | 4
AgentKit: Structured LLM Reasoning with Dynamic Graphs | Code | 4
State Space Model for New-Generation Network Alternative to Transformers: A Survey | Code | 4
Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models | Code | 4
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length | Code | 4
JetMoE: Reaching Llama2 Performance with 0.1M Dollars | Code | 4
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback | Code | 4
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models | Code | 4
A Foundation Model for Zero-shot Logical Query Reasoning | Code | 4
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | Code | 4
FLEX: FLEXible Federated Learning Framework | Code | 4
Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences | Code | 4
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation | Code | 4
Sailor: Open Language Models for South-East Asia | Code | 4
ChangeMamba: Remote Sensing Change Detection With Spatiotemporal State Space Model | Code | 4
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens | Code | 4
AutoWebGLM: A Large Language Model-based Web Navigating Agent | Code | 4
The largest EEG-based BCI reproducibility study for open science: the MOABB benchmark | Code | 4
Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization | Code | 4
CameraCtrl: Enabling Camera Control for Text-to-Video Generation | Code | 4
A Survey on Large Language Model-Based Game Agents | Code | 4
SnAG: Scalable and Accurate Video Grounding | Code | 4
PyTorch Frame: A Modular Framework for Multi-Modal Tabular Learning | Code | 4
End-to-End Autonomous Driving through V2X Cooperation | Code | 4
Croissant: A Metadata Format for ML-Ready Datasets | Code | 4
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models | Code | 4
Tiny Machine Learning: Progress and Futures | Code | 4
Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models | Code | 4
Long-form factuality in large language models | Code | 4
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text | Code | 4
TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos | Code | 4
Deepfake Generation and Detection: A Benchmark and Survey | Code | 4
Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians | Code | 4
DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing | Code | 4