SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3151-3200 of 177340 papers

Title | Status | Hype
Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration | Code | 3
AgentTuning: Enabling Generalized Agent Abilities for LLMs | Code | 3
Hawk: Learning to Understand Open-World Video Anomalies | Code | 3
PhoWhisper: Automatic Speech Recognition for Vietnamese | Code | 3
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs | Code | 3
How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model | Code | 3
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning | Code | 3
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection | Code | 3
Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs | Code | 3
DRCT: Saving Image Super-resolution away from Information Bottleneck | Code | 3
TopoX: A Suite of Python Packages for Machine Learning on Topological Domains | Code | 3
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia | Code | 3
Emu3: Next-Token Prediction is All You Need | Code | 3
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving | Code | 3
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion | Code | 3
MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries | Code | 3
NerfAcc: A General NeRF Acceleration Toolbox | Code | 3
Llemma: An Open Language Model For Mathematics | Code | 3
Datasets: A Community Library for Natural Language Processing | Code | 3
Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction | Code | 3
ResNeSt: Split-Attention Networks | Code | 3
MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer | Code | 3
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus | Code | 3
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs | Code | 3
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory | Code | 3
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs | Code | 3
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Code | 3
Inferring Articulated Rigid Body Dynamics from RGBD Video | Code | 3
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension | Code | 3
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | Code | 3
Neural Network Verification with Branch-and-Bound for General Nonlinearities | Code | 3
AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation | Code | 3
DrivAerNet: A Parametric Car Dataset for Data-Driven Aerodynamic Design and Prediction | Code | 3
Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection | Code | 3
Diffusion Model-Based Video Editing: A Survey | Code | 3
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer | Code | 3
BoT-SORT: Robust Associations Multi-Pedestrian Tracking | Code | 3
TopoBench: A Framework for Benchmarking Topological Deep Learning | Code | 3
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation | Code | 3
Impact of architecture on robustness and interpretability of multispectral deep neural networks | Code | 3
Are Language Models Actually Useful for Time Series Forecasting? | Code | 3
PDEBENCH: An Extensive Benchmark for Scientific Machine Learning | Code | 3
Activating More Pixels in Image Super-Resolution Transformer | Code | 3
The First Competition on Resource-Limited Infrared Small Target Detection Challenge: Methods and Results | Code | 3
ELIZA Reanimated: The world's first chatbot restored on the world's first time sharing system | Code | 3
The Manga Whisperer: Automatically Generating Transcriptions for Comics | Code | 3
Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis | Code | 3
Speedy-Splat: Fast 3D Gaussian Splatting with Sparse Pixels and Sparse Primitives | Code | 3
Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation | Code | 3
Deep Neural Networks for Rank-Consistent Ordinal Regression Based On Conditional Probabilities | Code | 3