SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2401 to 2450 of 659,983 papers

Title | Status | Hype
DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation | - | 3
LLaDA2.1: Speeding Up Text Diffusion via Token Editing | - | 3
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking | - | 3
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion | Code | 3
MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries | Code | 3
NerfAcc: A General NeRF Acceleration Toolbox | Code | 3
Llemma: An Open Language Model For Mathematics | Code | 3
Datasets: A Community Library for Natural Language Processing | Code | 3
Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction | Code | 3
ResNeSt: Split-Attention Networks | Code | 3
MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer | Code | 3
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus | Code | 3
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs | Code | 3
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory | Code | 3
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs | Code | 3
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Code | 3
Inferring Articulated Rigid Body Dynamics from RGBD Video | Code | 3
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension | Code | 3
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | Code | 3
Neural Network Verification with Branch-and-Bound for General Nonlinearities | Code | 3
AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation | Code | 3
DrivAerNet: A Parametric Car Dataset for Data-Driven Aerodynamic Design and Prediction | Code | 3
Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection | Code | 3
Diffusion Model-Based Video Editing: A Survey | Code | 3
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer | Code | 3
BoT-SORT: Robust Associations Multi-Pedestrian Tracking | Code | 3
TopoBench: A Framework for Benchmarking Topological Deep Learning | Code | 3
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation | Code | 3
Impact of architecture on robustness and interpretability of multispectral deep neural networks | Code | 3
Are Language Models Actually Useful for Time Series Forecasting? | Code | 3
PDEBENCH: An Extensive Benchmark for Scientific Machine Learning | Code | 3
Activating More Pixels in Image Super-Resolution Transformer | Code | 3
The First Competition on Resource-Limited Infrared Small Target Detection Challenge: Methods and Results | Code | 3
ELIZA Reanimated: The world's first chatbot restored on the world's first time sharing system | Code | 3
The Manga Whisperer: Automatically Generating Transcriptions for Comics | Code | 3
Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis | Code | 3
Speedy-Splat: Fast 3D Gaussian Splatting with Sparse Pixels and Sparse Primitives | Code | 3
Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation | Code | 3
Deep Neural Networks for Rank-Consistent Ordinal Regression Based On Conditional Probabilities | Code | 3
Channel Permutations for N:M Sparsity | Code | 3
PP-MSVSR: Multi-Stage Video Super-Resolution | Code | 3
QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning | Code | 3
Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer | Code | 3
Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation | Code | 3
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia | Code | 3
Deep Learning for Trajectory Data Management and Mining: A Survey and Beyond | Code | 3
DeepCAVE: An Interactive Analysis Tool for Automated Machine Learning | Code | 3
Plotly-Resampler: Effective Visual Analytics for Large Time Series | Code | 3
MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making | Code | 3
The Common Core Ontologies | Code | 3