SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2101–2150 of 177339 papers

Title | Status | Hype
Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese | Code | 4
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds | Code | 4
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | Code | 4
Gender Representation in TV and Radio: Automatic Information Extraction methods versus Manual Analyses | Code | 4
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision | Code | 4
NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors | Code | 4
RealisDance-DiT: Simple yet Strong Baseline towards Controllable Character Animation in the Wild | Code | 4
COS-Mix: Cosine Similarity and Distance Fusion for Improved Information Retrieval | Code | 4
UniScene: Unified Occupancy-centric Driving Scene Generation | Code | 4
Motion-X++: A Large-Scale Multimodal 3D Whole-body Human Motion Dataset | Code | 4
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction | Code | 4
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos | Code | 4
When Does Perceptual Alignment Benefit Vision Representations? | Code | 4
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI | Code | 4
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models | Code | 4
A foundation model for human-AI collaboration in medical literature mining | Code | 4
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation | Code | 4
PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology | Code | 4
FFCV: Accelerating Training by Removing Data Bottlenecks | Code | 4
Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs | Code | 4
Building a Culture of Reproducibility in Academic Research | Code | 4
A deep learning framework for efficient pathology image analysis | Code | 4
Story-Adapter: A Training-free Iterative Framework for Long Story Visualization | Code | 4
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention | Code | 4
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution | Code | 4
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation | Code | 4
CitationMap: A Python Tool to Identify and Visualize Your Google Scholar Citations Around the World | Code | 4
Real-time volumetric rendering of dynamic humans | Code | 4
Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces | Code | 4
DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection | Code | 4
Inductive Moment Matching | Code | 4
Polysemous codes | Code | 4
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? | Code | 4
RUMI: Rummaging Using Mutual Information | Code | 4
ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks | Code | 4
A General Theoretical Paradigm to Understand Learning from Human Preferences | Code | 4
Self-Supervised Geometry-Guided Initialization for Robust Monocular Visual Odometry | Code | 4
MUSE: Machine Unlearning Six-Way Evaluation for Language Models | Code | 4
Stock Price Prediction via Discovering Multi-Frequency Trading Patterns | Code | 4
The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence | Code | 4
Fast Transformer Decoding: One Write-Head is All You Need | Code | 4
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data | Code | 4
DisCo-DSO: Coupling Discrete and Continuous Optimization for Efficient Generative Design in Hybrid Spaces | Code | 4
Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms | Code | 4
Tiny-PULP-Dronets: Squeezing Neural Networks for Faster and Lighter Inference on Multi-Tasking Autonomous Nano-Drones | Code | 4
ReARTeR: Retrieval-Augmented Reasoning with Trustworthy Process Rewarding | Code | 4
PointVLA: Injecting the 3D World into Vision-Language-Action Models | Code | 4
ViViD: Video Virtual Try-on using Diffusion Models | Code | 4
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image | Code | 4
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis | Code | 4
Page 43 of 3547