SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3976-4000 of 661570 papers

Title | Status | Hype
BlackMamba: Mixture of Experts for State-Space Models | Code | 3
On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy | Code | 3
StopThePop: Sorted Gaussian Splatting for View-Consistent Real-time Rendering | Code | 3
PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks | Code | 3
Repeat After Me: Transformers are Better than State Space Models at Copying | Code | 3
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization | Code | 3
LongAlign: A Recipe for Long Context Alignment of Large Language Models | Code | 3
Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Code | 3
Common Sense Reasoning for Deepfake Detection | Code | 3
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models | Code | 3
MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images | Code | 3
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models | Code | 3
When Large Language Models Meet Vector Databases: A Survey | Code | 3
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models | Code | 3
Corrective Retrieval Augmented Generation | Code | 3
DeFlow: Decoder of Scene Flow Network in Autonomous Driving | Code | 3
StableIdentity: Inserting Anybody into Anywhere at First Sight | Code | 3
FengWu-GHR: Learning the Kilometer-scale Medium-range Global Weather Forecasting | Code | 3
BrepGen: A B-rep Generative Diffusion Model with Structured Latent Geometry | Code | 3
A Practical Probabilistic Benchmark for AI Weather Models | Code | 3
MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries | Code | 3
Scientific Large Language Models: A Survey on Biological & Chemical Domains | Code | 3
SliceGPT: Compress Large Language Models by Deleting Rows and Columns | Code | 3
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Code | 3
pix2gestalt: Amodal Segmentation by Synthesizing Wholes | Code | 3
Page 160 of 26463