SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 3951-4000 of 661,570 papers

Title | Status | Hype
EscherNet: A Generative Model for Scalable View Synthesis | Code | 3
Self-Discover: Large Language Models Self-Compose Reasoning Structures | Code | 3
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations | Code | 3
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | Code | 3
V-IRL: Grounding Virtual Intelligence in Real Life | Code | 3
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning | Code | 3
Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining | Code | 3
SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM | Code | 3
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache | Code | 3
Neural networks for abstraction and reasoning: Towards broad generalization in machines | Code | 3
Pathformer: Multi-scale Transformers with Adaptive Pathways for Time Series Forecasting | Code | 3
A Survey of Large Language Models in Finance (FinLLMs) | Code | 3
AutoTimes: Autoregressive Time Series Forecasters via Large Language Models | Code | 3
Transolver: A Fast Transformer Solver for PDEs on General Geometries | Code | 3
SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving | Code | 3
TopoX: A Suite of Python Packages for Machine Learning on Topological Domains | Code | 3
Position: Graph Foundation Models are Already Here | Code | 3
PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models | Code | 3
cmaes : A Simple yet Practical Python Library for CMA-ES | Code | 3
GaMeS: Mesh-Based Adapting and Modification of Gaussian Splatting | Code | 3
TravelPlanner: A Benchmark for Real-World Planning with Language Agents | Code | 3
ReEvo: Large Language Models as Hyper-Heuristics with Reflective Evolution | Code | 3
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data | Code | 3
BlackMamba: Mixture of Experts for State-Space Models | Code | 3
Safety of Multimodal Large Language Models on Images and Texts | Code | 3
StopThePop: Sorted Gaussian Splatting for View-Consistent Real-time Rendering | Code | 3
On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy | Code | 3
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces | Code | 3
PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks | Code | 3
Repeat After Me: Transformers are Better than State Space Models at Copying | Code | 3
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization | Code | 3
LongAlign: A Recipe for Long Context Alignment of Large Language Models | Code | 3
Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Code | 3
Common Sense Reasoning for Deepfake Detection | Code | 3
MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images | Code | 3
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models | Code | 3
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models | Code | 3
When Large Language Models Meet Vector Databases: A Survey | Code | 3
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models | Code | 3
Corrective Retrieval Augmented Generation | Code | 3
StableIdentity: Inserting Anybody into Anywhere at First Sight | Code | 3
DeFlow: Decoder of Scene Flow Network in Autonomous Driving | Code | 3
BrepGen: A B-rep Generative Diffusion Model with Structured Latent Geometry | Code | 3
FengWu-GHR: Learning the Kilometer-scale Medium-range Global Weather Forecasting | Code | 3
MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries | Code | 3
A Practical Probabilistic Benchmark for AI Weather Models | Code | 3
Scientific Large Language Models: A Survey on Biological & Chemical Domains | Code | 3
SliceGPT: Compress Large Language Models by Deleting Rows and Columns | Code | 3
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Code | 3
pix2gestalt: Amodal Segmentation by Synthesizing Wholes | Code | 3
Page 80 of 13,232