SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 4001-4025 of 661,570 papers

Title | Status | Hype
An Extensible Framework for Open Heterogeneous Collaborative Perception | Code | 3
MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache | Code | 3
pix2gestalt: Amodal Segmentation by Synthesizing Wholes | Code | 3
Wordflow: Social Prompt Engineering for Large Language Models | Code | 3
Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI | Code | 3
VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks | Code | 3
AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents | Code | 3
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment | Code | 3
Lumiere: A Space-Time Diffusion Model for Video Generation | Code | 3
Benchmarking LLMs via Uncertainty Quantification | Code | 3
A Vision-Language Foundation Model to Enhance Efficiency of Chest X-ray Interpretation | Code | 3
In-Context Learning for Extreme Multi-Label Classification | Code | 3
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text | Code | 3
MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo | Code | 3
Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey on Hybrid Algorithms | Code | 3
RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Code | 3
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer | Code | 3
The Manga Whisperer: Automatically Generating Transcriptions for Comics | Code | 3
Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models | Code | 3
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents | Code | 3
GARField: Group Anything with Radiance Fields | Code | 3
RoHM: Robust Human Motion Reconstruction via Diffusion | Code | 3
ModernTCN: A Modern Pure Convolution Structure for General Time Series Analysis | Code | 3
Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models | Code | 3
A Survey of Resource-efficient LLM and Multimodal Foundation Models | Code | 3