SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9101-9125 of 474278 papers

Title | Status | Hype
PnLCalib: Sports Field Registration via Points and Lines Optimization | Code | 2
OmniSat: Self-Supervised Modality Fusion for Earth Observation | Code | 2
LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning | Code | 2
Learning representations of learning representations | Code | 2
MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion | Code | 2
Under pressure: learning-based analog gauge reading in the wild | Code | 2
LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation | Code | 2
Inheritune: Training Smaller Yet More Attentive Language Models | Code | 2
GoMVS: Geometrically Consistent Cost Aggregation for Multi-View Stereo | Code | 2
Can Vehicle Motion Planning Generalize to Realistic Long-tail Scenarios? | Code | 2
QuasiSim: Parameterized Quasi-Physical Simulators for Dexterous Manipulations Transfer | Code | 2
Manipulating Large Language Models to Increase Product Visibility | Code | 2
DesignQA: A Multimodal Benchmark for Evaluating Large Language Models' Understanding of Engineering Documentation | Code | 2
LLoCO: Learning Long Contexts Offline | Code | 2
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples | Code | 2
SFSORT: Scene Features-based Simple Online Real-Time Tracker | Code | 2
AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs | Code | 2
PRAM: Place Recognition Anywhere Model for Efficient Visual Localization | Code | 2
Heron-Bench: A Benchmark for Evaluating Vision Language Models in Japanese | Code | 2
Behavior Trees Enable Structured Programming of Language Model Agents | Code | 2
Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval | Code | 2
Classifier-guided neural blind deconvolution: a physics-informed denoising module for bearing fault diagnosis under heavy noise | Code | 2
Latent Guard: a Safety Framework for Text-to-image Generation | Code | 2
HGRN2: Gated Linear RNNs with State Expansion | Code | 2
Content-Adaptive Non-Local Convolution for Remote Sensing Pansharpening | Code | 2
Page 365 of 18972