SOTAVerified

Benchmarking

Papers

Showing 3411-3420 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Ensuring Reliability of Curated EHR-Derived Data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework | | 0 |
| Benchmarking Middle-Trained Language Models for Neural Search | | 0 |
| Logically at Factify 2: A Multi-Modal Fact Checking System Based on Evidence Retrieval techniques and Transformer Encoder Architecture | | 0 |
| Logically at Factify 2022: Multimodal Fact Verification | | 0 |
| Toward an ImageNet Library of Functions for Global Optimization Benchmarking | | 0 |
| Benchmarking Meta-heuristic Optimization | | 0 |
| Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models | | 0 |
| Toward end-to-end interpretable convolutional neural networks for waveform signals | | 0 |
| Benchmarking MedMNIST dataset on real quantum hardware | | 0 |
| Benchmarking Machine Translated Sentiment Analysis for Arabic Tweets | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |