SOTAVerified

Benchmarking

Papers

Showing 1101–1125 of 5548 papers

Title | Status | Hype
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care | Code | 1
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots | Code | 1
Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit | Code | 1
Structural Bias for Aspect Sentiment Triplet Extraction | Code | 1
nnOOD: A Framework for Benchmarking Self-supervised Anomaly Localisation Methods | Code | 1
Benchmarking Compositionality with Formal Languages | Code | 1
A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Code | 1
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods | Code | 1
Accelerated and interpretable oblique random survival forests | Code | 1
Tracking Every Thing in the Wild | Code | 1
ArtFID: Quantitative Evaluation of Neural Style Transfer | Code | 1
Physiology-based simulation of the retinal vasculature enables annotation-free segmentation of OCT angiographs | Code | 1
Detecting beats in the photoplethysmogram: benchmarking open-source algorithms | Code | 1
ALTO: A Large-Scale Dataset for UAV Visual Place Recognition and Localization | Code | 1
Initial recommendations for performing, benchmarking, and reporting single-cell proteomics experiments | Code | 1
Benchmarking Omni-Vision Representation through the Lens of Visual Realms | Code | 1
TASKOGRAPHY: Evaluating robot task planning over large 3D scene graphs | Code | 1
Graph Generative Model for Benchmarking Graph Neural Networks | Code | 1
Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk | Code | 1
Less Is More: A Comparison of Active Learning Strategies for 3D Medical Image Segmentation | Code | 1
DFGC 2022: The Second DeepFake Game Competition | Code | 1
Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology | Code | 1
Summarizing Videos using Concentrated Attention and Considering the Uniqueness and Diversity of the Video Frames | Code | 1
Beyond neural scaling laws: beating power law scaling via data pruning | Code | 1
The DEBS 2022 Grand Challenge: Detecting Trading Trends in Financial Tick Data | Code | 1
Page 45 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified