Benchmarking

Papers

Title	Date	Tasks	Status	Hype
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care	Sep 16, 2022	Benchmarking, Deep Learning	Code Available	1
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots	Sep 16, 2022	Benchmarking, Question Answering	Code Available	1
Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit	Sep 7, 2022	Benchmarking	Code Available	1
nnOOD: A Framework for Benchmarking Self-supervised Anomaly Localisation Methods	Sep 2, 2022	Anomaly Detection, Benchmarking	Code Available	1
Structural Bias for Aspect Sentiment Triplet Extraction	Sep 2, 2022	Aspect Sentiment Triplet Extraction, Benchmarking	Code Available	1
Benchmarking Compositionality with Formal Languages	Aug 17, 2022	Benchmarking, Open-Ended Question Answering	Code Available	1
A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models	Aug 2, 2022	Benchmarking, Synthetic Data Generation	Code Available	1
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods	Aug 2, 2022	Benchmarking, Causal Discovery	Code Available	1
Accelerated and interpretable oblique random survival forests	Aug 1, 2022	Benchmarking, Computational Efficiency	Code Available	1
Tracking Every Thing in the Wild	Jul 26, 2022	Benchmarking, Classification	Code Available	1
ArtFID: Quantitative Evaluation of Neural Style Transfer	Jul 25, 2022	Benchmarking, Meta-Learning	Code Available	1
Physiology-based simulation of the retinal vasculature enables annotation-free segmentation of OCT angiographs	Jul 22, 2022	Benchmarking, Retinal Vessel Segmentation	Code Available	1
ALTO: A Large-Scale Dataset for UAV Visual Place Recognition and Localization	Jul 19, 2022	Benchmarking, Image Registration	Code Available	1
Detecting beats in the photoplethysmogram: benchmarking open-source algorithms	Jul 19, 2022	Benchmarking, Photoplethysmography (PPG) beat detection	Code Available	1
Initial recommendations for performing, benchmarking, and reporting single-cell proteomics experiments	Jul 19, 2022	Benchmarking, Experimental Design	Code Available	1
Benchmarking Omni-Vision Representation through the Lens of Visual Realms	Jul 14, 2022	Benchmarking, Contrastive Learning	Code Available	1
TASKOGRAPHY: Evaluating robot task planning over large 3D scene graphs	Jul 11, 2022	Benchmarking, Representation Learning	Code Available	1
Graph Generative Model for Benchmarking Graph Neural Networks	Jul 10, 2022	Benchmarking, Graph Generation	Code Available	1
Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk	Jul 2, 2022	Benchmarking, Machine Translation	Code Available	1
Less Is More: A Comparison of Active Learning Strategies for 3D Medical Image Segmentation	Jul 2, 2022	Active Learning, Benchmarking	Code Available	1
DFGC 2022: The Second DeepFake Game Competition	Jun 30, 2022	Benchmarking, Face Swapping	Code Available	1
Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology	Jun 30, 2022	Benchmarking, Diagnostic	Code Available	1
Beyond neural scaling laws: beating power law scaling via data pruning	Jun 29, 2022	Benchmarking	Code Available	1
Summarizing Videos using Concentrated Attention and Considering the Uniqueness and Diversity of the Video Frames	Jun 29, 2022	Benchmarking, Diversity	Code Available	1
The DEBS 2022 Grand Challenge: Detecting Trading Trends in Financial Tick Data	Jun 23, 2022	Benchmarking	Code Available	1
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code	Jun 22, 2022	Benchmarking, Text Generation	Code Available	1
OpenXAI: Towards a Transparent Evaluation of Model Explanations	Jun 22, 2022	Benchmarking, Explainable Artificial Intelligence (XAI)	Code Available	1
Benchmarking Constraint Inference in Inverse Reinforcement Learning	Jun 20, 2022	Autonomous Driving, Benchmarking	Code Available	1
What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs	Jun 19, 2022	Benchmarking, Image Captioning	Code Available	1
NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search	Jun 18, 2022	Benchmarking, Graph Neural Network	Code Available	1
SMPL: Simulated Industrial Manufacturing and Process Control Learning Environments	Jun 17, 2022	Benchmarking, Deep Reinforcement Learning	Code Available	1
Long Range Graph Benchmark	Jun 16, 2022	Benchmarking, Graph Classification	Code Available	1
Taxonomy of Benchmarks in Graph Representation Learning	Jun 15, 2022	Benchmarking, Graph Representation Learning	Code Available	1
Evaluating histopathology transfer learning with ChampKit	Jun 14, 2022	Benchmarking, Cell Detection	Code Available	1
ISLES 2022: A multi-center magnetic resonance imaging stroke lesion segmentation dataset	Jun 14, 2022	Benchmarking, Ischemic Stroke Lesion Segmentation	Code Available	1
Data-Driven Denoising of Stationary Accelerometer Signals	Jun 13, 2022	Benchmarking, Denoising	Code Available	1
SwinCheX: Multi-label classification on chest X-ray images with transformers	Jun 9, 2022	Benchmarking, Multi-Label Classification	Code Available	1
Do We Need Another Explainable AI Method? Toward Unifying Post-hoc XAI Evaluation Methods into an Interactive and Multi-dimensional Benchmark	Jun 8, 2022	Benchmarking, Explainable Artificial Intelligence (XAI)	Code Available	1
Revisiting Realistic Test-Time Training: Sequential Inference and Adaptation by Anchored Clustering	Jun 6, 2022	Benchmarking, Clustering	Code Available	1
Revisiting the "Video" in Video-Language Understanding	Jun 3, 2022	Benchmarking, Question Answering	Code Available	1
Needle In A Haystack, Fast: Benchmarking Image Perceptual Similarity Metrics At Scale	Jun 1, 2022	Benchmarking	Code Available	1
Jojajovai: A Parallel Guarani-Spanish Corpus for MT Benchmarking	Jun 1, 2022	Benchmarking, Sentence	Code Available	1
A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain	Jun 1, 2022	Benchmarking, Emotion Recognition	Code Available	1
Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection	May 30, 2022	3D Object Detection, Autonomous Driving	Code Available	1
Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions	May 27, 2022	Benchmarking, Few-Shot Image Classification	Code Available	1
Failure Detection in Medical Image Classification: A Reality Check and Benchmarking Testbed	May 27, 2022	Benchmarking, Binary Classification	Code Available	1
MIMII DG: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection for Domain Generalization Task	May 27, 2022	Benchmarking, Domain Generalization	Code Available	1
GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles	May 25, 2022	Benchmarking, Event Argument Extraction	Code Available	1
Optimizing Performance of Federated Person Re-identification: Benchmarking and Analysis	May 24, 2022	Benchmarking, Federated Learning	Code Available	1
PyRelationAL: a python library for active learning research and development	May 23, 2022	Active Learning, Benchmarking	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified