Benchmarking

Papers

Showing 1151–1200 of 5548 papers

Title	Date	Tasks	Status	Hype
Deep Learning-Based Synchronization for Uplink NB-IoT	May 22, 2022	Benchmarking, Deep Learning	Code Available	1
Oracle-MNIST: a Realistic Image Dataset for Benchmarking Machine Learning Algorithms	May 19, 2022	Benchmarking, BIG-bench Machine Learning	Code Available	1
The VoicePrivacy 2020 Challenge Evaluation Plan	May 14, 2022	Benchmarking	Code Available	1
Federated Learning Under Intermittent Client Availability and Time-Varying Communication Constraints	May 13, 2022	Benchmarking, Federated Learning	Code Available	1
Clinical Prompt Learning with Frozen Language Models	May 11, 2022	Benchmarking, GPU	Code Available	1
Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks	May 11, 2022	Benchmarking, Explanation Generation	Code Available	1
BiCo-Net: Regress Globally, Match Locally for Robust 6D Pose Estimation	May 7, 2022	6D Pose Estimation, Benchmarking	Code Available	1
GenISP: Neural ISP for Low-Light Machine Cognition	May 7, 2022	Benchmarking, Image Restoration	Code Available	1
Benchmarking Econometric and Machine Learning Methodologies in Nowcasting	May 6, 2022	Benchmarking, BIG-bench Machine Learning	Code Available	1
Creating a Forensic Database of Shoeprints from Online Shoe Tread Photos	May 4, 2022	Benchmarking, Depth Estimation	Code Available	1
Continual Learning with Foundation Models: An Empirical Study of Latent Replay	Apr 30, 2022	Benchmarking, Continual Learning	Code Available	1
A global analysis of metrics used for measuring performance in natural language processing	Apr 25, 2022	Benchmarking, Machine Translation	Code Available	1
NICO++: Towards Better Benchmarking for Domain Generalization	Apr 17, 2022	Benchmarking, Domain Generalization	Code Available	1
Stress-Testing Point Cloud Registration on Automotive LiDAR	Apr 16, 2022	Autonomous Driving, Benchmarking	Code Available	1
Deep learning model solves change point detection for multiple change types	Apr 15, 2022	Benchmarking, Change Point Detection	Code Available	1
Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization	Apr 13, 2022	Benchmarking, DeepFake Detection	Code Available	1
Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets	Apr 11, 2022	Action Triplet Recognition, Benchmarking	Code Available	1
BioRED: A Rich Biomedical Relation Extraction Dataset	Apr 8, 2022	Benchmarking, Binary Relation Extraction	Code Available	1
The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems	Apr 6, 2022	Attribute, Benchmarking	Code Available	1
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks	Apr 5, 2022	Benchmarking	Code Available	1
Coarse-to-Fine Q-attention with Learned Path Ranking	Apr 4, 2022	Benchmarking	Code Available	1
Parameter-efficient Model Adaptation for Vision Transformers	Mar 29, 2022	Benchmarking, Classification	Code Available	1
Earnings-22: A Practical Benchmark for Accents in the Wild	Mar 29, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Visual Abductive Reasoning	Mar 26, 2022	Benchmarking, Sentence	Code Available	1
Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension	Mar 26, 2022	Benchmarking, Question Answering	Code Available	1
Benchmarking Visual Localization for Autonomous Navigation	Mar 24, 2022	Autonomous Navigation, Benchmarking	Code Available	1
minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models	Mar 24, 2022	Benchmarking, Sentence	Code Available	1
Sionna: An Open-Source Library for Next-Generation Physical Layer Research	Mar 22, 2022	Benchmarking, GPU	Code Available	1
SHEL5K: An Extended Dataset and Benchmarking for Safety Helmet Detection	Mar 17, 2022	Benchmarking, Object Detection	Code Available	1
ROOD-MRI: Benchmarking the robustness of deep learning segmentation models to out-of-distribution and corrupted data in MRI	Mar 11, 2022	Benchmarking, Data Augmentation	Code Available	1
ClearPose: Large-scale Transparent Object Dataset and Benchmark	Mar 8, 2022	Benchmarking, Depth Completion	Code Available	1
Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap	Mar 8, 2022	Benchmarking, Domain Adaptation	Code Available	1
ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches	Mar 7, 2022	Adversarial Robustness, Benchmarking	Code Available	1
SurvSet: An open-source time-to-event dataset repository	Mar 7, 2022	Benchmarking	Code Available	1
The importance of being constrained: dealing with infeasible solutions in Differential Evolution and beyond	Mar 7, 2022	Benchmarking	Code Available	1
A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection	Mar 5, 2022	Benchmarking, Copy Detection	Code Available	1
Just Rank: Rethinking Evaluation with Word and Sentence Similarities	Mar 5, 2022	Benchmarking, Semantic Similarity	Code Available	1
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction	Mar 3, 2022	Action Segmentation, Benchmarking	Code Available	1
Mukayese: Turkish NLP Strikes Back	Mar 2, 2022	Benchmarking, Language Modeling	Code Available	1
3D Common Corruptions and Data Augmentation	Mar 2, 2022	Benchmarking, Data Augmentation	Code Available	1
GraphWorld: Fake Graphs Bring Real Insights for GNNs	Feb 28, 2022	Benchmarking	Code Available	1
PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems	Feb 28, 2022	Articles, Benchmarking	Code Available	1
MultiRes-NetVLAD: Augmenting Place Recognition Training with Low-Resolution Imagery	Feb 18, 2022	Benchmarking, Representation Learning	Code Available	1
Benchmarking of DL Libraries and Models on Mobile Devices	Feb 14, 2022	Benchmarking, GPU	Code Available	1
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts	Feb 14, 2022	Benchmarking	Code Available	1
What are the best systems? New perspectives on NLP Benchmarking	Feb 8, 2022	Benchmarking	Code Available	1
ECRECer: Enzyme Commission Number Recommendation and Benchmarking based on Multiagent Dual-core Learning	Feb 8, 2022	Benchmarking, Language Modelling	Code Available	1
Benchmarking Deep Models for Salient Object Detection	Feb 7, 2022	Benchmarking, Object	Code Available	1
Benchmarking and Analyzing Point Cloud Classification under Corruptions	Feb 7, 2022	Benchmarking, Classification	Code Available	1
RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro	Feb 7, 2022	Benchmarking, Model Optimization	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified