Benchmarking

Papers

Showing 3651–3700 of 5548 papers

Title	Date	Tasks	Status	Hype
MTEB: Massive Text Embedding Benchmark	Oct 13, 2022	Benchmarking, Information Retrieval	Code Available	4
OpenOOD: Benchmarking Generalized Out-of-Distribution Detection	Oct 13, 2022	Anomaly Detection, Benchmarking	Code Available	0
Benchmarking Long-tail Generalization with Likelihood Splits	Oct 13, 2022	Benchmarking, Language Modeling	Code Available	0
Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets	Oct 12, 2022	Benchmarking, Multi-Armed Bandits	Code Available	0
Vote'n'Rank: Revision of Benchmarking with Social Choice Theory	Oct 11, 2022	Benchmarking, Result aggregation	Code Available	0
DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation	Oct 11, 2022	6D Pose Estimation, 6D Pose Estimation using RGB	Code Available	1
Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems	Oct 11, 2022	Benchmarking, Recommendation Systems	Unverified	0
Benchmarking saliency methods for chest X-ray interpretation	Oct 10, 2022	Benchmarking, Decision Making	Code Available	1
A Survey of Methods for Addressing Class Imbalance in Deep-Learning Based Natural Language Processing	Oct 10, 2022	Benchmarking, Data Augmentation	Unverified	0
Benchmarking Reinforcement Learning Techniques for Autonomous Navigation	Oct 10, 2022	Autonomous Navigation, Benchmarking	Code Available	1
Quantifying Social Biases Using Templates is Unreliable	Oct 9, 2022	Attribute, Benchmarking	Unverified	0
ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints	Oct 8, 2022	Autonomous Driving, Benchmarking	Code Available	1
Are All Steps Equally Important? Benchmarking Essentiality Detection of Events	Oct 8, 2022	All, Benchmarking	Unverified	0
Is margin all you need? An extensive empirical study of active learning on tabular data	Oct 7, 2022	Active Learning, All	Unverified	0
A Theory of Dynamic Benchmarks	Oct 6, 2022	Benchmarking	Unverified	0
SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data	Oct 6, 2022	Benchmarking, Representation Learning	Unverified	0
IJCB 2022 Mobile Behavioral Biometrics Competition (MobileB2C)	Oct 6, 2022	Benchmarking	Code Available	0
A Framework for Large Scale Synthetic Graph Dataset Generation	Oct 4, 2022	Benchmarking, Dataset Generation	Unverified	0
Benchmarking Learnt Radio Localisation under Distribution Shift	Oct 4, 2022	Benchmarking	Unverified	0
MEDFAIR: Benchmarking Fairness for Medical Imaging	Oct 4, 2022	Benchmarking, Fairness	Code Available	0
Detection and Evaluation of Clusters within Sequential Data	Oct 4, 2022	Benchmarking, Clustering	Unverified	0
rPPG-Toolbox: Deep Remote PPG Toolbox	Oct 3, 2022	Benchmarking, Data Augmentation	Code Available	2
The current state of single-cell proteomics data analysis	Oct 3, 2022	Benchmarking	Code Available	0
DELAD: Deep Landweber-guided deconvolution with Hessian and sparse prior	Sep 30, 2022	Benchmarking, Blind Image Deblurring	Unverified	0
State-specific protein-ligand complex structure prediction with a multi-scale deep generative model	Sep 30, 2022	Benchmarking, Blind Docking	Code Available	2
Building Normalizing Flows with Stochastic Interpolants	Sep 30, 2022	Benchmarking, Density Estimation	Code Available	2
Benchmarking Learning Efficiency in Deep Reservoir Computing	Sep 29, 2022	Benchmarking	Code Available	0
Neural Methods for Logical Reasoning Over Knowledge Graphs	Sep 28, 2022	Benchmarking, Knowledge Graphs	Code Available	1
Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding	Sep 26, 2022	Benchmarking, Natural Language Queries	Code Available	0
Deep Feature Selection Using a Novel Complementary Feature Mask	Sep 25, 2022	Benchmarking, feature selection	Unverified	0
Feature Encodings for Gradient Boosting with Automunge	Sep 25, 2022	Benchmarking, Binarization	Unverified	0
Removal of Ocular Artifacts in EEG Using Deep Learning	Sep 24, 2022	Benchmarking, Deep Learning	Unverified	0
How Good Is Neural Combinatorial Optimization? A Systematic Evaluation on the Traveling Salesman Problem	Sep 22, 2022	Benchmarking, Combinatorial Optimization	Unverified	0
Benchmarking Apache Spark and Hadoop MapReduce on Big Data Classification	Sep 21, 2022	Benchmarking, Management	Code Available	0
Progressive with Purpose: Guiding Progressive Inpainting DNNs through Context and Structure	Sep 21, 2022	Benchmarking, Image Inpainting	Unverified	0
Benchmarking energy consumption and latency for neuromorphic computing in condensed matter and particle physics	Sep 21, 2022	Anomaly Detection, Benchmarking	Unverified	0
Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms	Sep 21, 2022	3D human pose and shape estimation, Benchmarking	Code Available	1
Periodic Extrapolative Generalisation in Neural Networks	Sep 21, 2022	Benchmarking	Code Available	0
A framework for benchmarking clustering algorithms	Sep 20, 2022	Benchmarking, Clustering	Code Available	1
Feature embedding in click-through rate prediction	Sep 20, 2022	Benchmarking, Click-Through Rate Prediction	Code Available	0
FACT: Learning Governing Abstractions Behind Integer Sequences	Sep 20, 2022	Benchmarking	Unverified	0
Sanity Check for External Clustering Validation Benchmarks using Internal Validation Measures	Sep 20, 2022	Benchmarking, Clustering	Code Available	1
Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning	Sep 19, 2022	Atari Games, Benchmarking	Unverified	0
Skills and Liquidity Barriers to Youth Employment: Medium-term Evidence from a Cash Benchmarking Experiment in Rwanda	Sep 18, 2022	Benchmarking	Unverified	0
Active-Passive SimStereo -- Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods	Sep 17, 2022	Benchmarking, Stereo Matching	Code Available	1
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots	Sep 16, 2022	Benchmarking, Question Answering	Code Available	1
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care	Sep 16, 2022	Benchmarking, Deep Learning	Code Available	1
LAVIS: A Library for Language-Vision Intelligence	Sep 15, 2022	Benchmarking, Image Captioning	Unverified	0
Is Synthetic Dataset Reliable for Benchmarking Generalizable Person Re-Identification?	Sep 12, 2022	Benchmarking, Generalizable Person Re-identification	Unverified	0
OpenMixup: Open Mixup Toolbox and Benchmark for Visual Representation Learning	Sep 11, 2022	Benchmarking, Classification	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified