Benchmarking

Papers


Showing 4201–4225 of 5548 papers

Title	Date	Tasks	Status
Dyna-bAbI: unlocking bAbI’s potential with dynamic synthetic benchmarking	Jul 1, 2022	Benchmarking, Natural Language Understanding	Unverified
HATE-ITA: New Baselines for Hate Speech Detection in Italian	Jul 1, 2022	Benchmarking, Hate Speech Detection	Code Available
Benchmarking Intersectional Biases in NLP	Jul 1, 2022	Benchmarking, BIG-bench Machine Learning	Code Available
SentSpace: Large-Scale Benchmarking and Evaluation of Text using Cognitively Motivated Lexical, Syntactic, and Semantic Features	Jul 1, 2022	Benchmarking, Sentence	Unverified
Local manifold learning and its link to domain-based physics knowledge	Jul 1, 2022	Benchmarking, Dimensionality Reduction	Code Available
Analyzing the behaviour of D'WAVE quantum annealer: fine-tuning parameterization and tests with restrictive Hamiltonian formulations	Jul 1, 2022	Benchmarking, Combinatorial Optimization	Unverified
Benchmarking Language-agnostic Intent Classification for Virtual Assistant Platforms	Jul 1, 2022	Benchmarking, Classification	Code Available
Beyond Emotion: A Multi-Modal Dataset for Human Desire Understanding	Jul 1, 2022	Benchmarking	Unverified
Computer-aided diagnosis and prediction in brain disorders	Jun 29, 2022	Benchmarking, Decision Making	Unverified
An extensible Benchmarking Graph-Mesh dataset for studying Steady-State Incompressible Navier-Stokes Equations	Jun 29, 2022	Benchmarking	Code Available
Toward an ImageNet Library of Functions for Global Optimization Benchmarking	Jun 27, 2022	Benchmarking, global-optimization	Unverified
VRKitchen2.0-IndoorKit: A Tutorial for Augmented Indoor Scene Building in Omniverse	Jun 23, 2022	Benchmarking, Indoor Scene Synthesis	Code Available
Beyond Uniform Lipschitz Condition in Differentially Private Optimization	Jun 21, 2022	Benchmarking, regression	Unverified
BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs	Jun 21, 2022	Anomaly Detection, Benchmarking	Code Available
ConvGeN: Convex space learning improves deep-generative oversampling for tabular imbalanced classification on smaller datasets	Jun 20, 2022	Benchmarking, Fraud Detection	Code Available
Design of Supervision-Scalable Learning Systems: Methodology and Performance Benchmarking	Jun 18, 2022	Benchmarking, image-classification	Unverified
Motley: Benchmarking Heterogeneity and Personalization in Federated Learning	Jun 18, 2022	Benchmarking, Fairness	Code Available
Colonoscopy 3D Video Dataset with Paired Depth from 2D-3D Registration	Jun 17, 2022	Benchmarking, Depth Estimation	Unverified
Beyond Supervised vs. Unsupervised: Representative Benchmarking and Analysis of Image Representation Learning	Jun 16, 2022	Benchmarking, Clustering	Code Available
Pythae: Unifying Generative Autoencoders in Python -- A Benchmarking Use Case	Jun 16, 2022	Benchmarking, Density Estimation	Unverified
SATBench: Benchmarking the speed-accuracy tradeoff in object recognition by humans and dynamic neural networks	Jun 16, 2022	Benchmarking, Dynamic neural networks	Code Available
Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability	Jun 16, 2022	Benchmarking, Feature Importance	Unverified
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models	Jun 16, 2022	Benchmarking, Language Modeling	Unverified
BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents	Jun 13, 2022	Benchmarking	Unverified
EmProx: Neural Network Performance Estimation For Neural Architecture Search	Jun 13, 2022	Benchmarking, Decoder	Code Available


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified