SOTAVerified

Benchmarking

Papers

Showing 3651-3700 of 5548 papers

Title | Status | Hype
MTEB: Massive Text Embedding Benchmark | Code | 4
OpenOOD: Benchmarking Generalized Out-of-Distribution Detection | Code | 0
Benchmarking Long-tail Generalization with Likelihood Splits | Code | 0
Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets | Code | 0
Vote'n'Rank: Revision of Benchmarking with Social Choice Theory | Code | 0
DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation | Code | 1
Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems | | 0
Benchmarking saliency methods for chest X-ray interpretation | Code | 1
A Survey of Methods for Addressing Class Imbalance in Deep-Learning Based Natural Language Processing | | 0
Benchmarking Reinforcement Learning Techniques for Autonomous Navigation | Code | 1
Quantifying Social Biases Using Templates is Unreliable | | 0
ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints | Code | 1
Are All Steps Equally Important? Benchmarking Essentiality Detection of Events | | 0
Is margin all you need? An extensive empirical study of active learning on tabular data | | 0
A Theory of Dynamic Benchmarks | | 0
SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data | | 0
IJCB 2022 Mobile Behavioral Biometrics Competition (MobileB2C) | Code | 0
A Framework for Large Scale Synthetic Graph Dataset Generation | | 0
Benchmarking Learnt Radio Localisation under Distribution Shift | | 0
MEDFAIR: Benchmarking Fairness for Medical Imaging | Code | 0
Detection and Evaluation of Clusters within Sequential Data | | 0
rPPG-Toolbox: Deep Remote PPG Toolbox | Code | 2
The current state of single-cell proteomics data analysis | Code | 0
DELAD: Deep Landweber-guided deconvolution with Hessian and sparse prior | | 0
State-specific protein-ligand complex structure prediction with a multi-scale deep generative model | Code | 2
Building Normalizing Flows with Stochastic Interpolants | Code | 2
Benchmarking Learning Efficiency in Deep Reservoir Computing | Code | 0
Neural Methods for Logical Reasoning Over Knowledge Graphs | Code | 1
Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding | Code | 0
Deep Feature Selection Using a Novel Complementary Feature Mask | | 0
Feature Encodings for Gradient Boosting with Automunge | | 0
Removal of Ocular Artifacts in EEG Using Deep Learning | | 0
How Good Is Neural Combinatorial Optimization? A Systematic Evaluation on the Traveling Salesman Problem | | 0
Benchmarking Apache Spark and Hadoop MapReduce on Big Data Classification | Code | 0
Progressive with Purpose: Guiding Progressive Inpainting DNNs through Context and Structure | | 0
Benchmarking energy consumption and latency for neuromorphic computing in condensed matter and particle physics | | 0
Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms | Code | 1
Periodic Extrapolative Generalisation in Neural Networks | Code | 0
A framework for benchmarking clustering algorithms | Code | 1
Feature embedding in click-through rate prediction | Code | 0
FACT: Learning Governing Abstractions Behind Integer Sequences | | 0
Sanity Check for External Clustering Validation Benchmarks using Internal Validation Measures | Code | 1
Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning | | 0
Skills and Liquidity Barriers to Youth Employment: Medium-term Evidence from a Cash Benchmarking Experiment in Rwanda | | 0
Active-Passive SimStereo -- Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods | Code | 1
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots | Code | 1
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care | Code | 1
LAVIS: A Library for Language-Vision Intelligence | | 0
Is Synthetic Dataset Reliable for Benchmarking Generalizable Person Re-Identification? | | 0
OpenMixup: Open Mixup Toolbox and Benchmark for Visual Representation Learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified