SOTAVerified

Benchmarking

Papers

Showing 4776–4800 of 5548 papers

Title | Status | Hype
WordCraft: An Environment for Benchmarking Commonsense Agents | Code | 1
Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation | Code | 0
Towards an Automated SOAP Note: Classifying Utterances from Medical Conversations | | 0
CoNES: Convex Natural Evolutionary Strategies | | 0
Are We There Yet? Evaluating State-of-the-Art Neural Network based Geoparsers Using EUPEG as a Benchmarking Platform | Code | 1
Emoji Prediction: Extensions and Benchmarking | Code | 1
Towards causal benchmarking of bias in face analysis algorithms | Code | 0
CheXphoto: 10,000+ Photos and Transformations of Chest X-rays for Benchmarking Deep Learning Robustness | Code | 1
Affine Non-negative Collaborative Representation Based Pattern Classification | Code | 0
GAMA: a General Automated Machine learning Assistant | Code | 1
VisImages: A Fine-Grained Expert-Annotated Visualization Dataset | | 0
Enhancing spatial and textual analysis with EUPEG: an extensible and unified platform for evaluating geoparsers | Code | 1
URSABench: Comprehensive Benchmarking of Approximate Bayesian Inference Methods for Deep Neural Networks | Code | 1
Quaternion Capsule Networks | Code | 0
RobFR: Benchmarking Adversarial Robustness on Face Recognition | Code | 1
IOHanalyzer: Detailed Performance Analyses for Iterative Optimization Heuristics | Code | 1
Benchmarking in Optimization: Best Practice and Open Issues | | 0
Re-thinking Co-Salient Object Detection | Code | 1
Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks | Code | 1
Complex Human Action Recognition in Live Videos Using Hybrid FR-DL Method | | 0
Does imputation matter? Benchmark for predictive models | | 0
Automatic Target Recognition on Synthetic Aperture Radar Imagery: A Survey | | 0
Building benchmarking frameworks for supporting replicability and reproducibility: spatial and textual analysis as an example | | 0
Quo Vadis, Skeleton Action Recognition ? | Code | 1
Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified