SOTAVerified

Benchmarking

Papers

Showing 4426–4450 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Speed Benchmarking of Genetic Programming Frameworks | | 0 |
| FedScale: Benchmarking Model and System Performance of Federated Learning at Scale | Code | 1 |
| Benchmarking the Performance of Bayesian Optimization across Multiple Experimental Materials Science Domains | Code | 1 |
| Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking | | 0 |
| Helsinki Deblur Challenge 2021: description of photographic data | | 0 |
| Anabranch Network for Camouflaged Object Segmentation | Code | 1 |
| Laughing Heads: Can Transformers Detect What Makes a Sentence Funny? | Code | 0 |
| Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition | Code | 1 |
| DACBench: A Benchmark Library for Dynamic Algorithm Configuration | Code | 1 |
| Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods | | 0 |
| Quantifying the Impact of Boundary Constraint Handling Methods on Differential Evolution | | 0 |
| Sanity Simulations for Saliency Methods | Code | 0 |
| Best practices for constructing, preparing, and evaluating protein-ligand binding affinity benchmarks | Code | 1 |
| A Reinforcement Learning Environment for Multi-Service UAV-enabled Wireless Systems | Code | 1 |
| Benchmarking down-scaled (not so large) pre-trained language models | Code | 0 |
| Examining convolutional feature extraction using Maximum Entropy (ME) and Signal-to-Noise Ratio (SNR) for image classification | | 0 |
| CREPO: An Open Repository to Benchmark Credal Network Algorithms | Code | 0 |
| Towards Benchmarking the Utility of Explanations for Model Debugging | | 0 |
| Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior | | 0 |
| MS MARCO: Benchmarking Ranking Models in the Large-Data Regime | | 0 |
| D2S: Document-to-Slide Generation Via Query-Based Text Summarization | Code | 1 |
| Covariance Matrix Adaptation Evolution Strategy Assisted by Principal Component Analysis | | 0 |
| AnomalyHop: An SSL-based Image Anomaly Localization Method | Code | 1 |
| Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting | Code | 0 |
| Open Radar Initiative: Large Scale Dataset for Benchmarking of micro-Doppler Recognition Algorithms | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |