SOTAVerified

Benchmarking

Papers

Showing 4401-4450 of 5548 papers

Title | Status | Hype
Real Time Egocentric Object Segmentation: THU-READ Labeling and Benchmarking Results |  | 0
The Russian practice of applying cluster approach in regional development |  | 0
EXPObench: Benchmarking Surrogate-based Optimisation Algorithms on Expensive Black-box Functions | Code | 1
The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation | Code | 1
A critical look at the current train/test split in machine learning |  | 0
RobustNav: Towards Benchmarking Robustness in Embodied Navigation | Code | 1
On the use of automatically generated synthetic image datasets for benchmarking face recognition | Code | 0
Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics | Code | 1
A Benchmarking Protocol for Pansharpening: Dataset, Preprocessing, and Quality Assessment |  | 0
Can a single neuron learn predictive uncertainty? | Code | 0
Predicting Quantum Potentials by Deep Neural Network and Metropolis Sampling |  | 0
On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition | Code | 0
Tetrad: Actively Secure 4PC for Secure Training and Inference |  | 0
Top-k Regularization for Supervised Feature Selection |  | 0
Adaptive Epidemic Forecasting and Community Risk Evaluation of COVID-19 |  | 0
Comprehensive Energy Footprint Benchmarking Algorithm for Electrified Powertrains |  | 0
DFGC 2021: A DeepFake Game Competition | Code | 1
Benchmarking CNN on 3D Anatomical Brain MRI: Architectures, Data Augmentation and Deep Ensemble Learning |  | 0
OctoPath: An OcTree Based Self-Supervised Learning Approach to Local Trajectory Planning for Mobile Robots |  | 0
Knowledge-Driven Slot Constraints for Goal-Oriented Dialogue Systems | Code | 0
Comprehensive Energy Footprint Benchmarking of Strong Parallel Electrified Powertrain |  | 0
Cash versus Kind: Benchmarking a Child Nutrition Program against Unconditional Cash Transfers in Rwanda |  | 0
Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning |  | 0
A General Taylor Framework for Unifying and Revisiting Attribution Methods |  | 0
Benchmarking Scientific Image Forgery Detectors |  | 0
Speed Benchmarking of Genetic Programming Frameworks |  | 0
FedScale: Benchmarking Model and System Performance of Federated Learning at Scale | Code | 1
Benchmarking the Performance of Bayesian Optimization across Multiple Experimental Materials Science Domains | Code | 1
Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking |  | 0
Helsinki Deblur Challenge 2021: description of photographic data |  | 0
Anabranch Network for Camouflaged Object Segmentation | Code | 1
Laughing Heads: Can Transformers Detect What Makes a Sentence Funny? | Code | 0
Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition | Code | 1
DACBench: A Benchmark Library for Dynamic Algorithm Configuration | Code | 1
Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods |  | 0
Quantifying the Impact of Boundary Constraint Handling Methods on Differential Evolution |  | 0
Sanity Simulations for Saliency Methods | Code | 0
Best practices for constructing, preparing, and evaluating protein-ligand binding affinity benchmarks | Code | 1
A Reinforcement Learning Environment for Multi-Service UAV-enabled Wireless Systems | Code | 1
Benchmarking down-scaled (not so large) pre-trained language models | Code | 0
Examining convolutional feature extraction using Maximum Entropy (ME) and Signal-to-Noise Ratio (SNR) for image classification |  | 0
CREPO: An Open Repository to Benchmark Credal Network Algorithms | Code | 0
Towards Benchmarking the Utility of Explanations for Model Debugging |  | 0
Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior |  | 0
MS MARCO: Benchmarking Ranking Models in the Large-Data Regime |  | 0
D2S: Document-to-Slide Generation Via Query-Based Text Summarization | Code | 1
Covariance Matrix Adaptation Evolution Strategy Assisted by Principal Component Analysis |  | 0
AnomalyHop: An SSL-based Image Anomaly Localization Method | Code | 1
Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting | Code | 0
Open Radar Initiative: Large Scale Dataset for Benchmarking of micro-Doppler Recognition Algorithms | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified