SOTAVerified

Benchmarking

Papers

Showing 1651-1700 of 5548 papers

Title | Status | Hype
Determinants of Performance in European ATM -- How to Analyze a Diverse Industry | | 0
Benchmarking data encoding methods in Quantum Machine Learning | | 0
DetoxBench: Benchmarking Large Language Models for Multitask Fraud & Abuse Detection | | 0
An Interpretable Measure for Quantifying Predictive Dependence between Continuous Random Variables -- Extended Version | | 0
Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models | | 0
Detection and Evaluation of Clusters within Sequential Data | | 0
Benchmarking Data-driven Automatic Text Simplification for German | | 0
Detection of Adversarial Attacks and Characterization of Adversarial Subspace | | 0
detrex: Benchmarking Detection Transformers | | 0
Development details and computational benchmarking of DEPAM | | 0
Benchmarking Cross-Domain Audio-Visual Deception Detection | | 0
Benchmarking Counterfactual Interpretability in Deep Learning Models for Time Series Classification | | 0
Benchmarking Convolutional Neural Network and Graph Neural Network based Surrogate Models on a Real-World Car External Aerodynamics Dataset | | 0
An in-depth experimental study of sensor usage and visual reasoning of robots navigating in real environments | | 0
Benchmarking Conventional Vision Models on Neuromorphic Fall Detection and Action Recognition Dataset | | 0
Benchmarking Conventional and Learned Video Codecs with a Low-Delay Configuration | | 0
Benchmarking Continual Learning from Cognitive Perspectives | | 0
Absolute Ranking: An Essential Normalization for Benchmarking Optimization Algorithms | | 0
Detecting Finger-Vein Presentation Attacks Using 3D Shape & Diffuse Reflectance Decomposition | | 0
Benchmarking Constraint-Based Bayesian Structure Learning Algorithms: Role of Network Topology | | 0
Benchmarking confound regression strategies for the control of motion artifact in studies of functional connectivity | | 0
ABSA-Bench: Towards the Unified Evaluation of Aspect-based Sentiment Analysis Research | | 0
Design of Supervision-Scalable Learning Systems: Methodology and Performance Benchmarking | | 0
Design Target Achievement Index: A Differentiable Metric to Enhance Deep Generative Models in Multi-Objective Inverse Design | | 0
Benchmarking common uncertainty estimation methods with histopathological images under domain shift and label noise | | 0
Benchmarking Collaborative Learning Methods Cost-Effectiveness for Prostate Segmentation | | 0
Benchmarking Cognitive Domains for LLMs: Insights from Taiwanese Hakka Culture | | 0
A Distance Oriented Kalman Filter Particle Swarm Optimizer Applied to Multi-Modality Image Registration | | 0
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | | 0
Detecting Out-Of-Distribution Samples Using Low-Order Deep Features Statistics | | 0
Device Modeling Bias in ReRAM-based Neural Network Simulations | | 0
Different Horses for Different Courses: Comparing Bias Mitigation Algorithms in ML | | 0
Diverse Community Data for Benchmarking Data Privacy Algorithms | | 0
Benchmarking CNN on 3D Anatomical Brain MRI: Architectures, Data Augmentation and Deep Ensemble Learning | | 0
Benchmarking Clinical Decision Support Search | | 0
Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models | | 0
Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering | | 0
Benchmarking Classical, Deep, and Generative Models for Human Activity Recognition | | 0
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis | | 0
Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies | | 0
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection | | 0
ABOUT ML: Annotation and Benchmarking on Understanding and Transparency of Machine Learning Lifecycles | | 0
Design and benchmarking of a two degree of freedom tendon driver unit for cable-driven wearable technologies | | 0
CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs | | 0
A New Stereo Benchmarking Dataset for Satellite Images | | 0
A New Real-World Video Dataset for the Comparison of Defogging Algorithms | | 0
Benchmarking Chest X-ray Diagnosis Models Across Multinational Datasets | | 0
A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video | | 0
A Boosting Approach to Constructing an Ensemble Stack | | 0
An Analysis of an Integrated Mathematical Modeling -- Artificial Neural Network Approach for the Problems with a Limited Learning Dataset | | 0
Page 34 of 111

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified