SOTAVerified

Benchmarking

Papers

Showing 1901–1950 of 5548 papers

Title | Status | Hype
A novel database of Children's Spontaneous Facial Expressions (LIRIS-CSE) | | 0
Benchmarking and Pushing the Multi-Bias Elimination Boundary of LLMs via Causal Effect Estimation-guided Debiasing | | 0
Benchmarking and Performance Modelling of MapReduce Communication Pattern | | 0
CRF-based Single-stage Acoustic Modeling with CTC Topology | | 0
ADCB: An Alzheimer's disease benchmark for evaluating observational estimators of causal effects | | 0
Channel Attention based Iterative Residual Learning for Depth Map Super-Resolution | | 0
A novel machine learning based framework for detection of Autism Spectrum Disorder (ASD) | | 0
Benchmarking and Optimization of Gradient Boosting Decision Tree Algorithms | | 0
Efficient Sparse Coding with the Adaptive Locally Competitive Algorithm for Speech Classification | | 0
Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity | | 0
CroCoDL: Cross-device Collaborative Dataset for Localization | | 0
CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models | | 0
CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models | | 0
Cross-functional transferability in universal machine learning interatomic potentials | | 0
Benchmarking Domain Generalization on EEG-based Emotion Recognition | | 0
A Novel Momentum-Based Deep Learning Techniques for Medical Image Classification and Segmentation | | 0
Benchmarking Domain Randomisation for Visual Sim-to-Real Transfer | | 0
crossMoDA Challenge: Evolution of Cross-Modality Domain Adaptation Techniques for Vestibular Schwannoma and Cochlea Segmentation from 2021 to 2023 | | 0
EfficientSRFace: An Efficient Network with Super-Resolution Enhancement for Accurate Face Detection | | 0
Efficient Training of Deep Classifiers for Wireless Source Identification using Test SNR Estimates | | 0
Cross-replication Reliability -- An Empirical Approach to Interpreting Inter-rater Reliability | | 0
Cross-replication Reliability - An Empirical Approach to Interpreting Inter-rater Reliability | | 0
Cross-subject Brain Functional Connectivity Analysis for Multi-task Cognitive State Evaluation | | 0
Cross-Subject Deep Transfer Models for Evoked Potentials in Brain-Computer Interface | | 0
EgoPressure: A Dataset for Hand Pressure and Pose Estimation in Egocentric Vision | | 0
Challenges in Benchmarking Stream Learning Algorithms with Real-world Data | | 0
CRS Arena: Crowdsourced Benchmarking of Conversational Recommender Systems | | 0
Benchmarking Edge AI Platforms for High-Performance ML Inference | | 0
Challenges and Pitfalls of Machine Learning Evaluation and Benchmarking | | 0
Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation | | 0
CSPO: Cross-Market Synergistic Stock Price Movement Forecasting with Pseudo-volatility Optimization | | 0
CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories | | 0
Challenges and perspectives in computational deconvolution of genomics data | | 0
Evaluation of simulation methods for tumor subclonal reconstruction | | 0
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors | | 0
AN ELIXIR FOR BLOCKCHAIN SCALABILITY WITH CHANNEL BASED CLUSTERED SHARDING | | 0
Challenges and Advancements in Modeling Shock Fronts with Physics-Informed Neural Networks: A Review and Benchmarking Study | | 0
CubeSat-Enabled Free-Space Optics: Joint Data Communication and Fine Beam Tracking | | 0
Benchmarking End-to-end Learning of MIMO Physical-Layer Communication | | 0
Challenge Results Are Not Reproducible | | 0
A Dataset Similarity Evaluation Framework for Wireless Communications and Sensing | | 0
Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms | | 0
ChakmaNMT: A Low-resource Machine Translation On Chakma Language | | 0
CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models | | 0
Benchmarking Energy-Conserving Neural Networks for Learning Dynamics from Data | | 0
Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning | | 0
Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese | | 0
Curse of Slicing: Why Sliced Mutual Information is a Deceptive Measure of Statistical Dependence | | 0
Benchmarking Estimators for Natural Experiments: A Novel Dataset and a Doubly Robust Algorithm | | 0
C-FedRAG: A Confidential Federated Retrieval-Augmented Generation System | | 0
Page 39 of 111

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified