SOTAVerified

Benchmarking

Papers

Showing 3951–3975 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Decisions and Performance Under Bounded Rationality: A Computational Benchmarking Approach | | 0 |
| Transfer of Knowledge through Reverse Annealing: A Preliminary Analysis of the Benefits and What to Share | | 0 |
| What Will it Take to Fix Benchmarking in Natural Language Understanding? | | 0 |
| Transformed Subspace Clustering | | 0 |
| On the Evaluation of Speech Foundation Models for Spoken Language Understanding | | 0 |
| On the Evaluation of User Privacy in Deep Neural Networks using Timing Side Channel | | 0 |
| Transformers in Protein: A Survey | | 0 |
| Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics | | 0 |
| On the Impact of Data Heterogeneity in Federated Learning Environments with Application to Healthcare Networks | | 0 |
| Broadening the Scope of Neural Network Potentials through Direct Inclusion of Additional Molecular Attributes | | 0 |
| On the Interaction of Belief Bias and Explanations | | 0 |
| Visual Anomaly Detection under Complex View-Illumination Interplay: A Large-Scale Benchmark | | 0 |
| On the Performance of Multimodal Language Models | | 0 |
| On the Potential of Large Language Models to Solve Semantics-Aware Process Mining Tasks | | 0 |
| On the project risk baseline: integrating aleatory uncertainty into project scheduling | | 0 |
| On the Real-Time Semantic Segmentation of Aphid Clusters in the Wild | | 0 |
| On the reduction of Linear Parameter-Varying State-Space models | | 0 |
| On the relationship between Benchmarking, Standards and Certification in Robotics and AI | | 0 |
| On the Reliability and Validity of Detecting Approval of Political Actors in Tweets | | 0 |
| On the Robustness of Human-Object Interaction Detection against Distribution Shift | | 0 |
| On the role of benchmarking data sets and simulations in method comparison studies | | 0 |
| Optimizer Benchmarking Needs to Account for Hyperparameter Tuning | | 0 |
| Transformers Utilization in Chart Understanding: A Review of Recent Advances & Future Trends | | 0 |
| Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning | | 0 |
| Translation Canvas: An Explainable Interface to Pinpoint and Analyze Translation Systems | | 0 |
Page 159 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |