SOTAVerified

Benchmarking

Papers

Showing 4001–4025 of 5548 papers

Title | Status | Hype
Hawk: An Industrial-strength Multi-label Document Classifier | - | 0
Benchmarking Robustness in Neural Radiance Fields | - | 0
Evaluating the Transferability of Machine-Learned Force Fields for Material Property Modeling | Code | 0
Critical review of conformational B-cell epitope prediction methods | Code | 0
Logically at Factify 2: A Multi-Modal Fact Checking System Based on Evidence Retrieval techniques and Transformer Encoder Architecture | - | 0
AERF: Adaptive ensemble random fuzzy algorithm for anomaly detection in cloud computing | - | 0
"It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning | - | 0
The Evolutionary Computation Methods No One Should Use | - | 0
ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions | Code | 0
Benchmarking common uncertainty estimation methods with histopathological images under domain shift and label noise | - | 0
HaN-Seg: The head and neck organ-at-risk CT and MR segmentation dataset | - | 0
Improving Sequential Recommendation Models with an Enhanced Loss Function | Code | 0
Tree Instance Segmentation With Temporal Contour Graph | - | 0
Comparison of tree-based ensemble algorithms for merging satellite and earth-observed precipitation data at the daily time scale | - | 0
4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions | - | 0
Biologically Plausible Learning on Neuromorphic Hardware Architectures | - | 0
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing | - | 0
Quality at the Tail of Machine Learning Inference | - | 0
Benchmarking Machine Learning Models to Predict Corporate Bankruptcy | - | 0
A Seven-Layer Model for Standardising AI Fairness Assessment | - | 0
Distributed Software-Defined Network Architecture for Smart Grid Resilience to Denial-of-Service Attacks | - | 0
AI applications in forest monitoring need remote sensing benchmark datasets | - | 0
AnyTOD: A Programmable Task-Oriented Dialog System | - | 0
Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias | Code | 0
Benchmarking person re-identification datasets and approaches for practical real-world implementations | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified