SOTAVerified

Benchmarking

Papers

Showing 1651-1675 of 5548 papers

Title | Status | Hype
Benchmarking Approximate Inference Methods for Neural Structured Prediction | Code | 0
KArSL: Arabic Sign Language Database | Code | 0
Benchmarking Apache Spark and Hadoop MapReduce on Big Data Classification | Code | 0
a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification | Code | 0
Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System | Code | 0
KamNet: An Integrated Spatiotemporal Deep Neural Network for Rare Event Search in KamLAND-Zen | Code | 0
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios | Code | 0
Benchmarking and Understanding Compositional Relational Reasoning of LLMs | Code | 0
Benchmarking and Rethinking Knowledge Editing for Large Language Models | Code | 0
Joint Multi-Scale Tone Mapping and Denoising for HDR Image Enhancement | Code | 0
JExplore: Design Space Exploration Tool for Nvidia Jetson Boards | Code | 0
An Empirical Evaluation of Cost-based Federated SPARQL Query Processing Engines | Code | 0
Benchmarking and optimizing organism wide single-cell RNA alignment methods | Code | 0
An empirical comparison between stochastic and deterministic centroid initialisation for K-Means variations | Code | 0
A Dataset for Web-Scale Knowledge Base Population | Code | 0
JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models | Code | 0
DyKnow: Dynamically Verifying Time-Sensitive Factual Knowledge in LLMs | Code | 0
Benchmarking and Improving Text-to-SQL Generation under Ambiguity | Code | 0
Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs | Code | 0
An Efficient Two-stage Gradient Boosting Framework for Short-term Traffic State Estimation | Code | 0
JATE 2.0: Java Automatic Term Extraction with Apache Solr | Code | 0
A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches | Code | 0
ISImed: A Framework for Self-Supervised Learning using Intrinsic Spatial Information in Medical Images | Code | 0
Certifiable Black-Box Attacks with Randomized Adversarial Examples: Breaking Defenses with Provable Confidence | Code | 0
IoT Data Trust Evaluation via Machine Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified