SOTAVerified

Benchmarking

Papers

Showing 4951–4960 of 5548 papers

Title | Status | Hype
Benchmarking of Query Strategies: Towards Future Deep Active Learning | Code | 0
Semi-Supervised Learning for Anomaly Traffic Detection via Bidirectional Normalizing Flows | Code | 0
A Context-Aware Citation Recommendation Model with BERT and Graph Convolutional Networks | Code | 0
Named Clinical Entity Recognition Benchmark | Code | 0
EvalxNLP: A Framework for Benchmarking Post-Hoc Explainability Methods on NLP Models | Code | 0
Evaluating the Transferability of Machine-Learned Force Fields for Material Property Modeling | Code | 0
Evaluating the Systematic Reasoning Abilities of Large Language Models through Graph Coloring | Code | 0
Evaluating the Robustness of Deep Reinforcement Learning for Autonomous Policies in a Multi-agent Urban Driving Environment | Code | 0
Watts: Infrastructure for Open-Ended Learning | Code | 0
Evaluating the Ability of LLMs to Solve Semantics-Aware Process Mining Tasks | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified