SOTAVerified

Benchmarking

Papers

Showing 1861-1870 of 5548 papers

Title | Status | Hype
Benchmarking Graph Neural Networks for Document Layout Analysis in Public Affairs |  | 0
From Knowledge to Reasoning: Evaluating LLMs for Ionic Liquids Research in Chemical and Biological Engineering | Code | 0
Multi-Modal Explainable Medical AI Assistant for Trustworthy Human-AI Collaboration |  | 0
Optimizing Recommendations using Fine-Tuned LLMs |  | 0
Evaluating Financial Sentiment Analysis with Annotators Instruction Assisted Prompting: Enhancing Contextual Interpretation and Stock Prediction Accuracy |  | 0
Contributions of the Petabyte Scale Sequence Search Codeathon toward efforts to scale sequence-based searches on SRA |  | 0
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information |  | 0
Federated Deconfounding and Debiasing Learning for Out-of-Distribution Generalization |  | 0
Enhancing Treatment Effect Estimation via Active Learning: A Counterfactual Covering Perspective | Code | 0
clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations |  | 0
Page 187 of 555

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified