SOTAVerified

Benchmarking

Papers

Showing 1371–1380 of 5548 papers

Title | Status | Hype
SinaTools: Open Source Toolkit for Arabic Natural Language Processing | - | 0
FEET: A Framework for Evaluating Embedding Techniques | Code | 0
Varco Arena: A Tournament Approach to Reference-Free Benchmarking Large Language Models | - | 0
Artificial Intelligence for Microbiology and Microbiome Research | - | 0
A Review of Reinforcement Learning in Financial Applications | - | 0
Modern, Efficient, and Differentiable Transport Equation Models using JAX: Applications to Population Balance Equations | - | 0
Improving Few-Shot Cross-Domain Named Entity Recognition by Instruction Tuning a Word-Embedding based Retrieval Augmented Large Language Model | - | 0
Benchmarking Bias in Large Language Models during Role-Playing | - | 0
MIRFLEX: Music Information Retrieval Feature Library for Extraction | Code | 1
Cityscape-Adverse: Benchmarking Robustness of Semantic Segmentation with Realistic Scene Modifications via Diffusion-Based Image Editing | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified