SOTAVerified

Benchmarking

Papers

Showing 581-590 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| MIRFLEX: Music Information Retrieval Feature Library for Extraction | Code | 1 |
| LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models | Code | 1 |
| DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios | Code | 1 |
| Pedestrian Trajectory Prediction with Missing Data: Datasets, Imputation, and Benchmarking | Code | 1 |
| LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction | Code | 1 |
| EMGBench: Benchmarking Out-of-Distribution Generalization and Adaptation for Electromyography | Code | 1 |
| AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery | Code | 1 |
| DataRec: A Python Library for Standardized and Reproducible Data Management in Recommender Systems | Code | 1 |
| Survey of Cultural Awareness in Language Models: Text and Beyond | Code | 1 |
| LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |