SOTAVerified

Benchmarking

Papers

Showing 1161-1170 of 5548 papers

Title | Status | Hype
Continual Learning with Foundation Models: An Empirical Study of Latent Replay | Code | 1
A global analysis of metrics used for measuring performance in natural language processing | Code | 1
NICO++: Towards Better Benchmarking for Domain Generalization | Code | 1
Stress-Testing Point Cloud Registration on Automotive LiDAR | Code | 1
Deep learning model solves change point detection for multiple change types | Code | 1
Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization | Code | 1
Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets | Code | 1
BioRED: A Rich Biomedical Relation Extraction Dataset | Code | 1
The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems | Code | 1
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified