SOTAVerified

Benchmarking

Papers

Showing 3376–3400 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Advanced Manufacturing Configuration by Sample-efficient Batch Bayesian Optimization | | 0 |
| Line Goes Up? Inherent Limitations of Benchmarks for Evaluating Large Language Models | | 0 |
| Liquid State Genetic Programming | | 0 |
| Livestock Monitoring with Transformer | | 0 |
| Benchmarking Multimodal Sentiment Analysis | | 0 |
| LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education | | 0 |
| LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living | | 0 |
| LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation | | 0 |
| Benchmarking Multimodal Regex Synthesis with Complex Structures | | 0 |
| LLM-based Evaluation Policy Extraction for Ecological Modeling | | 0 |
| A War Beyond Deepfake: Benchmarking Facial Counterfeits and Countermeasures | | 0 |
| Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains | | 0 |
| A Distance Oriented Kalman Filter Particle Swarm Optimizer Applied to Multi-Modality Image Registration | | 0 |
| Benchmarking Multimodal Models for Fine-Grained Image Analysis: A Comparative Study Across Diverse Visual Features | | 0 |
| LLM Evaluators Recognize and Favor Their Own Generations | | 0 |
| Benchmarking Multimodal LLMs on Recognition and Understanding over Chemical Tables | | 0 |
| Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015 | | 0 |
| LLM-initialized Differentiable Causal Discovery | | 0 |
| Totally Corrective Boosting with Cardinality Penalization | | 0 |
| Benchmarking Multi-Domain Active Learning on Image Classification | | 0 |
| LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation | | 0 |
| LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study | | 0 |
| Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming | | 0 |
| Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms | | 0 |
| LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |