SOTAVerified

Benchmarking

Papers

Showing 3201-3225 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future | | 0 |
| From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising | | 0 |
| From Private to Public: Benchmarking GANs in the Context of Private Time Series Classification | | 0 |
| From Protoscience to Epistemic Monoculture: How Benchmarking Set the Stage for the Deep Learning Revolution | | 0 |
| From Sound Representation to Model Robustness | | 0 |
| From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems | | 0 |
| From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference | | 0 |
| FSD-10: A Dataset for Competitive Sports Content Analysis | | 0 |
| Full-scale modal testing of a Hawk T1A aircraft for benchmarking vibration-based methods | | 0 |
| Full-stack evaluation of Machine Learning inference workloads for RISC-V systems | | 0 |
| FunBench: Benchmarking Fundus Reading Skills of MLLMs | | 0 |
| Functional Code Building Genetic Programming | | 0 |
| Efficient Pauli channel estimation with logarithmic quantum memory | | 0 |
| FuzzWiz -- Fuzzing Framework for Efficient Hardware Coverage | | 0 |
| Fuzzy Knowledge Distillation from High-Order TSK to Low-Order TSK | | 0 |
| Genetic algorithm for feature selection of EEG heterogeneous data | | 0 |
| Galvatron: An Automatic Distributed System for Efficient Foundation Model Training | | 0 |
| GAN-based disentanglement learning for chest X-ray rib suppression | | 0 |
| GANmut: Generating and Modifying Facial Expressions | | 0 |
| GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR | | 0 |
| GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics | | 0 |
| Gauss-Ramanujan Functions: Constructions, Properties, and Applications in Communications and Signal Processing | | 0 |
| GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing | | 0 |
| GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases | | 0 |
| Generalised Gaussian Process Latent Variable Models (GPLVM) with Stochastic Variational Inference | | 0 |
Page 129 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |