SOTAVerified

Benchmarking

Papers

Showing 2376-2400 of 5548 papers

Title | Status | Hype
Scaling laws in global corporations as a benchmarking approach to assess environmental performance | | 0
A Correlation- and Mean-Aware Loss Function and Benchmarking Framework to Improve GAN-based Tabular Data Synthesis | | 0
Full-stack evaluation of Machine Learning inference workloads for RISC-V systems | | 0
Efficient Pauli channel estimation with logarithmic quantum memory | | 0
Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models | | 0
Benchmarking Transformer-based Language Models for Arabic Sentiment and Sarcasm Detection | | 0
Automatic Microprocessor Performance Bug Detection | | 0
From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems | | 0
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference | | 0
Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning | | 0
Automatic detection of passable roads after floods in remote sensed and social media data | | 0
From Protoscience to Epistemic Monoculture: How Benchmarking Set the Stage for the Deep Learning Revolution | | 0
A Line-of-Sight Channel Model for the 100-450 Gigahertz Frequency Band | | 0
A Continuously Growing Dataset of Sentential Paraphrases | | 0
From Sound Representation to Model Robustness | | 0
FSD-10: A Dataset for Competitive Sports Content Analysis | | 0
Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications | | 0
Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization | | 0
Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation | | 0
Automated Structured Radiology Report Generation | | 0
From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising | | 0
Benchmarking the Spatial Robustness of DNNs via Natural and Adversarial Localized Corruptions | | 0
Benchmarking the Sim-to-Real Gap in Cloth Manipulation | | 0
Automated Machine Learning on Big Data using Stochastic Algorithm Tuning | | 0
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified