Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models
Jinghan Cao, Yu Ma, Xinjin Li, Qingyang Ren, Xiangyun Chen
Abstract
Large Language Models achieve remarkable performance but incur computational costs that are prohibitive for resource-constrained deployments. This paper presents the first comprehensive task-specific efficiency analysis comparing 16 language models across five diverse NLP tasks. We introduce the Performance-Efficiency Ratio (PER), a novel metric integrating accuracy, throughput, memory, and latency through geometric mean normalization. Our systematic evaluation reveals that small models (0.5--3B parameters) achieve superior PER scores across all five tasks. These findings establish quantitative foundations for deploying small models in production environments that prioritize inference efficiency over marginal accuracy gains.
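The abstract does not specify the exact normalization, but a minimal sketch of how a geometric-mean PER could be computed follows. The function name per_score, the best-value normalization scheme, and all numeric values are illustrative assumptions, not the paper's definitions.

```python
import math
from typing import Sequence


def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def per_score(accuracy: float, throughput: float,
              memory_gb: float, latency_ms: float,
              best: dict) -> float:
    """Illustrative PER: normalize each metric to (0, 1] against the best
    value observed in the comparison pool, inverting the lower-is-better
    metrics (memory, latency), then combine via geometric mean.
    The paper's exact normalization may differ from this sketch."""
    normalized = [
        accuracy / best["accuracy"],        # higher is better
        throughput / best["throughput"],    # higher is better
        best["memory_gb"] / memory_gb,      # lower is better, so invert
        best["latency_ms"] / latency_ms,    # lower is better, so invert
    ]
    return geometric_mean(normalized)


# Hypothetical numbers: a 0.5B-class model vs. a 70B-class model.
best = {"accuracy": 0.92, "throughput": 410.0,
        "memory_gb": 1.1, "latency_ms": 18.0}
small = per_score(0.87, 410.0, 1.1, 18.0, best)   # ~0.99: near-best on every axis
large = per_score(0.92, 35.0, 28.0, 240.0, best)  # ~0.13: top accuracy cannot offset cost
```

One property of this construction is that the geometric mean penalizes any single weak dimension multiplicatively, which would explain why a large model's accuracy edge cannot compensate for its throughput, memory, and latency deficits in the PER ranking.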