SOTAVerified

SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

2023-01-02

Elias Frantar, Dan Alistarh

Code Available

Abstract

We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, specifically designed to work efficiently and accurately on massive GPT-family models. We can execute SparseGPT on the largest available open-source models, OPT-175B and BLOOM-176B, in under 4.5 hours, and can reach 60% unstructured sparsity with negligible increase in perplexity: remarkably, more than 100 billion weights from these models can be ignored at inference time. SparseGPT generalizes to semi-structured (2:4 and 4:8) patterns, and is compatible with weight quantization approaches. The code is available at: https://github.com/IST-DASLab/sparsegpt.
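The abstract mentions semi-structured 2:4 sparsity, in which exactly two weights in every contiguous group of four are zeroed out (a pattern that NVIDIA Ampere-class GPUs can accelerate). As a minimal sketch of what that pattern means, the following applies a simple magnitude-based 2:4 mask; note this is only an illustration of the sparsity pattern itself, not SparseGPT's actual weight-selection criterion, which is Hessian-aware.

```python
import numpy as np

def prune_2_to_4(weights: np.ndarray) -> np.ndarray:
    """Zero out the 2 smallest-magnitude entries in every group of 4.

    Illustrates the 2:4 semi-structured pattern with a magnitude
    criterion only; SparseGPT selects which weights to drop using
    second-order (Hessian) information instead.
    """
    w = weights.reshape(-1, 4).copy()
    # Indices of the two smallest-magnitude entries per group of 4.
    drop = np.argsort(np.abs(w), axis=1)[:, :2]
    mask = np.ones_like(w, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=1)
    return (w * mask).reshape(weights.shape)

row = np.array([0.9, -0.1, 0.3, -0.7, 0.05, 0.6, -0.2, 0.8])
pruned = prune_2_to_4(row)
# Exactly half the weights survive: two per group of four.
```

Any 2:4-pruned matrix is automatically 50% sparse, which is why the semi-structured results in the table below can be compared directly against the unstructured 50%-sparsity rows.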

Tasks

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| arc_challenge | OPT-175B | Accuracy | 43.94 | — | Unverified |
| arc_challenge | OPT-175B (50% sparsity) | Accuracy | 25.6 | — | Unverified |
| arc_challenge | SparseGPT 175B (2:4 sparsity) | Accuracy | 38.99 | — | Unverified |
| arc_challenge | SparseGPT 175B (4:8 sparsity) | Accuracy | 39.85 | — | Unverified |
| arc_challenge | SparseGPT 175B (50% sparsity) | Accuracy | 41.3 | — | Unverified |
| arc_easy | OPT-175B | Accuracy | 71.04 | — | Unverified |
| arc_easy | OPT-175B (50% sparsity) | Accuracy | 28.03 | — | Unverified |
| arc_easy | SparseGPT 175B (2:4 sparsity) | Accuracy | 67.08 | — | Unverified |
| arc_easy | SparseGPT 175B (4:8 sparsity) | Accuracy | 68.35 | — | Unverified |
| arc_easy | SparseGPT 175B (50% sparsity) | Accuracy | 69.65 | — | Unverified |

Reproductions