GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction

2023-10-05

Oscar Sainz, Iker García-Ferrero, Rodrigo Agerri, Oier Lopez de Lacalle, German Rigau, Eneko Agirre


Code

github.com/hitz-zentroa/gollie
Official · In paper · PyTorch · ★ 433

Abstract

Large Language Models (LLMs) combined with instruction tuning have made significant progress when generalizing to unseen tasks. However, they have been less successful in Information Extraction (IE), lagging behind task-specific models. Typically, IE tasks are characterized by complex annotation guidelines that describe the task and give examples to humans. Previous attempts to leverage such information have failed, even with the largest models, as they are not able to follow the guidelines out of the box. In this paper, we propose GoLLIE (Guideline-following Large Language Model for IE), a model able to improve zero-shot results on unseen IE tasks by virtue of being fine-tuned to comply with annotation guidelines. Comprehensive evaluation empirically demonstrates that GoLLIE is able to generalize to and follow unseen guidelines, outperforming previous attempts at zero-shot information extraction. The ablation study shows that detailed guidelines are key for good results.

Tasks

Event Argument Extraction, Event Extraction, Language Modeling, Language Modelling, Large Language Model, Low Resource Named Entity Recognition, Named Entity Recognition, Named Entity Recognition (NER), Relation Extraction, Zero-shot Event Extraction, Zero-shot Named Entity Recognition (NER)

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
ACE 2005	GoLLIE	F1	89.6	—	Unverified
BC5CDR	GoLLIE	F1	88.4	—	Unverified
CoNLL 2003 (English)	GoLLIE	F1	93.1	—	Unverified
NCBI Disease	GoLLIE	F1	86.5	—	Unverified
WNUT 2017	GoLLIE	F1	54.3	—	Unverified
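
For readers who want to check an F1 figure like those in the table, the following is a minimal sketch of a micro-averaged, exact-match span F1. It assumes that predictions and gold annotations are given as sets of `(start, end, label)` tuples per sentence; that span format and the function name `span_f1` are illustrative assumptions, not taken from the paper or its repository, and the official evaluation may differ in matching rules or averaging.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span format: (start_token, end_token, label).
Span = Tuple[int, int, str]


def span_f1(gold: Iterable[Set[Span]], pred: Iterable[Set[Span]]) -> float:
    """Micro-averaged F1 over exact-match spans, in the range [0, 1]."""
    tp = fp = fn = 0
    for gold_spans, pred_spans in zip(gold, pred):
        tp += len(gold_spans & pred_spans)  # spans predicted with correct boundaries and label
        fp += len(pred_spans - gold_spans)  # predicted spans absent from gold
        fn += len(gold_spans - pred_spans)  # gold spans the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented purely for illustration.
gold = [{(0, 2, "PER"), (5, 6, "LOC")}, {(1, 3, "ORG")}]
pred = [{(0, 2, "PER")}, {(1, 3, "ORG"), (4, 5, "LOC")}]
score = span_f1(gold, pred)
print(f"F1 = {100 * score:.1f}")
```

Multiplying the result by 100 gives a value on the same scale as the table's F1 column.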
