SOTAVerified

Using LLMs for the Extraction and Normalization of Product Attribute Values

2024-03-04Code Available1· sign in to hype

Alexander Brinkmann, Nick Baumann, Christian Bizer

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

Product offers on e-commerce websites often consist of a product title and a textual product description. In order to enable features such as faceted product search or to generate product comparison tables, it is necessary to extract structured attribute-value pairs from the unstructured product titles and descriptions and to normalize the extracted values to a single, unified scale for each attribute. This paper explores the potential of using large language models (LLMs), such as GPT-3.5 and GPT-4, to extract and normalize attribute values from product titles and descriptions. We experiment with different zero-shot and few-shot prompt templates for instructing LLMs to extract and normalize attribute-value pairs. We introduce the Web Data Commons - Product Attribute Value Extraction (WDC-PAVE) benchmark dataset for our experiments. WDC-PAVE consists of product offers from 59 different websites which provide schema.org annotations. The offers belong to five different product categories, each with a specific set of attributes. The dataset provides manually verified attribute-value pairs in two forms: (i) directly extracted values and (ii) normalized attribute values. The normalization of the attribute values requires systems to perform the following types of operations: name expansion, generalization, unit of measurement conversion, and string wrangling. Our experiments demonstrate that GPT-4 outperforms the PLM-based extraction methods SU-OpenTag, AVEQA, and MAVEQA by 10%, achieving an F1-score of 91%. For the extraction and normalization of product attribute values, GPT-4 achieves a similar performance to the extraction scenario, while being particularly strong at string wrangling and name expansion.

Tasks

Benchmark Results

DatasetModelMetricClaimedVerifiedStatus
WDC-PAVEGPT-4_10_example_values_&_10_demonstrationsF1-Score90.54Unverified
WDC-PAVEGPT-3.5_10_example_values_&_10_demonstrationsF1-Score88.02Unverified
WDC-PAVEAVEQAF1-Score80.83Unverified
WDC-PAVEMAVEQAF1-Score65.1Unverified
WDC-PAVESU-OpenTagF1-Score60.44Unverified

Reproductions