Fine-Tuning BERTs for Definition Extraction from Mathematical Text

2024-06-19

Lucy Horowitz, Ryan Hathaway

Abstract

In this paper, we fine-tuned three pre-trained BERT models on the task of "definition extraction" from mathematical English written in LaTeX. We frame this as a binary classification problem: a sentence either contains a definition of a mathematical term or it does not. We used two original datasets, "Chicago" and "TAC," to fine-tune and test these models. We also tested on WFMALL, a dataset presented by Vanetik and Litvak in 2021, and compared the performance of our models to theirs. We found that a high-performance Sentence-BERT transformer model performed best on overall accuracy, recall, and precision, achieving results comparable to the earlier models with less computational effort.
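To make the evaluation criteria concrete, the following is a minimal sketch of how accuracy, precision, and recall can be computed for a sentence-level binary classifier, where label 1 means "contains a definition" and 0 means it does not. The function name `binary_metrics` and the toy labels are illustrative assumptions, not the authors' code or data.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, and recall for 0/1 labels.

    Label 1 = sentence contains a definition, 0 = it does not.
    (Hypothetical helper for illustration; not from the paper.)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Fall back to 0.0 when the denominator is empty.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels, invented for this example only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Here precision penalizes sentences wrongly flagged as definitions, while recall penalizes definitions that were missed, so the two can diverge even when accuracy looks similar across models.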

Tasks

Binary Classification, Definition Extraction, Sentence
