
Fine-Tuning BERTs for Definition Extraction from Mathematical Text

2024-06-19

Lucy Horowitz, Ryan Hathaway


Abstract

In this paper, we fine-tuned three pre-trained BERT models on the task of "definition extraction" from mathematical English written in LaTeX. We frame this as a binary classification problem: a sentence either contains a definition of a mathematical term or it does not. We used two original data sets, "Chicago" and "TAC," to fine-tune and test these models. We also tested on WFMALL, a data set presented by Vanetik and Litvak in 2021, and compared the performance of our models to theirs. We found that a high-performance Sentence-BERT transformer model performed best on overall accuracy, recall, and precision, achieving results comparable to the earlier models with less computational effort.
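As a rough illustration of the evaluation described in the abstract, the sketch below computes accuracy, precision, and recall for a binary sentence classifier. It assumes a label of 1 means "sentence contains a definition" and 0 means it does not. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper or its data sets.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for 0/1 labels (1 = definition; an assumption)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # definitions correctly found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-definitions flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # definitions missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-definitions correctly rejected

    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels for illustration only; these are not results from the paper.
gold = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
metrics = binary_metrics(gold, predicted)
print(metrics)
```

Precision and recall are reported alongside accuracy because definition sentences are typically a minority class, so accuracy alone can look strong even when many definitions are missed.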
