VaeDiff-DocRE: End-to-end Data Augmentation Framework for Document-level Relation Extraction

2024-12-18

Khai Phan Tran, Wen Hua, Xue Li


Code

github.com/khaitran22/vaediff-docre
Official · In paper · PyTorch · ★ 2

Abstract

Document-level Relation Extraction (DocRE) aims to identify relationships between entity pairs within a document. However, most existing methods assume a uniform label distribution, resulting in suboptimal performance on real-world, imbalanced datasets. To tackle this challenge, we propose a novel data augmentation approach using generative models to enhance data from the embedding space. Our method leverages the Variational Autoencoder (VAE) architecture to capture all relation-wise distributions formed by entity pair representations and augment data for underrepresented relations. To better capture the multi-label nature of DocRE, we parameterize the VAE's latent space with a Diffusion Model. Additionally, we introduce a hierarchical training framework to integrate the proposed VAE-based augmentation module into DocRE systems. Experiments on two benchmark datasets demonstrate that our method outperforms state-of-the-art models, effectively addressing the long-tail distribution problem in DocRE.
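
To make the idea of augmenting in embedding space concrete, here is a minimal sketch, not the authors' implementation. It replaces the paper's VAE with a diffusion-parameterized latent space by a much simpler stand-in: a diagonal Gaussian fitted per relation to the observed entity-pair embeddings. The function name `augment_tail_relations`, the `target_count` parameter, and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def augment_tail_relations(embeddings_by_relation, target_count, seed=0):
    """Top up under-represented relations with synthetic embeddings.

    Assumption: a per-relation diagonal Gaussian stands in for the
    paper's generative model (VAE with a diffusion prior).
    """
    rng = random.Random(seed)
    augmented = {}
    for relation, vectors in embeddings_by_relation.items():
        # Transpose so each tuple holds one embedding dimension.
        dims = list(zip(*vectors))
        mu = [mean(d) for d in dims]
        sigma = [pstdev(d) for d in dims]
        # Only relations below the target receive synthetic samples.
        n_missing = max(0, target_count - len(vectors))
        synthetic = [
            [rng.gauss(m, s) for m, s in zip(mu, sigma)]
            for _ in range(n_missing)
        ]
        augmented[relation] = [list(v) for v in vectors] + synthetic
    return augmented


# Toy data: "head" is frequent, "tail" is under-represented.
toy = {
    "head": [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.2, 0.8]],
    "tail": [[2.0, 2.0], [4.0, 2.0]],
}
balanced = augment_tail_relations(toy, target_count=4)
print({relation: len(vectors) for relation, vectors in balanced.items()})
```

A single Gaussian per relation treats each relation independently, so it cannot model entity pairs that carry several relations at once. The abstract names that multi-label structure as the reason for parameterizing the latent space with a Diffusion Model.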

Tasks

Data Augmentation, Document-level Relation Extraction, Relation, Relation Extraction

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
DWIE	VaeDiff-DocRE	F1	0.73	—	Unverified
Re-DocRED	VaeDiff-DocRE	F1	0.79	—	Unverified
