NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition

2024-07-16

Chenyu Liu, Jia Pan, Jinshui Hu, BaoCai Yin, Bing Yin, Mingjun Chen, Cong Liu, Jun Du, Qingfeng Liu

Abstract

Recently, Handwritten Mathematical Expression Recognition (HMER) has gained considerable attention in pattern recognition for its diverse applications in document understanding. Current methods typically approach HMER as an image-to-sequence generation task within an autoregressive (AR) encoder-decoder framework. However, these approaches suffer from several drawbacks: 1) a lack of overall language context, which limits the use of information beyond the current decoding step; 2) error accumulation during AR decoding; and 3) slow decoding speed. To tackle these problems, this paper makes the first attempt to build a novel bottom-up Non-AutoRegressive Modeling approach for HMER, called NAMER. NAMER comprises a Visual Aware Tokenizer (VAT) and a Parallel Graph Decoder (PGD). First, the VAT tokenizes visible symbols and local relations at a coarse level. The PGD then refines all tokens and establishes connectivities in parallel, leveraging comprehensive visual and linguistic contexts. Experiments on the CROHME 2014/2016/2019 and HME100K datasets demonstrate that NAMER not only outperforms the current state-of-the-art (SOTA) methods on ExpRate by 1.93%/2.35%/1.49%/0.62%, but also achieves speedups of 13.7x in decoding time and 6.7x in overall FPS, demonstrating the effectiveness and efficiency of NAMER.

Tasks

Decoder, Document Understanding, Handwritten Mathematical Expression Recognition

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
CROHME 2014	NAMER	ExpRate	60.51	—	Unverified
CROHME 2016	NAMER	ExpRate	60.24	—	Unverified
CROHME 2019	NAMER	ExpRate	61.72	—	Unverified
HME100K	NAMER	ExpRate	68.52	—	Unverified
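
ExpRate (expression recognition rate) is conventionally reported as the percentage of test expressions whose predicted markup matches the ground truth exactly. The following is a minimal sketch of that idea, not the evaluation code used for the numbers above. It assumes predictions and references are LaTeX strings with space-separated tokens; the function name `exp_rate` is hypothetical, and official evaluation tooling may apply additional normalization.

```python
from typing import Sequence


def exp_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the percentage of predictions that exactly match their reference.

    Assumption: each string is a LaTeX expression whose tokens are separated
    by whitespace, so comparison is done on the token lists.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # An expression counts as correct only if every token matches in order.
    correct = sum(
        pred.split() == ref.split()
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


if __name__ == "__main__":
    preds = ["a + b", "x ^ 2"]
    refs = ["a + b", "x ^ 3"]
    print(exp_rate(preds, refs))
```

Because a single wrong token makes the whole expression count as incorrect, ExpRate is a strict metric, which is why gains of one or two points are treated as meaningful in this task.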
