Heterogenous Memory Augmented Neural Networks
Zihan Qiu, Zhen Liu, Shuicheng Yan, Shanghang Zhang, Jie Fu
Abstract
Semi-parametric methods, which combine standard neural networks with non-parametric components such as external memory modules and data retrieval, have been shown to be particularly helpful in data-scarce and out-of-distribution (OOD) scenarios. However, existing semi-parametric methods mostly depend on independent raw data points; this strategy is difficult to scale up because of both its high computational cost and the difficulty current attention mechanisms have in handling a large number of tokens. In this paper, we introduce a novel heterogeneous memory augmentation approach for neural networks which, by introducing learnable memory tokens with an attention mechanism, can effectively boost performance without large computational overhead. Our general-purpose method can be seamlessly combined with various backbones (MLP, CNN, GNN, and Transformer) in a plug-and-play manner. We extensively evaluate our approach on various image and graph-based tasks under both in-distribution (ID) and OOD conditions and show its competitive performance against task-specific state-of-the-art methods. Code is available at https://github.com/qiuzh20/HMA.
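To make the idea of memory tokens read through attention concrete, the following is a minimal sketch of one generic way a backbone feature could attend over a small, fixed-size set of memory tokens. It is not the authors' implementation: the abstract does not specify the architecture, so the function name `attend_to_memory`, the sizes `d` and `m`, the single-head scaled dot-product form, and the residual addition are all assumptions made for illustration. The tokens here are fixed random vectors, whereas in a real model they would be trainable parameters.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_to_memory(feature, memory):
    """Augment one feature vector by attending over memory tokens.

    feature: list of d floats (hypothetical backbone output for one input).
    memory:  list of m token vectors with d floats each. In a trained model
             these would be learnable; here they are fixed for illustration.
    Returns (augmented_feature, attention_weights).
    """
    d = len(feature)
    # Scaled dot-product score between the feature (query) and each token (key).
    scores = [
        sum(f * t for f, t in zip(feature, token)) / math.sqrt(d)
        for token in memory
    ]
    weights = softmax(scores)
    # Weighted sum of the tokens (values), computed per feature dimension.
    read = [
        sum(w * token[j] for w, token in zip(weights, memory))
        for j in range(d)
    ]
    # Residual connection (an assumption): feature plus the memory read-out.
    augmented = [f + r for f, r in zip(feature, read)]
    return augmented, weights


rng = random.Random(0)
d, m = 8, 4  # hypothetical sizes: feature width and number of memory tokens
memory = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
feature = [rng.gauss(0.0, 1.0) for _ in range(d)]
augmented, weights = attend_to_memory(feature, memory)
```

In this sketch the attention cost per input grows with the number of memory tokens `m`, not with the size of the training set, which is the kind of saving over attending to raw data points that the abstract refers to.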