Matrix Shuffle-Exchange Networks for Hard 2D Tasks
Emīls Ozoliņš, Kārlis Freivalds, Agris Šostaks
Code
- github.com/LUMII-Syslab/Matrix-SE (official, TensorFlow)
- github.com/LUMII-Syslab/Switchblade (official, TensorFlow)
Abstract
Convolutional neural networks have become the main tools for processing two-dimensional data. They work well for images, yet convolutions have a limited receptive field that prevents its applications to more complex 2D tasks. We propose a new neural model, called Matrix Shuffle-Exchange network, that can efficiently exploit long-range dependencies in 2D data and has comparable speed to a convolutional neural network. It is derived from Neural Shuffle-Exchange network and has O( n) layers and O( n^2 n) total time and space complexity for processing a n n data matrix. We show that the Matrix Shuffle-Exchange network is well-suited for algorithmic and logical reasoning tasks on matrices and dense graphs, exceeding convolutional and graph neural network baselines. Its distinct advantage is the capability of retaining full long-range dependency modelling when generalizing to larger instances - much larger than could be processed with models equipped with a dense attention mechanism.