Finite Automata Can be Linearly Decoded from Language-Recognizing RNNs

2019-05-01 · ICLR 2019

Joshua J. Michalenko, Ameesh Shah, Abhinav Verma, Swarat Chaudhuri, Ankit B. Patel

Abstract

We study the internal representations that a recurrent neural network (RNN) uses while learning to recognize a regular formal language. Specifically, we train an RNN on positive and negative examples from a regular language, and ask if there is a simple decoding function that maps states of this RNN to states of the minimal deterministic finite automaton (MDFA) for the language. Our experiments show that such a decoding function exists, that it is in fact linear, but that it maps states of the RNN not to MDFA states, but to states of an abstraction obtained by clustering small sets of MDFA states into "superstates". A qualitative analysis reveals that the abstraction often has a simple interpretation. Overall, the results suggest a strong structural relationship between internal representations used by RNNs and finite automata, and explain the well-known ability of RNNs to recognize formal grammatical structure.
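The decoding question in the abstract can be illustrated with a minimal sketch, which is not the authors' code or data. It fits an affine decoder by least squares from hidden vectors to one-hot automaton states and compares held-out accuracy for the full MDFA states against a coarser superstate abstraction. Everything here is an assumption made for illustration: the toy language (binary strings whose count of 1s is divisible by 4), the synthetic `hidden` vectors that stand in for real RNN states and are constructed to encode only the superstate, the hand-picked `superstate_of` grouping, and all helper names.

```python
import numpy as np

# Hypothetical toy MDFA: states 0..3 track (number of 1s) mod 4.
NUM_STATES = 4
transitions = {(s, 0): s for s in range(NUM_STATES)}
transitions.update({(s, 1): (s + 1) % NUM_STATES for s in range(NUM_STATES)})


def run_dfa(transitions, start, symbols):
    """Return the DFA state reached after each input symbol."""
    state, visited = start, []
    for symbol in symbols:
        state = transitions[(state, int(symbol))]
        visited.append(state)
    return visited


def add_bias(hidden):
    """Append a constant column so the decoder is affine."""
    return np.hstack([hidden, np.ones((len(hidden), 1))])


def fit_linear_decoder(hidden, labels, num_classes):
    """Least-squares fit from hidden vectors to one-hot class targets."""
    targets = np.eye(num_classes)[labels]
    weights, *_ = np.linalg.lstsq(add_bias(hidden), targets, rcond=None)
    return weights


def decoding_accuracy(hidden, labels, num_classes):
    """Fit on the first half of the samples, score on the second half."""
    half = len(hidden) // 2
    weights = fit_linear_decoder(hidden[:half], labels[:half], num_classes)
    predicted = np.argmax(add_bias(hidden[half:]) @ weights, axis=1)
    return float(np.mean(predicted == labels[half:]))


rng = np.random.default_rng(0)
strings = rng.integers(0, 2, size=(200, 20))  # 200 random binary strings
state_labels = np.array(
    [s for w in strings for s in run_dfa(transitions, 0, w)]
)

# Assumed abstraction: merge states with equal parity into one superstate.
superstate_of = np.array([0, 1, 0, 1])
superstate_labels = superstate_of[state_labels]

# Synthetic stand-in for RNN hidden states (NOT a trained network):
# each vector is a noisy embedding of the superstate only.
centers = rng.normal(size=(2, 8))
noise = 0.05 * rng.normal(size=(len(state_labels), 8))
hidden = centers[superstate_labels] + noise

acc_full = decoding_accuracy(hidden, state_labels, NUM_STATES)
acc_super = decoding_accuracy(hidden, superstate_labels, 2)
print(f"MDFA states: {acc_full:.2f}, superstates: {acc_super:.2f}")
```

Because the synthetic vectors carry only parity information by construction, the superstate decoder scores far higher than the full-state decoder here. With a real trained RNN, `hidden` would be replaced by recorded hidden states, and the grouping into superstates would have to be found rather than assumed.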

Tasks

Clustering
