Adaptive teachers for amortized samplers

2024-10-02

Minsu Kim, Sanghyeok Choi, Taeyoung Yun, Emmanuel Bengio, Leo Feng, Jarrid Rector-Brooks, Sungsoo Ahn, Jinkyoo Park, Nikolay Malkin, Yoshua Bengio

Code

github.com/alstn12088/adaptive-teacher
Official, in paper, PyTorch, ★ 5

Abstract

Amortized inference is the task of training a parametric model, such as a neural network, to approximate a distribution with a given unnormalized density where exact sampling is intractable. When sampling is implemented as a sequential decision-making process, reinforcement learning (RL) methods, such as generative flow networks, can be used to train the sampling policy. Off-policy RL training facilitates the discovery of diverse, high-reward candidates, but existing methods still face challenges in efficient exploration. We propose to use an adaptive training distribution (the Teacher) to guide the training of the primary amortized sampler (the Student). The Teacher, an auxiliary behavior model, is trained to sample high-loss regions of the Student and can generalize across unexplored modes, thereby enhancing mode coverage by providing an efficient training curriculum. We validate the effectiveness of this approach in a synthetic environment designed to present an exploration challenge, two diffusion-based sampling tasks, and four biochemical discovery tasks, demonstrating its ability to improve sample efficiency and mode coverage. Source code is available at https://github.com/alstn12088/adaptive-teacher.
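
The Teacher/Student idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository above for that); it assumes a small discrete state space, a one-step sampler, and a squared log-ratio loss in the style of trajectory balance. The names `balance_step` and `train`, the smoothing constant `eps`, and the choice of using the Student's loss directly as the Teacher's target are all hypothetical simplifications.

```python
import math
import random


def log_softmax(logits):
    """Numerically stable log-probabilities of a categorical distribution."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_total for v in logits]


def balance_step(logits, log_z, x, log_target, lr):
    """One SGD step on (log_z + log p(x) - log_target)^2.

    Returns the updated logits, the updated log_z, and the loss
    measured before the update.
    """
    log_p = log_softmax(logits)
    delta = log_z + log_p[x] - log_target
    grad = 2.0 * delta
    # d log p(x) / d logits[i] = 1[i == x] - p(i)
    new_logits = [
        v - lr * grad * ((1.0 if i == x else 0.0) - math.exp(lp))
        for i, (v, lp) in enumerate(zip(logits, log_p))
    ]
    return new_logits, log_z - lr * grad, delta * delta


def train(rewards, steps=3000, lr=0.05, eps=1e-2, seed=0):
    """Train a toy Student on samples drawn from a toy Teacher.

    rewards: positive unnormalized target densities, one per state.
    """
    rng = random.Random(seed)
    n = len(rewards)
    student, student_log_z = [0.0] * n, 0.0
    teacher, teacher_log_z = [0.0] * n, 0.0
    for _ in range(steps):
        # Off-policy sample: the state is drawn from the Teacher, not the Student.
        teacher_probs = [math.exp(lp) for lp in log_softmax(teacher)]
        x = rng.choices(range(n), weights=teacher_probs, k=1)[0]
        # Student is pushed toward the target density at x.
        student, student_log_z, loss = balance_step(
            student, student_log_z, x, math.log(rewards[x]), lr
        )
        # Teacher is pushed toward states where the Student's loss is high
        # (assumption: the smoothed loss serves as the Teacher's target).
        teacher, teacher_log_z, _ = balance_step(
            teacher, teacher_log_z, x, math.log(loss + eps), lr
        )
    return [math.exp(lp) for lp in log_softmax(student)]
```

Calling `train([1.0, 1.0, 50.0])` returns the Student's probabilities over the three states. In this toy the Teacher concentrates its samples where the Student's residual is large, which is the curriculum effect the abstract describes; the sketch says nothing about the paper's actual tasks or results.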

Tasks

Decision Making, Efficient Exploration, Reinforcement Learning (RL), Sequential Decision Making
