This repo contains a simple implementation of a GFlowNet in PyTorch. GFlowNets were introduced by Bengio et al. in *Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation* (2021).
The model is trained online (i.e. on samples drawn from the model's own behavior policy rather than on a fixed dataset collected by another policy) using the trajectory balance loss (Malkin et al., 2022). Performance is evaluated on the grid domain from the original paper.
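The trajectory balance objective pushes the forward flow along each sampled trajectory to match the backward flow from its terminal reward. A minimal sketch (the function name and argument layout are illustrative, not this repo's actual API):

```python
import torch

def trajectory_balance_loss(log_z, log_pf, log_pb, log_reward):
    """Trajectory balance loss for a single trajectory tau.

    log_z:      learned scalar estimate of log Z (the log-partition function).
    log_pf:     per-step log-probabilities of the forward policy along tau.
    log_pb:     per-step log-probabilities of the backward policy along tau.
    log_reward: log R(x) of the terminal state x.
    """
    # L(tau) = (log Z + sum_t log P_F(s_t+1 | s_t)
    #           - log R(x) - sum_t log P_B(s_t | s_t+1))^2
    return (log_z + log_pf.sum() - log_reward - log_pb.sum()) ** 2
```

In practice this loss is averaged over a batch of trajectories and minimized jointly over the policy parameters and the log Z estimate.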
Training proceeds as follows:
- Initialize the grid environment using a grid size.
- Define a policy network taking a state vector as input and returning a vector of probabilities over possible actions. (In the grid domain, there are three possible actions: Down, Right, and Terminate.)
- Define a backward policy. Here the backward policy is not learned but fixed: each parent state receives probability 0.5, except when a state has only one parent, which then receives probability 1.
Run `main.py` to train the GFlowNet and sample from the trained model.