This repository is a fork of the code originally accompanying the paper "Variational autoencoders for collaborative filtering" by Dawen Liang, Rahul G. Krishnan, Matthew D. Hoffman, and Tony Jebara, in The Web Conference (aka WWW) 2018.
See the last section for the original README contents.
This fork updates the original code, which was written for older versions of Python and TensorFlow, so that it runs on current Python and TensorFlow releases, in particular on platforms such as Google Colab.
- Refactored Pandas data preparation code cells to be executable
- Preinstall the `bottleneck` package used in this notebook
- Add a cell for fetching and unzipping the dataset
- Fix the hyperparameter setup
- Fix the MultiDAE and MultiVAE regularization step
- Add `tqdm` for more readable training progress tracking
- Add 1e-6 to the divisors when calculating NDCG and Recall to fix the NaN values issue (see the sketch after this list)
- Attached `requirements.txt`
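As an illustration of the NaN fix, here is a minimal sketch of a Recall@k helper with the 1e-6 term in the divisor. The function name and array layout are assumptions, and the notebook's actual metric code may differ in detail; NDCG gets the same treatment in its IDCG divisor.

```python
import numpy as np

def recall_at_k(pred_scores, heldout_binary, k=100):
    # pred_scores: (users, items) predicted scores
    # heldout_binary: (users, items) 0/1 matrix of held-out interactions
    batch = pred_scores.shape[0]
    # boolean mask of each user's top-k scored items
    topk_idx = np.argpartition(-pred_scores, k, axis=1)[:, :k]
    topk_mask = np.zeros_like(pred_scores, dtype=bool)
    topk_mask[np.arange(batch)[:, None], topk_idx] = True
    hits = np.logical_and(topk_mask, heldout_binary > 0).sum(axis=1)
    # 1e-6 in the divisor avoids NaN for users with no held-out items
    denom = np.minimum(k, (heldout_binary > 0).sum(axis=1))
    return hits / (denom + 1e-6)
```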
The notebook should be ready to run as is in Google Colab (a T4 GPU instance is recommended and is available in the free tier of Google Colab). Executing all of the code in the notebook takes about 4 hours on a T4 instance.
The validation NDCG per epoch, test NDCG, and test Recall are not exactly the same as those displayed in the original notebook, but they are very close.
- Pandas - 1.5.3
- TensorFlow - 2.14.1
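If you are running outside Colab, a notebook cell along these lines installs the pinned versions; the attached `requirements.txt` is authoritative, and the `bottleneck`/`tqdm` pins are intentionally left open here as an assumption:

```python
# Colab / Jupyter cell: install the versions this fork was tested with
# (pins other than pandas and tensorflow are assumptions)
!pip install pandas==1.5.3 tensorflow==2.14.1 bottleneck tqdm
```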
This notebook accompanies the paper "Variational autoencoders for collaborative filtering" by Dawen Liang, Rahul G. Krishnan, Matthew D. Hoffman, and Tony Jebara, in The Web Conference (aka WWW) 2018.
In this notebook, we show a complete self-contained example of training a variational autoencoder (as well as a denoising autoencoder) with multinomial likelihood (described in the paper) on the public Movielens-20M dataset, including both data preprocessing and model training.
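For orientation, below is a minimal sketch of the multinomial log-likelihood reconstruction term described in the paper, written in TensorFlow 2 style as an assumed simplification rather than the notebook's exact implementation: the decoder logits are converted to a log-softmax over items and weighted by the user's observed clicks.

```python
import tensorflow as tf

def multinomial_log_likelihood(x, logits):
    # x: (batch, items) binary click matrix; logits: decoder output of the same shape
    x = tf.cast(x, logits.dtype)
    log_softmax = tf.nn.log_softmax(logits, axis=-1)
    # sum log-probabilities of the clicked items per user, then average over the batch
    return tf.reduce_mean(tf.reduce_sum(log_softmax * x, axis=-1))
```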