We use auto-encoders in an anomaly detection setting to search for SUEP (soft unclustered energy patterns) and SVJ (semi-visible jets) signals in a background of QCD events.
In our journal paper, we propose a family of Conditional Dual Auto-Encoder (CoDAE) models that learn multiple anomaly detection scores from raw images of particle collisions.
- There are two encoders: one with high capacity ($f_R$) that captures details in its large bottleneck, $Z$, and a smaller one ($f_m$) responsible for learning a discriminative 2-dimensional latent space, $Z_m$, that can be used directly for anomaly detection.
- The encoder $f_m$ is learned by conditioning $Z$ on $Z_m$ (an operation denoted by blue circles and paths in the figure).
- Conditioning then occurs multiple times, at different resolutions of the decoder, $D$.
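To make the idea of conditioning at several decoder resolutions concrete, here is a minimal NumPy sketch. The text above does not say which conditioning operation the models use, so the feature-wise affine modulation below, the function name `condition`, and the weight matrices `w_scale` and `w_shift` are all assumptions made for illustration, not the implementation found in `ad/`.

```python
import numpy as np

def condition(features, z_m, w_scale, w_shift):
    """Hypothetical conditioning: modulate a feature map with the 2-d code z_m.

    features: (batch, height, width, channels)
    z_m:      (batch, 2)
    w_scale, w_shift: (2, channels) projection matrices (assumed, random here)
    """
    scale = z_m @ w_scale  # (batch, channels)
    shift = z_m @ w_shift  # (batch, channels)
    # Broadcast the per-channel scale and shift over the spatial axes.
    return features * (1.0 + scale[:, None, None, :]) + shift[:, None, None, :]

rng = np.random.default_rng(0)
batch, channels = 3, 4
z_m = rng.normal(size=(batch, 2))

# Apply the same kind of conditioning at two spatial resolutions,
# mimicking repeated conditioning inside a decoder.
for size in (4, 8):
    feats = rng.normal(size=(batch, size, size, channels))
    w_scale = rng.normal(size=(2, channels))
    w_shift = rng.normal(size=(2, channels))
    out = condition(feats, z_m, w_scale, w_shift)
    print(size, out.shape)
```

With this particular choice, a zero code leaves the feature map unchanged, so the conditioning signal acts as a perturbation of the unconditioned path.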
We can define multiple anomaly scores:
- From the latent space, $Z$: for example, the KL-divergence w.r.t. a prior, $p(Z)$.
- From the auxiliary bottleneck, $Z_m$, where each component can be considered as a score.
- Lastly, from the decoder's reconstructions: for example, the reconstruction error.
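The three kinds of score listed above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the KL term assumes a diagonal Gaussian posterior and a standard normal prior, the reconstruction error is taken to be a per-event mean squared error, and the function names are hypothetical rather than taken from the repository.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), one value per event.

    Assumes a diagonal Gaussian posterior; other priors need another formula.
    """
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def reconstruction_error(x, x_hat):
    """Mean squared error per event, averaged over all non-batch axes."""
    axes = tuple(range(1, x.ndim))
    return np.mean((x - x_hat) ** 2, axis=axes)

def latent_component_scores(z_m):
    """Use each component of the 2-dimensional Z_m directly as a score."""
    z_m = np.asarray(z_m)
    return z_m[:, 0], z_m[:, 1]

# Toy inputs: 4 events, 8x8 single-channel images, 16-dimensional Z.
rng = np.random.default_rng(0)
x = rng.random((4, 8, 8, 1))
x_hat = x.copy()
x_hat[0] += 0.5                      # make event 0 poorly reconstructed
mu = np.zeros((4, 16))
mu[0] = 1.0                          # move event 0 away from the prior
log_var = np.zeros((4, 16))
z_m = rng.normal(size=(4, 2))

kl = kl_to_standard_normal(mu, log_var)
mse = reconstruction_error(x, x_hat)
score_a, score_b = latent_component_scores(z_m)
```

In this toy setup only event 0 gets a non-zero KL and reconstruction score, which is the behaviour one would hope for from an anomalous event.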
The repository is organized as follows:
- `ad/` contains the source code that defines layers, models, metrics, etc.
- `weights/` contains pre-trained weights.
- `data/` contains an example of the data used in our experiments.
- The notebooks `codae.ipynb`, `categorical_codvae.ipynb`, `dirichlet_vae.ipynb`, and `qcd_or_what_model.ipynb` show the training of the respective models (and the evaluation of only the latter two).
- `supervised_cct.ipynb` is used to train the supervised classifier.
- `n_tracks.ipynb` provides a comparison of our models against both physics-based and supervised baselines.
- `tf-lite_convert.ipynb` shows how to optimize (quantize) a CoDAE model and measure its inference time.
Installation with a virtual environment (alternatively, open the notebooks in, e.g., Google Colab):
- Clone the repository: `git clone https://github.com/Luca96/dark-autoencoders.git`.
- Change directory: `cd dark-autoencoders`.
- Create the virtual environment (named "venv"): `python -m venv venv`.
- Activate it: `venv\Scripts\activate` (Windows) or `source venv/bin/activate` (UNIX).
- Install dependencies: `pip install -r requirements.txt`.
- (Optional) Install Jupyter notebook (or lab): `pip install notebook` or `pip install jupyterlab`.
Please consider citing our paper if you use any of the provided code or the approach in your own research or project.
@article{anzalone2024triggering,
doi = {10.1088/2632-2153/ad652b},
url = {https://dx.doi.org/10.1088/2632-2153/ad652b},
year = {2024},
month = {sep},
publisher = {IOP Publishing},
volume = {5},
number = {3},
pages = {035064},
author = {Luca Anzalone and Simranjit Singh Chhibra and Benedikt Maier and Nadezda Chernyavskaya and Maurizio Pierini},
title = {Triggering dark showers with conditional dual auto-encoders},
journal = {Machine Learning: Science and Technology},
}