Conditional Dual Auto-Encoders for Anomaly Detection in HEP

We use auto-encoders in an anomaly detection setting to search for SUEP (soft unclustered energy patterns) and SVJ (semi-visible jets) signals in a background of QCD events.

In our journal paper we propose a family of Conditional Dual Auto-Encoder (CoDAE) models that can learn multiple anomaly detection scores from raw images of particle collisions.

There are two encoders: one with high capacity ($f_R$) that captures details in its large bottleneck, $Z$, and a smaller one ($f_m$) responsible for learning a discriminative 2-dimensional latent space, $Z_m$, that can be used directly for anomaly detection.
The encoder $f_m$ is learned by conditioning (operation denoted by blue circles and paths in the figure) $Z$ on $Z_m$.
Then, conditioning occurs multiple times at different resolutions of the decoder, $D$.

We can define multiple anomaly scores:

From the latent space, $Z$: for example, the KL-divergence w.r.t. a prior, $p(Z)$.
From the auxiliary bottleneck, $Z_m$: each of its components can be used as a score.
From the decoder's reconstructions: for example, the reconstruction error.
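
The three kinds of score listed above can be sketched in NumPy as follows. This is a minimal illustration, not the code in ad/: it assumes a diagonal Gaussian posterior over $Z$ with a standard normal prior $p(Z)$ (the models in this repository may use other priors, e.g. categorical or Dirichlet), a mean squared error as the reconstruction error, and hypothetical function names chosen only for this example.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, diag(exp(log_var))) and N(0, I).

    Assumption: Gaussian posterior and standard normal prior.
    Returns one score per event (the last axis is the latent dimension).
    """
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def latent_component_scores(z_m):
    """Use each component of the 2-dimensional bottleneck Z_m as a score."""
    return z_m[:, 0], z_m[:, 1]


def reconstruction_error(x, x_hat):
    """Mean squared error over all non-batch axes, one score per event."""
    axes = tuple(range(1, x.ndim))
    return np.mean((x - x_hat) ** 2, axis=axes)
```

For any of these scores, a larger value would typically mark an event as more anomalous, although the sign convention for the $Z_m$ components depends on how the model was trained.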

Code description

The repository is organized as follows:

ad/ contains the source code that defines layers, models, metrics, etc.
weights/ contains pre-trained weights.
data/ contains an example of the data used in our experiments.
The notebooks codae.ipynb, categorical_codvae.ipynb, dirichlet_vae.ipynb, and qcd_or_what_model.ipynb show the training of the respective models (and the evaluation of only the last two).
supervised_cct.ipynb is used to train the supervised classifier.
n_tracks.ipynb provides a comparison of our models against both physics-based and supervised baselines.
tf-lite_convert.ipynb shows how to optimize (quantize) a CoDAE model and measure its inference time.

Usage

Installation with virtual environment (otherwise open in, e.g., Google Colab):

Clone the repository: git clone https://github.com/Luca96/dark-autoencoders.git.
Change directory: cd dark-autoencoders.
Create the virtual environment (named "venv"): python -m venv venv.
Activate it: venv\Scripts\activate (Windows) or source venv/bin/activate (UNIX).
Install dependencies: pip install -r requirements.txt.
(optional) Install Jupyter notebook (or lab): pip install notebook or pip install jupyterlab.

Citation

Please consider citing our paper if you use any of the provided code or the approach in your own research or project.

@article{anzalone2024triggering,
  doi = {10.1088/2632-2153/ad652b},
  url = {https://dx.doi.org/10.1088/2632-2153/ad652b},
  year = {2024},
  month = {sep},
  publisher = {IOP Publishing},
  volume = {5},
  number = {3},
  pages = {035064},
  author = {Luca Anzalone and Simranjit Singh Chhibra and Benedikt Maier and Nadezda Chernyavskaya and Maurizio Pierini},
  title = {Triggering dark showers with conditional dual auto-encoders},
  journal = {Machine Learning: Science and Technology},
}
