This repository contains the official implementation of the paper Evaluation of Security of ML-based Watermarking: Copy and Removal Attacks.
Generalized diagram explaining the proposed (a) copy and (b) untargeted and targeted removal attacks (using
zero-bit watermarking as the example). The secret carrier and the decision region (shown in gray) are unknown to the attacker.
The vast amounts of digital content captured from the real world or AI-generated media necessitate methods for copyright protection, traceability, or data provenance verification. Digital watermarking serves as a crucial approach to address these challenges. Its evolution spans three generations: handcrafted methods, autoencoder-based schemes, and methods based on foundation models. While the robustness of these systems is well-documented, the security against adversarial attacks remains underexplored. This paper evaluates the security of foundation models' latent space digital watermarking systems that utilize adversarial embedding techniques. A series of experiments investigate the security dimensions under copy and removal attacks, providing empirical insights into these systems' vulnerabilities.
This project uses Conda for environment management. Follow the steps below to set up the necessary environment.
- Conda installed on your system.
- Clone the Repository
git clone https://github.com/vkinakh/ssl-watermarking-attacks.git
cd ssl-watermarking-attacks
- Update submodules
git submodule update --init --recursive
- Create a Conda Environment
conda env create -f environment.yml
- Activate the Environment
conda activate ssl-watermarking-attacks
- **Download Model and Normlayer**
Download the model and normalization layer used in the experiments:
- Model: ResNet-50 trained with DINO
- Normalization layer: whitening
The objective of a copy attack is to maximize the probability of falsely accepting a non-watermarked image as a watermarked one.
Given a watermarked image
It is done by estimating the embedding
Run 0-bit copy attack:
python run_attacks.py \
--attack=copy_0bit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=50000 \ # 50000 is default for 0-bit copy attack
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default, 25 might be enough
--target_fpr=1e-6 \ # 1e-6 is default, method was tested with 1e-7, 1e-8 as well
--path_outputs=/PATH/TO/OUTPUT.csv \ # output dataframe with results, should be csv file
--use_cosine_sim # copy attack works the best with cosine similarity loss
Probability of false acceptance for zero-bit watermarking under copy attack.
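The false-acceptance probability named in the caption above can be estimated empirically as the fraction of copy-attacked images (never watermarked by the owner) that the detector nevertheless accepts. The following is a minimal sketch under that assumption. `detection_rate` and the list of boolean decisions are hypothetical names used for illustration; they are not part of `run_attacks.py` or of its CSV output.

```python
def detection_rate(decisions):
    """Fraction of images for which the detector reported a watermark."""
    decisions = [bool(d) for d in decisions]
    if not decisions:
        raise ValueError("need at least one detector decision")
    return sum(decisions) / len(decisions)


# Hypothetical detector outputs for four copy-attacked images
# (True means the detector accepted the image as watermarked).
copy_attack_decisions = [True, True, False, True]

# Copy attack: every acceptance counts as a false acceptance, because
# the attacked images were never watermarked by the owner.
p_false_accept = detection_rate(copy_attack_decisions)
print(f"empirical probability of false acceptance: {p_false_accept:.2f}")
```

The same helper gives an empirical miss rate as `1 - detection_rate(...)` when the decisions come from images that were watermarked before a removal attack.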
Run multibit copy attack:
python run_attacks.py \
--attack=copy_multibit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=50000 \ # 50000 is default for all multibit attacks
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default
--num_bits=30 \ # method was tested with 10, 30 and 100 bits
--path_outputs=/PATH/TO/OUTPUT.csv \ # output dataframe with results, should be csv file
--use_cosine_sim # copy attack works the best with cosine similarity loss
Bit Error Rate (BER) for multi-bit watermarking under the copy attack.
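For reference, the bit error rate is the fraction of message bits that the decoder gets wrong. A minimal sketch of that standard definition follows. The function name and the example messages are illustrative only, and the repository may compute BER differently, for instance averaged over a batch of images.

```python
import numpy as np


def bit_error_rate(reference_bits, decoded_bits):
    """Fraction of positions where the decoded bits differ from the reference."""
    reference = np.asarray(reference_bits, dtype=bool)
    decoded = np.asarray(decoded_bits, dtype=bool)
    if reference.shape != decoded.shape:
        raise ValueError("bit sequences must have the same shape")
    return float(np.mean(reference != decoded))


# Hypothetical 10-bit message and a decoded message with 3 flipped bits.
message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
decoded = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]
print(bit_error_rate(message, decoded))
```

A BER near 0 means the decoder recovers the reference message almost perfectly, while a BER near 0.5 is what random guessing would give.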
The watermark removal attack modifies the watermarked image to maximize the probability of missed detection (zero-bit watermarking) or the bit error rate (BER) (multi-bit watermarking).
The targeted removal attack generates an attacked image
Target selection plays an important role in the success of the targeted removal attack.
Three strategies are considered:
- Choosing any random non-watermarked image ${\bf x}_t$.
- Setting the target to be a heavily degraded version of ${\bf x}_w$ where the watermark is no longer detected.
- Selecting a random watermarking carrier as the new target.
Run 0-bit random non-watermarked image removal attack:
python run_attacks.py \
--attack=remove_other_0bit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=1 \ # 1 is default, values between 1 and 50000 give decent results
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default
--target_fpr=1e-6 \ # 1e-6 is default, method was tested with 1e-7, 1e-8 as well
--path_outputs=/PATH/TO/OUTPUT.csv # output dataframe with results, should be csv file
Run multibit random non-watermarked image removal attack:
python run_attacks.py \
--attack=remove_other_multibit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=50000 \ # 50000 is default for all multibit attacks
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default
--num_bits=30 \ # method was tested with 10, 30 and 100 bits
--path_outputs=/PATH/TO/OUTPUT.csv # output dataframe with results, should be csv file
Wiener filter with kernel size 25
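The caption above refers to degrading the watermarked image with a Wiener filter of kernel size 25. The following is a minimal sketch of such a degradation, assuming SciPy's `scipy.signal.wiener` applied independently to each color channel and an 8-bit value range. `degrade_with_wiener` is a hypothetical helper name, and the repository's own implementation may differ.

```python
import numpy as np
from scipy.signal import wiener


def degrade_with_wiener(image, kernel_size=25):
    """Smooth an (H, W) or (H, W, C) image with a per-channel Wiener filter."""
    image = np.asarray(image, dtype=np.float64)
    if image.ndim == 2:
        filtered = wiener(image, mysize=kernel_size)
    else:
        # Filter each channel separately so colors are not mixed.
        channels = [
            wiener(image[..., c], mysize=kernel_size)
            for c in range(image.shape[-1])
        ]
        filtered = np.stack(channels, axis=-1)
    # Assumption: pixel values live in [0, 255].
    return np.clip(filtered, 0.0, 255.0)


# Stand-in for a watermarked image: random pixels in [0, 255].
rng = np.random.default_rng(0)
watermarked = rng.uniform(0.0, 255.0, size=(64, 64, 3))
degraded = degrade_with_wiener(watermarked, kernel_size=25)
print(watermarked.std(), degraded.std())
```

A larger kernel averages over a wider neighborhood, which matches the note for `--wiener_filter_size` that a bigger filter distorts the image more.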
Run 0-bit heavily degraded version of watermarked image removal attack:
python run_attacks.py \
--attack=remove_denoise_0bit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=50000 \ # 50000 is default, values between 1 and 50000 give decent results
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default
--target_fpr=1e-6 \ # 1e-6 is default, method was tested with 1e-7, 1e-8 as well
--path_outputs=/PATH/TO/OUTPUT.csv \ # output dataframe with results, should be csv file
--wiener_filter_size=25 # 25 is default wiener filter size, bigger filter distorts image more
Run multibit heavily degraded version of watermarked image removal attack:
python run_attacks.py \
--attack=remove_denoise_multibit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=50000 \ # 50000 is default for all multibit attacks
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default
--num_bits=30 \ # method was tested with 10, 30 and 100 bits
--path_outputs=/PATH/TO/OUTPUT.csv \ # output dataframe with results, should be csv file
--wiener_filter_size=25 # 25 is default wiener filter size, bigger filter distorts image more
Run 0-bit random watermarking carrier removal attack:
python run_attacks.py \
--attack=remove_random_0bit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=50000 \ # 50000 is default, values between 1 and 50000 give decent results
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default
--target_fpr=1e-6 \ # 1e-6 is default, method was tested with 1e-7, 1e-8 as well
--path_outputs=/PATH/TO/OUTPUT.csv \ # output dataframe with results, should be csv file
--use_cosine_sim # works the best with cosine similarity
Run multibit random watermarking carrier removal attack:
python run_attacks.py \
--attack=remove_random_multibit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=50000 \ # 50000 is default for all multibit attacks
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default
--num_bits=30 \ # method was tested with 10, 30 and 100 bits
--path_outputs=/PATH/TO/OUTPUT.csv # output dataframe with results, should be csv file
Probability of miss for zero-bit watermarking under targeted removal attack with different target image selection strategies.
Bit Error Rate for multi-bit watermarking under targeted removal attack with different target image selection strategies.
Run 0-bit untargeted removal attack:
python run_attacks.py \
--attack=untargeted_remove_0bit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=50000 \ # 50000 works the best
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default
--target_fpr=1e-6 \ # 1e-6 is default, method was tested with 1e-7, 1e-8 as well
--path_outputs=/PATH/TO/OUTPUT.csv # output dataframe with results, should be csv file
Run multibit untargeted removal attack:
python run_attacks.py \
--attack=untargeted_remove_multibit \
--model=<MODEL NAME> \ # currently supports resnet50
--path_backbone=/PATH/TO/BACKBONE \ # currently supports DINO resnet50
--path_norm_layer=/PATH/TO/NORMLAYER \
--path_images=/PATH/TO/DIRECTORY/WITH/IMAGES \
--transform=none \ # supports all, none
--psnr_wm=42 \ # 42 is default
--psnr_attack=42 \ # 42 is default
--lambda_w=50000 \ # 50000 is default for all multibit attacks
--lambda_i=1 \ # 1 is default for all attacks
--epochs=100 \ # 100 is default
--num_bits=30 \ # method was tested with 10, 30 and 100 bits
--path_outputs=/PATH/TO/OUTPUT.csv # output dataframe with results, should be csv file
Probability of miss for zero-bit watermarking under untargeted removal attack.
Bit Error Rate for multi-bit watermarking under untargeted removal attack.
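All commands above take `--psnr_wm` and `--psnr_attack`, which are presumably PSNR values in dB for the watermarked and the attacked image, respectively. As a reminder of what a value such as 42 dB means, here is a minimal sketch of the standard PSNR definition for 8-bit images. The function names are illustrative, and the repository may compute or enforce PSNR differently.

```python
import numpy as np


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    dist = np.asarray(distorted, dtype=np.float64)
    if ref.shape != dist.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((ref - dist) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def rmse_for_psnr(psnr_db, max_value=255.0):
    """Root-mean-square error that corresponds to a given PSNR in dB."""
    return max_value / (10.0 ** (psnr_db / 20.0))


# A uniform error of one gray level on an 8-bit image.
original = np.zeros((8, 8))
shifted = np.ones((8, 8))
print(psnr(original, shifted))
print(rmse_for_psnr(42.0))
```

Under this definition, 42 dB corresponds to a root-mean-square error of roughly 2 gray levels on a 0 to 255 scale.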
Other parameters:
--optimizer=Adam,lr=0.01 # optimizer and parameters in format "optimizer,parameters", supports optimizers from `torch.optim`, default "Adam,lr=0.01"
--scheduler=None # scheduler and parameters in format "scheduler,parameters", supports schedulers from `torch.optim.lr_scheduler`, default None
--seed=42 # seed for reproducibility
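The `"optimizer,parameters"` and `"scheduler,parameters"` strings can be read as a class name followed by comma-separated keyword arguments. The sketch below shows one way such a string could be parsed. `parse_spec` is a hypothetical helper written for illustration and is not the repository's actual parsing code.

```python
import ast


def parse_spec(spec):
    """Split "Name,key=value,..." into (name, kwargs); "None" gives (None, {})."""
    if spec is None or spec.strip() == "None":
        return None, {}
    name, *pairs = [part.strip() for part in spec.split(",")]
    kwargs = {}
    for pair in pairs:
        key, _, raw = pair.partition("=")
        try:
            # Turn "0.01" into a float, "30" into an int, "True" into a bool.
            kwargs[key.strip()] = ast.literal_eval(raw.strip())
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal is kept as a string.
            kwargs[key.strip()] = raw.strip()
    return name, kwargs


name, kwargs = parse_spec("Adam,lr=0.01")
print(name, kwargs)

# With PyTorch installed, the optimizer could then be built along these lines:
#   optimizer = getattr(torch.optim, name)(params, **kwargs)
```

Because the string is split on commas, this sketch does not handle values that themselves contain commas, such as a tuple of betas.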
@inproceedings {kinakh2024wifs,
author = { Kinakh, Vitaliy and Pulfer, Brian and Belousov, Yury and Fernandez, Pierre and Furon, Teddy and Voloshynovskiy, Slava },
booktitle = { 16th IEEE International Workshop on Information Forensics and Security (WIFS) },
title = { Evaluation of Security of ML-based Watermarking: Copy and Removal Attacks },
address = { Roma, Italy },
month = { December },
year = { 2024 }
}