Defending against adversarial attacks by randomized diversification

O. Taran, S. Rezaeifar, T. Holotyak, and S. Voloshynovskiy

2019_cvpr_008.png

 
Citation
O. Taran, S. Rezaeifar, T. Holotyak, and S. Voloshynovskiy, "Defending against adversarial attacks by randomized diversification," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019.


Code: PyTorch
If you have questions about our PyTorch code, please contact us.

The research was supported by the SNF project No. 200021_182063.


 

Abstract

 

The vulnerability of machine learning systems to adversarial attacks calls into question their use in many applications. We propose randomized diversification as a defense strategy. We introduce a multi-channel architecture in a gray-box scenario, which assumes that the architecture of the classifier and the training data set are known to the attacker. The only things the attacker cannot access are the secret key and the internal states of the system at test time. The defender processes an input in multiple channels. Each channel introduces its own randomization in a special transform domain based on a secret key shared between the training and testing stages. Such transform-based randomization with a shared key preserves the gradients in key-defined sub-spaces for the defender, but it prevents gradient back propagation and the creation of various bypass systems by the attacker. An additional benefit of multi-channel randomization is the aggregation that fuses the soft outputs from all channels, thus increasing the reliability of the final score. Sharing a secret key gives the defender an information advantage. Experimental evaluation demonstrates an increased robustness of the proposed method to a number of known state-of-the-art attacks.

 
2019_CVPR_general_schema.png

 

Fig. 1: Setup under investigation: the attacker knows the labeled training data set X and the system architecture but has no access to the defender's secret key k, which is shared between training and testing.

 

Multi-channel classification algorithm

 

A multi-channel classifier shown in Fig. 2 forms the core of the proposed architecture and consists of four main building blocks:

  1. Pre-processing of the input data in a transform domain via a mapping 2019_cvpr_001.png.
  2. Data-independent processing 2019_cvpr_002.png, which serves as a defense against gradient back propagation to the direct domain.
  3. A classification block, which can be represented by any family of classifiers.
  4. An aggregation block, which can be represented by any operation ranging from a simple summation to learnable operators adapted to the data or to a particular adversarial attack.

The chain of the first three blocks can be organized in a parallel multi-channel structure that is followed by one or several aggregation blocks. The final decision about the class is based on the aggregated result. A rejection option can also be naturally envisioned.
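
The four blocks and the parallel structure can be illustrated with a minimal sketch. It is not the authors' PyTorch implementation: the transform is taken to be the identity, the key-based processing is a plain sign mask, the classifier is a hypothetical nearest-prototype model, and the names `make_channel`, `classify` and `reject_below` are assumptions introduced only for this example. Aggregation is an average of the soft outputs, which is a summation up to scaling.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sign_mask(key, dim):
    """Key-seeded vector of +1/-1 values (the secret, data-independent part)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]


def make_channel(key, prototypes):
    """Build one channel: key-based sign flipping followed by a toy classifier.

    The nearest-prototype classifier is a stand-in for any classifier family;
    it is "trained" on prototypes randomized with the same key (assumption).
    """
    mask = sign_mask(key, len(prototypes[0]))
    flipped = [[m * v for m, v in zip(mask, p)] for p in prototypes]

    def channel(x):
        z = [m * v for m, v in zip(mask, x)]  # data-independent processing
        # Negative squared distance to each class prototype acts as a score.
        scores = [-sum((a - b) ** 2 for a, b in zip(z, p)) for p in flipped]
        return softmax(scores)

    return channel


def classify(x, channels, reject_below=0.0):
    """Average the soft outputs of all channels and decide.

    Returns (label, averaged_scores); label is None when the best averaged
    score falls below `reject_below` (a simple rejection option).
    """
    outputs = [ch(x) for ch in channels]
    avg = [sum(col) / len(outputs) for col in zip(*outputs)]
    best = max(range(len(avg)), key=avg.__getitem__)
    return (best if avg[best] >= reject_below else None), avg


# Usage: three channels with different keys, two classes.
prototypes = [[0.0, 0.0, 0.0, 0.0], [4.0, 4.0, 4.0, 4.0]]
channels = [make_channel(key, prototypes) for key in (1, 2, 3)]
label, scores = classify([3.5, 4.0, 4.2, 3.9], channels)
```

Because the same key is used when building and when applying a channel, the defender's classifier sees consistent inputs, while an attacker without the key cannot reproduce the masked representation.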

 

2019_CVPR_main_schema.png

Fig. 2: Generalized diagram of the proposed multi-channel classifier.

The training of the described algorithm can be represented as:

2019_cvpr_003.png
with

2019_cvpr_004.png

where 2019_cvpr_101.png is a classification loss, 2019_cvpr_102.png is a vectorized class label of the sample 2019_cvpr_103.png, 2019_cvpr_104.png corresponds to the aggregation operator with parameters 2019_cvpr_105.png, 2019_cvpr_106.png is the ith classifier of the jth channel, 2019_cvpr_107.png denotes the parameters of the classifier, 2019_cvpr_108.png is the number of training samples, 2019_cvpr_109.png is the total number of channels, and 2019_cvpr_110.png is the number of classifiers per channel.

 

Classification with multi-channel local sign permutation in the DCT domain

 

The generalized diagram of the multi-channel architecture with local sign permutation in the DCT domain is illustrated in Fig. 3. The idea behind local sign permutation is that the DCT domain can be split into overlapping or non-overlapping sub-bands of different sizes. For simplicity and interpretability, we split the DCT domain into 4 sub-bands: (1) the top-left sub-band, which represents the low frequencies of the image, and the (2) vertical, (3) horizontal and (4) diagonal sub-bands, as shown in Fig. 4. DCT sign flipping is applied as a secret key-based randomization in one sub-band at a time, keeping all other sub-bands unchanged.

2019_cvpr_005.png

Fig. 4: Local randomization in the DCT sub-bands by key-based sign flipping.
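
A minimal sketch of key-based sign flipping in a single DCT sub-band follows. It assumes a square image with an even side length, an orthonormal DCT-II, and a split of the DCT plane into four equal quadrants. The mapping of the names "vertical" and "horizontal" to particular quadrants, the use of Python's `random.Random` as the key-driven generator, and all function names are assumptions made for illustration, not details taken from the paper's code.

```python
import math
import random


def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C times its transpose is identity."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * m + 1) * k / (2 * n))
                     for m in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(x):
    """2D DCT of a square image: C X C^T."""
    c = dct_matrix(len(x))
    return matmul(matmul(c, x), transpose(c))


def idct2(y):
    """Inverse 2D DCT: C^T Y C."""
    c = dct_matrix(len(y))
    return matmul(matmul(transpose(c), y), c)


def subband(n, band):
    """Row and column ranges of one quadrant (quadrant naming is assumed)."""
    h = n // 2
    return {
        "low": (range(0, h), range(0, h)),
        "vertical": (range(0, h), range(h, n)),
        "horizontal": (range(h, n), range(0, h)),
        "diagonal": (range(h, n), range(h, n)),
    }[band]


def flip_signs(coeffs, band, key):
    """Flip the sign of a key-selected subset of coefficients in one sub-band."""
    rng = random.Random(key)
    rows, cols = subband(len(coeffs), band)
    out = [list(row) for row in coeffs]
    for r in rows:
        for c in cols:
            if rng.random() < 0.5:  # key decides which coefficients flip
                out[r][c] = -out[r][c]
    return out


def randomize(image, band, key):
    """DCT, key-based sign flipping in one sub-band, then inverse DCT."""
    return idct2(flip_signs(dct2(image), band, key))


# Usage: randomize the diagonal sub-band of a small test image.
image = [[float((3 * r + 5 * c) % 7) for c in range(8)] for r in range(8)]
protected = randomize(image, "diagonal", key=42)
```

Sign flipping with a fixed key is its own inverse, so applying `randomize` twice with the same key and sub-band returns the original image up to floating-point error, and coefficient magnitudes are never changed.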

 

 
2019_cvpr_007.png

Fig. 3: Classification with local DCT sign permutations.

The results are shown in Table 1. The column "Classical" corresponds to the results of the one-channel classical classifier on the original non-permuted data. Note that in the proposed scheme with 12 and 15 sub-channels, the classification error on the adversarial examples matches that on the original data and, in some cases, is even lower.

Table 1: Classification error (%) on the first 1000 test samples for the DCT domain with local sign flipping in 3 sub-bands (J = 3).
2019_cvpr_006.png
 

Conclusions

 

In our paper, we address the problem of protection against adversarial attacks in classification systems. We propose a randomized diversification mechanism as a defense strategy in a multi-channel architecture with aggregation of the classifiers' scores. Randomized diversification is a secret key-based randomization in a defined domain. The goal of this randomization is to prevent gradient back propagation or the use of bypass systems by the attacker. We evaluate the efficiency of the proposed defense and the performance of several variations of the new architecture on three standard data sets against a number of known state-of-the-art attacks. The numerical results demonstrate the robustness of the proposed defense mechanism against adversarial attacks and show that the multi-channel architecture with subsequent aggregation stabilizes the results and increases the classification accuracy.

 
In future work, we aim to investigate the proposed defense strategy against gradient-based sparse attacks and non-gradient-based attacks.