Acta Mathematica Scientia, Series B ›› 2022, Vol. 42 ›› Issue (6): 2399-2418. doi: 10.1007/s10473-022-0613-y


ACHIEVING OPTIMAL ADVERSARIAL ACCURACY FOR ADVERSARIAL DEEP LEARNING USING STACKELBERG GAMES

Xiao-shan GAO, Shuang LIU, Lijia YU   

  1. Academy of Mathematics and Systems Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2022-07-17 Online: 2022-12-25 Published: 2022-12-16
  • Contact: Xiao-shan GAO, E-mail: xgao@mmrc.iss.ac.cn
  • Supported by:
    This work was partially supported by NSFC (12288201) and NKRDP grant (2018YFA0704705).

Abstract: The purpose of adversarial deep learning is to train robust DNNs against adversarial attacks, and this is one of the major research focuses of deep learning. Game theory has been used to answer some of the basic questions about adversarial deep learning, such as those regarding the existence of a classifier with optimal robustness and the existence of optimal adversarial samples for a given class of classifiers. In most previous works, adversarial deep learning was formulated as a simultaneous game and the strategy spaces were assumed to be certain probability distributions in order for the Nash equilibrium to exist. However, this assumption is not applicable to practical situations. In this paper, we give answers to these basic questions for the practical case where the classifiers are DNNs with a given structure; we do that by formulating adversarial deep learning in the form of Stackelberg games. The existence of Stackelberg equilibria for these games is proven. Furthermore, it is shown that the equilibrium DNN has the largest adversarial accuracy among all DNNs with the same structure when the Carlini-Wagner margin loss is used. The trade-off between robustness and accuracy in adversarial deep learning is also studied from a game-theoretic perspective.
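
The sequential structure described in the abstract can be illustrated with a toy sketch. It is not the construction used in the paper. It assumes that the classifier moves first (leader) and the adversary responds with a worst-case perturbation (follower), and it uses one common form of the Carlini-Wagner margin loss (largest wrong logit minus the true logit). The one-parameter model `logits_1d`, the data set, the attack budget `eps`, and the grid searches are all hypothetical choices made only for illustration.

```python
import math

def cw_margin_loss(logits, label):
    """Margin loss (assumed form): largest wrong logit minus the true logit.
    It is negative exactly when the sample is classified correctly."""
    wrong = [z for i, z in enumerate(logits) if i != label]
    return max(wrong) - logits[label]

def logits_1d(theta, x):
    """Hypothetical two-class model on the real line with threshold theta."""
    s = math.tanh(x - theta)
    return [-s, s]

def follower_best_response(theta, x, label, eps, steps=20):
    """Adversary (follower): grid-search the perturbation in [-eps, eps]
    that maximizes the margin loss against the already chosen theta."""
    deltas = [-eps + 2 * eps * k / steps for k in range(steps + 1)]
    return max(deltas, key=lambda d: cw_margin_loss(logits_1d(theta, x + d), label))

def robust_loss(theta, data, eps):
    """Mean loss of the classifier once the follower has responded."""
    total = 0.0
    for x, y in data:
        d = follower_best_response(theta, x, y, eps)
        total += cw_margin_loss(logits_1d(theta, x + d), y)
    return total / len(data)

def adversarial_accuracy(theta, data, eps):
    """Fraction of samples still classified correctly under the worst-case
    perturbation found by the grid search."""
    correct = 0
    for x, y in data:
        d = follower_best_response(theta, x, y, eps)
        if cw_margin_loss(logits_1d(theta, x + d), y) < 0:
            correct += 1
    return correct / len(data)

def leader_choice(data, eps, thetas):
    """Classifier (leader): pick the parameter that minimizes the loss,
    anticipating the follower's best response to each candidate."""
    return min(thetas, key=lambda t: robust_loss(t, data, eps))

# Hypothetical toy data: (feature, label) pairs, and a grid of thresholds.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
thetas = [k / 4 for k in range(-8, 9)]
theta_star = leader_choice(data, eps=0.5, thetas=thetas)
print(theta_star, adversarial_accuracy(theta_star, data, eps=0.5))
```

In this sketch the leader's objective is evaluated only after the follower has responded, which is what separates a Stackelberg formulation from a simultaneous game. The exhaustive grid search merely stands in for the optimization over DNN parameters and perturbations that the paper treats in general.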

Key words: adversarial deep learning, Stackelberg game, optimal robust DNN, universal adversarial attack, adversarial accuracy, trade-off result

CLC Number: 

  • 62F35