Published November 9, 2020 | Submitted
Report | Open

Robustifying Binary Classification to Adversarial Perturbation

Abstract

Despite the enormous success of machine learning models in various applications, most of these models lack resilience to (even small) perturbations of their input data. Hence, new methods to robustify machine learning models are essential. To this end, in this paper we consider the problem of binary classification with adversarial perturbations. By investigating the solution to a min-max optimization problem (which considers the worst-case loss in the presence of adversarial perturbations), we introduce a generalization of the max-margin classifier that takes into account the power of the adversary to manipulate the data. We refer to this classifier as the "Robust Max-margin" (RM) classifier. Under some mild assumptions on the loss function, we theoretically show that the gradient descent iterates (with sufficiently small step size) converge in direction to the RM classifier. Therefore, the RM classifier can be studied to compute various performance measures (e.g., generalization error) of binary classification with adversarial perturbations.
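As a rough illustration of the setup described in the abstract (not the paper's own code): for a linear classifier w under ℓ2-bounded perturbations of size ε, the inner maximization of the min-max problem has the standard closed form min over ‖δ‖₂ ≤ ε of y⟨w, x + δ⟩ = y⟨w, x⟩ − ε‖w‖₂, so gradient descent can be run directly on the resulting robust loss. The sketch below uses this closed form with the logistic loss; the toy data, the budget EPS, the step size, and the iteration count are all illustrative assumptions.

```python
import numpy as np

# Sketch of adversarial (min-max) training for a linear binary classifier.
# Assumptions: logistic loss, l2-bounded perturbations of size EPS, and
# synthetic separable toy data. For linear classifiers the worst-case
# margin has the closed form
#   min_{||delta||_2 <= EPS} y * <w, x + delta> = y * <w, x> - EPS * ||w||_2,
# so the adversarial loss reduces to an ordinary loss on a shrunken margin.

rng = np.random.default_rng(0)
EPS = 0.3    # adversary's l2 budget (illustrative)
STEP = 0.1   # gradient-descent step size (illustrative)

# Linearly separable toy data with labels y in {-1, +1}.
n, d = 200, 2
X = rng.normal(size=(n, d)) * 0.5 + 2.0
y = np.ones(n)
X[n // 2:] *= -1.0
y[n // 2:] = -1.0

def robust_loss_grad(w):
    """Gradient of the average logistic loss evaluated on the robust
    margin y<w,x> - EPS*||w||_2 (the closed-form worst case)."""
    norm_w = np.linalg.norm(w) + 1e-12
    margins = y * (X @ w) - EPS * norm_w
    # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)); clip to avoid overflow.
    coeff = -1.0 / (1.0 + np.exp(np.clip(margins, -50.0, 50.0)))
    grad_margin = y[:, None] * X - EPS * (w / norm_w)[None, :]
    return (coeff[:, None] * grad_margin).mean(axis=0)

w = rng.normal(size=d) * 0.01
for _ in range(20000):
    w -= STEP * robust_loss_grad(w)

# On robustly separable data ||w|| grows without bound, but the
# *direction* w / ||w|| converges -- per the paper's main claim,
# to the robust max-margin (RM) direction.
print("direction:", w / np.linalg.norm(w))
```

Note that the directional convergence is slow (the margin improves only logarithmically in the iteration count), which is why the sketch runs many iterations even on toy data.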

Files

Submitted - 2010.15391.pdf (369.3 kB, md5:16fc0f1ab160c716cbf1a10e54d69872)

Additional details

Created: August 19, 2023
Modified: March 5, 2024