@article{zhang2019theoretically,
title={Theoretically Principled Trade-off between Robustness and Accuracy},
author={Zhang, Hongyang and Yu, Yaodong and Jiao, Jiantao and Xing, Eric P and Ghaoui, Laurent El and Jordan, Michael I},
journal={arXiv: Learning},
year={2019}}
Overview
Starting from binary classification, the paper splits $\mathcal{R}_{rob}$ into $\mathcal{R}_{nat}$ and $\mathcal{R}_{bdy}$, builds a loss function from an upper bound on $\mathcal{R}_{rob}-\mathcal{R}_{nat}^*$, and then extends the idea to general multi-class classification.
Main content
Notation
$X, Y$: random variables;
$x\in \mathcal{X}, y$: a sample and its label ($1, -1$);
$f$: the classifier (e.g. a neural network);
$\mathbb{B}(x, \epsilon)$: $\{x'\in \mathcal{X}:\|x'-x\| \le \epsilon\}$;
$\mathbb{B}(DB(f),\epsilon)$: $\{x \in \mathcal{X}: \exists x'\in \mathbb{B}(x,\epsilon), \ \mathrm{s.t.} \ f(x)f(x')\le 0\}$;
$\psi^*(v)$: $\sup_u\{u^Tv-\psi(u)\}$, the conjugate function;
$\phi$: surrogate loss.
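As a quick check on the conjugate-function definition, here is a worked example of my own (not taken from the paper), assuming the scalar function $\psi(u)=\tfrac{1}{2}u^2$. The supremum is attained at $u=v$, so

$$
\psi^*(v) = \sup_u \left\{uv - \tfrac{1}{2}u^2\right\} = v\cdot v - \tfrac{1}{2}v^2 = \tfrac{1}{2}v^2 .
$$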
Error
$$
\mathcal{R}_{rob}(f):= \mathbb{E}_{(X,Y)\sim \mathcal{D}}\mathbf{1}\{\exists X' \in \mathbb{B}(X, \epsilon), \ \mathrm{s.t.} \ f(X')Y \le 0\}, \tag{e.1}
$$
where $\mathbf{1}(\cdot)$ denotes the indicator function. Clearly, $\mathcal{R}_{rob}(f)$ is the measure of the set of samples for which the classifier $f$ admits an adversarial sample.
$$
\mathcal{R}_{nat}(f) :=\mathbb{E}_{(X,Y)\sim \mathcal{D}}\mathbf{1}\{f(X)Y \le 0\}, \tag{e.2}
$$
Clearly, $\mathcal{R}_{nat}(f)$ is the probability that $f$ misclassifies a natural sample, and $\mathcal{R}_{rob} \ge \mathcal{R}_{nat}$.
$$
\mathcal{R}_{bdy}(f) :=\mathbb{E}_{(X,Y)\sim \mathcal{D}}\mathbf{1}\{X \in \mathbb{B}(DB(f), \epsilon), \ f(X)Y > 0\}, \tag{e.3}
$$
Clearly,
$$
\mathcal{R}_{rob}-\mathcal{R}_{nat}=\mathcal{R}_{bdy}. \tag{1}
$$
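To make the decomposition concrete, the following is a minimal sketch that counts the three risks on a toy 1-D problem and checks that they add up. Everything in it is an assumption made for illustration: the linear score `f(x) = x - 0.5`, the synthetic samples with 10% flipped labels, and `eps = 0.05`. Because `f` is linear, the existence of an adversarial point in the interval can be decided by looking at the two endpoints only.

```python
import random

def f(x):
    # Hypothetical linear score function; its decision boundary is x = 0.5.
    return x - 0.5

def count_risks(samples, eps):
    """Return (rob, nat, bdy) as counts over the samples."""
    rob = nat = bdy = 0
    for x, y in samples:
        lo, hi = f(x - eps), f(x + eps)
        # Natural error: the clean point itself is misclassified.
        misclassified = f(x) * y <= 0
        # f is linear, so f(x') * y over [x - eps, x + eps] is minimized at an endpoint.
        adversarial_exists = min(lo * y, hi * y) <= 0
        # Membership in B(DB(f), eps): some x' in the ball has f(x) f(x') <= 0.
        near_boundary = min(lo * f(x), hi * f(x)) <= 0
        rob += adversarial_exists
        nat += misclassified
        bdy += near_boundary and not misclassified
    return rob, nat, bdy

random.seed(0)
samples = []
for _ in range(1000):
    x = random.random()
    y = 1 if x > 0.5 else -1
    if random.random() < 0.1:  # flip 10% of the labels so that some points are misclassified
        y = -y
    samples.append((x, y))

rob, nat, bdy = count_risks(samples, eps=0.05)
print(rob, nat, bdy)
```

The counts satisfy `rob == nat + bdy` because a misclassified point is its own adversarial sample, while a correctly classified point has an adversarial sample exactly when it lies within `eps` of the decision boundary.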
Because optimizing the $0$-$1$ loss directly is hard, we usually use a surrogate loss $\phi$ and define:
$$
\mathcal{R}_{\phi}(f):= \mathbb{E}_{(X, Y) \sim \mathcal{D}} \phi(f(X)Y), \\ \mathcal{R}^*_{\phi}:= \min_f \mathcal{R}_{\phi}(f).
$$
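As an illustration of the surrogate risk, here is a small sketch that evaluates the empirical version of $\mathcal{R}_{\phi}(f)$ for two common surrogates and compares it with the empirical $0$-$1$ risk. The choice of hinge and base-2 logistic losses, the score function `f`, and the six samples are all assumptions made for this example; both losses satisfy $\phi(0)=1$ and upper-bound the $0$-$1$ loss.

```python
import math

def zero_one(a):
    # 0-1 loss on the margin a = f(x) * y.
    return 1.0 if a <= 0 else 0.0

def hinge(a):
    return max(0.0, 1.0 - a)

def logistic(a):
    # Base-2 logarithm so that logistic(0) = 1.
    return math.log2(1.0 + math.exp(-a))

def empirical_risk(loss, f, samples):
    """Average of loss(f(x) * y) over the samples."""
    return sum(loss(f(x) * y) for x, y in samples) / len(samples)

def f(x):
    # Hypothetical score function.
    return 2.0 * x - 1.0

# Hypothetical (x, y) pairs with labels in {+1, -1}.
samples = [(0.1, -1), (0.4, -1), (0.45, 1), (0.6, 1), (0.9, 1), (0.7, -1)]

r01 = empirical_risk(zero_one, f, samples)
r_hinge = empirical_risk(hinge, f, samples)
r_log = empirical_risk(logistic, f, samples)
print(r01, r_hinge, r_log)
```

Since each surrogate dominates the $0$-$1$ loss pointwise, both surrogate risks are at least as large as the $0$-$1$ risk on any sample set.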
Classification-calibrated surrogate loss
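This part is important, but the paper spends little space on it and I have not fully understood it; I will discuss it after going back to the cited papers.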


Lemma 2.1

Theorem 3.1
Under Assumption 1 and $\phi(0)\ge1$, for any measurable function $f:\mathcal{X} \rightarrow \mathbb{R}$, any probability distribution on $\mathcal{X}\times \{\pm 1\}$, and any $\lambda > 0$, we have
$$
\begin{array}{ll} & \mathcal{R}_{rob}(f) - \mathcal{R}_{nat}^* \\ \le & \psi^{-1}(\mathcal{R}_{\phi}(f)-\mathcal{R}_{\phi}^*) + \mathbf{Pr}[X \in \mathbb{B}(DB(f), \epsilon), f(X)Y >0] \\ \le & \psi^{-1}(\mathcal{R}_{\phi}(f)-\mathcal{R}_{\phi}^*) + \mathbb{E} \max _{X' \in \mathbb{B}(X, \epsilon)} \phi(f(X')f(X)/\lambda). \\ \end{array}
$$
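The last inequality holds, as far as I can tell, because whenever $X \in \mathbb{B}(DB(f), \epsilon)$ there is an $X' \in \mathbb{B}(X, \epsilon)$ with $f(X')f(X) \le 0$, and for that $X'$
$$
\phi(f(X')f(X)/\lambda) \ge 1,
$$
so the indicator inside the probability is bounded by the maximum.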
Theorem 3.2

Combining Theorems 3.1 and 3.2, we can see that this bound is tight.
The TRADES algorithm derived from it
For binary classification, minimize the upper bound, i.e.:
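Reading the objective directly off the last line of Theorem 3.1 (my own reconstruction: the constant $\mathcal{R}_{\phi}^*$ is dropped and $\psi^{-1}$ is ignored, so treat it as a sketch rather than a quotation), it has the form

$$
\min_f \ \mathbb{E}\Big\{ \phi(f(X)Y) + \max_{X'\in\mathbb{B}(X,\epsilon)} \phi(f(X)f(X')/\lambda) \Big\}.
$$

The first term pushes the natural error down, and the second term pushes the decision boundary away from the data.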

To extend this to multi-class classification, it suffices to:
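As I understand the paper's extension, the binary surrogate is replaced by a multi-class calibrated loss such as cross-entropy. The symbol $\mathcal{L}$ is notation assumed here, and $f(X)$ now denotes the output vector of the classifier:

$$
\min_f \ \mathbb{E}\Big\{ \mathcal{L}(f(X), Y) + \max_{X'\in\mathbb{B}(X,\epsilon)} \mathcal{L}(f(X), f(X'))/\lambda \Big\}.
$$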

The algorithm is as follows:
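As a rough stand-in for the procedure, here is a minimal pure-Python sketch of the alternating scheme on a toy 1-D binary problem. It is not the paper's pseudocode: the linear model `f(x) = w * x + b`, the base-2 logistic surrogate `phi`, the helper names `attack` and `train`, the dataset, and all hyperparameters are assumptions chosen for illustration. The inner loop runs signed-gradient ascent on the robustness term and projects back onto $[x-\epsilon, x+\epsilon]$; the outer loop takes a gradient step on the sum of the accuracy term and the robustness term, treating the adversarial point as fixed.

```python
import math

def phi(a):
    # Logistic surrogate, scaled so that phi(0) = 1.
    return math.log2(1.0 + math.exp(-a))

def dphi(a):
    # Derivative of phi with respect to its argument.
    return -1.0 / ((1.0 + math.exp(a)) * math.log(2.0))

def sign(v):
    return (v > 0) - (v < 0)

def attack(w, b, x, eps, lam, eta, k):
    """Inner loop: maximize phi(f(x) f(x') / lam) over x' in [x - eps, x + eps]."""
    fx = w * x + b
    x_adv = x
    for _ in range(k):
        # Gradient of the robustness term with respect to x'.
        grad = dphi(fx * (w * x_adv + b) / lam) * fx * w / lam
        # Signed ascent step followed by projection onto the interval.
        x_adv = min(max(x_adv + eta * sign(grad), x - eps), x + eps)
    return x_adv

def train(xs, ys, eps=0.3, lam=1.0, eta=0.1, k=5, lr=0.1, epochs=200, w=0.1, b=0.0):
    """Outer loop: gradient descent on phi(f(x) y) + phi(f(x) f(x') / lam)."""
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            x_adv = attack(w, b, x, eps, lam, eta, k)  # held fixed for the update
            fx, fa = w * x + b, w * x_adv + b
            d_nat = dphi(fx * y)         # accuracy term phi(f(x) y)
            d_rob = dphi(fx * fa / lam)  # robustness term phi(f(x) f(x') / lam)
            gw += d_nat * y * x + d_rob * (x * fa + fx * x_adv) / lam
            gb += d_nat * y + d_rob * (fa + fx) / lam
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

# Hypothetical separable data whose margin is larger than eps.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [-1, -1, -1, 1, 1, 1]
w, b = train(xs, ys)
print(w, b)
```

In this sketch `lam` only rescales the argument of the robustness term, following the binary form of the bound; in the multi-class form it would instead divide the robustness loss.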

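Experiments overview
5.1: measures how large the gap in the theoretical upper bound is under this algorithm;
5.2: measures the effect of $\lambda$ on MNIST and CIFAR10: the larger $\lambda$ is, the smaller $\mathcal{R}_{nat}$ and the larger $\mathcal{R}_{rob}$, which shows up more clearly on CIFAR10;
5.3: compares different algorithms under different adversarial attacks;
5.4: NIPS 2018 Adversarial Vision Challenge.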
Code
import torch
import torch.nn as nn


class AdvTrain:

    def __init__(self, eta, k, lam,
                 net, lr=0.01, **kwargs):
        """
        :param eta: step size for adversarial attacks
        :param lr: learning rate
        :param k: number of iterations K in inner optimization
        :param lam: lambda
        :param net: network
        :param kwargs: other configs for optim
        """
        kwargs.update({'lr': lr})
        self.net = net
        self.criterion = nn.CrossEntropyLoss()
        self.opti = self.optim(self.net.parameters(), **kwargs)
        self.eta = eta
        self.k = k
        self.lam = lam

    def optim(self, parameters, **kwargs):
        """
        Build the optimizer; override this method to use a different one.
        :param parameters: net.parameters()
        :param kwargs: other configs
        :return: a torch.optim.SGD instance
        """
        return torch.optim.SGD(parameters, **kwargs)

    def normal_perturb(self, x, sigma=1.):
        # Random start: add Gaussian noise to the clean input.
        return x + sigma * torch.randn_like(x)

    @staticmethod
    def calc_jacobian(loss, inp):
        # Gradient of the scalar loss with respect to the input.
        jacobian = torch.autograd.grad(loss, inp)[0]
        return jacobian

    @staticmethod
    def sgn(matrix):
        return torch.sign(matrix)

    def pgd(self, inp, y, perturb):
        # Box constraint: stay within `perturb` of the starting point.
        boundary_low = inp - perturb
        boundary_up = inp + perturb
        inp_new = inp.clone().detach()
        for _ in range(self.k):
            inp_new.requires_grad_(True)
            out = self.net(inp_new)
            loss = self.criterion(out, y)
            # Signed-gradient ascent step, recomputed at the current iterate.
            delta = self.sgn(self.calc_jacobian(loss, inp_new)) * self.eta
            # Project back into the box.
            inp_new = torch.max(
                torch.min(inp_new.detach() + delta, boundary_up),
                boundary_low
            )
        return inp_new

    def ipgd(self, inps, ys, perturb):
        # Run the attack one sample at a time and stack the results.
        N = len(inps)
        adversarial_samples = []
        for i in range(N):
            inp_new = self.pgd(
                inps[[i]], ys[[i]],
                perturb
            )
            adversarial_samples.append(inp_new)
        return torch.cat(adversarial_samples)

    def train(self, trainloader, epoches=50, perturb=1, normal=1):
        for epoch in range(epoches):
            running_loss = 0.
            for i, data in enumerate(trainloader, 1):
                inps, labels = data
                adv_inps = self.ipgd(self.normal_perturb(inps, normal),
                                     labels, perturb)
                out1 = self.net(inps)
                out2 = self.net(adv_inps)
                loss1 = self.criterion(out1, labels)
                # Robustness term: cross-entropy of the adversarial outputs
                # against the labels, weighted by 1 / lam.
                loss2 = self.criterion(out2, labels)
                loss = loss1 + loss2 / self.lam
                self.opti.zero_grad()
                loss.backward()
                self.opti.step()
                running_loss += loss.item()
                if i % 10 == 0:
                    strings = "epoch {0:<3} part {1:<5} loss: {2:<.7f}\n".format(
                        epoch, i, running_loss
                    )
                    print(strings)
                    running_loss = 0.
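This article discusses the theoretical trade-off between robustness and accuracy in machine learning and presents a method that optimizes the model by splitting the robust risk into a natural risk and a boundary risk. Based on a classification-calibrated surrogate loss, the TRADES algorithm is derived, and the experiments demonstrate its effectiveness under different adversarial attacks.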




