阶梯函数正则优化
\begin{equation} \min_{\mathbf{x}\in\mathbb{R}^{n}} ~~ f(\mathbf{x}) + \lambda \parallel(\mathbf{B}\mathbf{x}+\mathbf{b})_+\parallel_0 \tag{SFRO} \end{equation}
其中,函数 $f:\mathbb{R}^{n}\rightarrow \mathbb{R}$ 二次连续可微,矩阵 $\mathbf{B}\in\mathbb{R}^{m\times n}$,向量 $\mathbf{b}\in\mathbb{R}^{m}$,罚参数 $\lambda>0$,零范数 $\|\mathbf{z}\|_0$ 表示 $\mathbf{z}$ 中非零元的个数。 记 $\mathbf{z}_+=$ $(\max\{0,z_1\}$ $\ldots$ $\max\{0,z_m\})^\top$,则 $\|\mathbf{z}_+\|_0$ 计算 $\mathbf{z}$ 中正元素的个数。该函数与阶梯(又称 0/1 损失)函数相关,其定义为:$\mathrm{step}(t)=1$ 当 $t>0$,否则 $\mathrm{step}(t)=0$。因此,$\|\mathbf{z}_+\|_0=$ $\mathrm{step}(z_1)+$ $\cdots$ $+\mathrm{step}(z_m)$。
程序包 - SFROpack-Matlab(点击下载)提供了 1 个求解器,其核心算法来自以下文章:
NM01 - S Zhou, L Pan, N Xiu, and H Qi, Quadratic convergence of smoothing Newton’s method for 0/1 loss optimization, SIOPT, 31:3184–3211, 2021.
求解器 $\texttt{NM01}$ 核心算法属于二阶方法,需要用到目标函数值、梯度和海瑟矩阵。基于 Matlab 语言(基于 Python语言,可进行类似定义)下面用1-比特压缩感知(1BCS)作为示例,展示如何为该求解器定义这些内容。 1BCS 问题的目标函数可以参考 1BCS 页面中的模型(SFRO)。下面的 MATLAB 代码定义了模型中目标函数值、梯度和海瑟矩阵,其中,函数句柄 $\texttt{func1BCS}$ 中的输入 $\texttt{x}$ 和 $\texttt{key}$ 为两个变量,其他输入 $\texttt{eps}$、$\texttt{q}$、$\texttt{A}$ 和 $\texttt{c}$ 为模型(SFRO)中给定的参数和数据。这里,字符串变量 $\texttt{key}$ 用于指定计算内容:当 $\texttt{key}$='$\texttt{f}$' 时计算目标函数值;当 $\texttt{key}$='$\texttt{g}$' 时计算梯度;当 $\texttt{key}$='$\texttt{h}$' 时计算海瑟矩阵。当 $\texttt{key}$='$\texttt{a}$' 时,会额外计算一个用户自定义函数。在此示例中,计算的是 1BCS 问题的准确率。这使得用户能够在优化过程中监控自定义指标。
function out = func1BCS(x,key,eps,q,A,c)
switch key
case 'f'; out = sum((x.^2+eps).^(q/2));
case 'g'; out = q*x.*(x.^2+eps).^(q/2-1);
case 'h'; x2 = x.*x; out = diag(((x2+eps).^(q/2-2)).*((q-1)*x2+eps));
case 'a'; acc = @(var)nnz(sign(A*var)-c); out = 1-acc(x)/length(c);
otherwise; out = []; % 'Otherwise' is REQIURED if no key='a'
end
end
如果不需要额外的函数,用户只需定义目标函数值、梯度和海瑟矩阵,然后省略情况 $\texttt{key}$='$\texttt{a}$',如下所示。注意,当情况 $\texttt{key}$='$\texttt{a}$' 被省略时,用户必须保留情况 $\texttt{otherwise}$。
function out = func1BCS(x,key,eps,q)
switch key
case 'f'; out = sum((x.^2+eps).^(q/2));
case 'g'; out = q*x.*(x.^2+eps).^(q/2-1);
case 'h'; x2 = x.*x; out = diag(((x2+eps).^(q/2-2)).*((q-1)*x2+eps));
otherwise; out = [];
end
end
对于 1BCS 模型(SFRO),定义好如上函数后,就可以调用 $\texttt{NM01}$ 来求解该问题。用户只需指定 ($\texttt{func}$, $\texttt{B}$, $\texttt{b}$, $\texttt{lam}$, $\texttt{pars}$),然后运行求解器即可。
% Solving 1 bit compressive sensing using randomly generated data
clc; close all; clear all; addpath(genpath(pwd));
n = 1000;
m = ceil(0.5*n);
s = ceil(0.01*n); % sparsity level
r = 0.01; % flipping ratio
nf = 0.05; % noisy ratio
[A,c,co,xo] = random1bcs('Ind',m,n,s,nf,r,0.5); % data generation
func = @(x,key)func1BCS(x,key,1e-5,0.5,A,c);
B = (-c).*A;
b = (n*8e-5)*ones(m,1);
lam = 1;
pars.tau = 1;
pars.strict =(n<=2000);
out = NM01(func, B, b, lam, pars);
x = refine(out.sol,s,A,c);
PlotRecovery(xo,x,[950,500,500,250],1)
fprintf(' Computational time: %.3fsec\n',out.time);
fprintf(' Signal-to-noise ratio: %.2f\n',-20*log10(norm(x-xo)));
fprintf(' Hamming distance: %.3f\n',nnz(sign(A*x)-c)/m)
fprintf(' Hamming error: %.3f\n',nnz(sign(A*x)-co)/m)
Matlab 版求解器 $\texttt{NM01}$ 的输入与输出(Python 版的输入与输出类似)说明如下,其中输入参数 ($\texttt{func}$, $\texttt{B}$, $\texttt{b}$, $\texttt{lam}$) 为必需项。$\texttt{pars}$ 中的参数为可选项,但设置某些参数可能会提升求解器的性能和解的质量。
function out = NM01(func,B,b,lam,pars)
% -------------------------------------------------------------------------
% This code aims at solving the support vector machine with form
%
% min f(x) + lam * ||(Bx+b)_+||_0
%
% where f is twice continuously differentiable
% lam > 0, B\in\R^{m x n}, b\in\R^{m x 1}
% (z)_+ = (max{0,z_1},...,max{0,z_m})^T
% ||(z)_+ ||_0 counts the number of positive entries of z
% -------------------------------------------------------------------------
% Inputs:
% func: A function handle defines (objective,gradient,Hessain) (REQUIRED)
% B : A matrix \R^{m x n} (REQUIRED)
% b : A vector \R^{m x 1} (REQUIRED)
% lam : The penalty parameter (REQUIRED)
% pars: Parameters are all OPTIONAL
% pars.x0 -- The initial point (default zeros(n,1))
% pars.tau -- A useful paramter (default 1.00)
% pars.mu0 -- A smoothing parameter (default 0.01)
% pars.maxit -- Maximum number of iterations (default 1000)
% pars.tol -- Tolerance of halting conditions (1e-7*sqrt(n*m))
% pars.strict -- = 0, loosely meets halting conditions (default 0)
% = 1, strictly meets halting conditions
% pars.strict=1 is useful for low dimensions
% -------------------------------------------------------------------------
% Outputs:
% out.sol: The solution
% out.obj: The objective function value
% out.time: CPU time
% out.iter: Number of iterations
% -------------------------------------------------------------------------
% Send your comments and suggestions to <<< slzhou2021@163.com >>>
% WARNING: Accuracy may not be guaranteed!!!!!
% -------------------------------------------------------------------------