阶梯函数约束优化
\begin{equation} \min_{\mathbf{x}\in\mathbb{R}^{K}} ~~ f(\mathbf{x}),~~~~ \mbox{s.t.}~~ \parallel\mathbf{G}(\mathbf{x})\parallel_0^+\leq s,~\mathbf{x}\in \Omega \tag{SFCO} \end{equation}
其中, 矩阵 $\mathbf{G}(\mathbf{x})\in\mathbb{R}^{M \times N}$ 的第 $(i,j)$ 个元素为 $G_{ij}(\mathbf{x})$,函数 $f:\mathbb{R}^{K}\rightarrow \mathbb{R}$ 和 $G_{ij}:\mathbb{R}^{K}\rightarrow \mathbb{R}$ 连续可微,最好二次连续可微,正整数 $s\ll n$,集合 $\Omega\subseteq\mathbb{R}^{K}$ 闭凸。度量 $\|\mathbf{Z}\|_0^+$ 计算矩阵 $\mathbf{Z}\in\mathbb{R}^{M \times N}$ 中含有正元素的列的个数,即 \begin{equation}\|\mathbf{Z}\|_0^+= \mathrm{step}\Big(\max_{i=1,\ldots,M} Z_{i1}\Big)+\cdots+\mathrm{step}\Big(\max_{i=1,\ldots,M} Z_{iN}\Big)\nonumber\end{equation} 这里,$\mathrm{step}$ 是阶梯(又称 0/1 损失)函数,定义为:$\mathrm{step}(t)=1$ 当 $t>0$;否则 $\mathrm{step}(t)=0$。当 $M=1$,矩阵 $\mathbf{Z}$ 退化成向量 $\mathbf{z}\in\mathbb{R}^{N}$,如果令 $\mathbf{z}_+=$ $(\max\{0,z_1\}$ $\ldots$ $\max\{0,z_N\})^\top$ 以及零范数 $\parallel\mathbf{z}\parallel_0$ 计算 $\mathbf{z}$ 中非零元个数,则有 \begin{equation*}\|\mathbf{z}\|_0^+= \mathrm{step}(z_1)+\cdots+\mathrm{step}(z_N)=\|\mathbf{z}_+\|_0\end{equation*} 目前, 集合 $\Omega$ 可以是以下集合之一:
- 球: $\lbrace\mathbf{x}: \parallel\mathbf{x}\parallel\leq r\rbrace$,其中 $r>0$:
- 半空间: $\lbrace\mathbf{x}: \mathbf{a}^T\mathbf{x}\leq b\rbrace$,其中 $\mathbf{a}\in\mathbb{R}^{K}$ 和 $b\in\mathbb{R}$
- 超平面: $\lbrace\mathbf{x}: \mathbf{A} \mathbf{x}= \mathbf{b}\rbrace$,其中 $\mathbf{A}\in\mathbb{R}^{S\times K}$ 和 $ \mathbf{b}\in\mathbb{R}^{S}$
- 盒子: $\lbrace\mathbf{x}: l\leq x_i \leq u, i=1,\ldots,K\rbrace$,其中 $l \leq u$ 分别可以取值 $-\infty$ 和 $+\infty$。因此,当 $l=-\infty$ 和 $u=+\infty$, $\Omega$ 为无约束;当 $l=0$ 和 $u=+\infty$, $\Omega$ 为非负约束
程序包 - SFCOpack-Matlab 或 SFCOpack-Python(点击下载)提供了 1 个求解器,其核心算法来自以下文章:
SNSCO - S Zhou, L Pan, N Xiu, and G Li, A 0/1 constrained optimization solving sample average approximation for chance constrained programming, MOR, 2024.
求解器 $\texttt{SNSCO}$ 核心算法属于二阶方法,需要用到目标函数 $f(\mathbf{x})$ 和 约束函数 $\mathbf{G}(\mathbf{x})$ 的函数值、梯度和海瑟矩阵。基于 Matlab 语言(基于 Python语言,可进行类似定义)下面用一个恢复问题作为示例,展示如何为该求解器定义这些内容。 恢复问题的目标函数为 $f(\mathbf{x})=0.5\parallel\mathbf{B}\mathbf{x}-\mathbf{d}\parallel^2$ 和约束函数为 $\mathbf{G}_{ij}(\mathbf{x})= \langle\mathbf{A}_{:(j-1)M+i}, \mathbf{x}\rangle-C_{ij}$。下面两段 MATLAB 代码分别定义了$f$ 和 $\mathbf{G}$ 的函数值、梯度和海瑟矩阵。例如,函数句柄 $\texttt{FuncfRecovery}$ 的输入中,$\texttt{x}$ 为变量,($\texttt{B}$,$\texttt{d}$,$\texttt{BtB}$)为函数 $f(\mathbf{x})$ 中涉及的数据。在调用函数 $\texttt{FuncfRecovery}$ 时,这些数据需要用户自己定义。
function [objef, gradf, hessf] = FuncfRecovery(x,B,d,BtB)
% This code provides information for an objective function
% f(x) = 0.5*||Bx-d||^2
% x is the variable
% (B,d,BtB) are data and need to be input
% B\in R^{m by n}, d\in R^{m by 1}, and BtB = B'*B
Bxd = B*x-d;
objef = norm(Bxd)^2/2; % objective
gradf = (Bxd'*B)'; % gradient
hessf = BtB; % Hessian
clear B d BtB
end
function [G,gradG,gradGW, hessGW] = FuncGRecovery(x,W,Ind,A,C,K,M,N)
% This code provides information for function G(x): R^K -> R^{M x N}
% (x,W,Ind) are variables
% (A,C,K,M,N) are data and parameters, and need to be input
% For each i=1,...,M and j=1,...,N:
% G_{ij}(x) = <A_{:,(j-1)*M+i}, x>^2 - C_{ij}
% where A_{:,(j-1)*M+i} is the ((j-1)*M+i)th column of A
% A \in R^{K by M*N} and C \in R^{M by N}
G0 = x'*A;
G = reshape(G0,M,N);
G = G.^2-C; % function
if nargout > 1
if isempty(Ind)
gradG = [];
gradGW = zeros(K,1); % gradient
hessGW = zeros(K,1); % Hessian
else
AInd = A(:,Ind);
gradG = AInd.*G0(Ind);
WInd = W(Ind);
gradGW = gradG*reshape(WInd,length(Ind),1); % gradient
hessGW = AInd*diag(WInd)*AInd.'; % Hessian
end
end
clear A C K M N
end
定义完函数 $f$ 和 $\mathbf{G}$ 后,用户需要选择约束集 $\Omega$。目前,$\Omega$ 允许取 {'$\texttt{Ball}$', '$\texttt{Box}$', '$\texttt{Halfspace}$', '$\texttt{Hyperplane}$'} 中的一个。而每一个集合会涉及相应的参数。例如,盒子约束需要设置上下界,当下界取 $-\infty$ 和 上界取 $+\infty$ 时, 盒子约束变为无约束;当下界取 $0$ 和 上界取 $+\infty$ 时,盒子约束变为非负约束。对于其他约束,相关参数详见上面的模型介绍。选择完约束集 $\Omega$ 后,就可以调用 $\texttt{SNSCO}$ 来求解问题。下面的 Matlab 代码展示了求解恢复问题的过程。
% demon recovery problems
clc; close all; clear all; addpath(genpath(pwd));
K = 10;
M = 10;
N = 100;
alpha = 0.05;
s = ceil(alpha*N);
test = 2; % Omega = {x|norm(x) <= r} if test = 1
% Omega = [lb,ub]^n if test = 2
% Omega = {x|a'*x <= b} if test = 3
% Omega = {x|Ax = b} if test = 4
sets = {'Ball','Box','Halfspace','Hyperplane'};
switch sets{test}
case 'Ball'
input1 = 2;
input2 = [];
xopt = randn(K,1);
xopt = input1/norm(xopt)*xopt;
case 'Box'
input1 = -2; % can be -inf
input2 = 2; % can be inf
xopt = max(-10,input1)+(min(10,input2)-max(-10,input1))*rand(K,1);
case 'Halfspace'
xopt = rand(K,1);
input1 = randn(K,1);
input2 = sum(input1.*xopt)+rand(1);
case 'Hyperplane'
xopt = randn(K,1);
input1 = randn(ceil(0.5*K),K);
input2 = input1*xopt;
end
% Generate data and define f and G
B = randn(ceil(0.25*K),K)/sqrt(K);
d = B*xopt;
BtB = B'*B;
xi = randn(K,M,N);
T = randperm(N,s);
Mat = rand(M,N);
D = (Mat>=0.5) .* rand(M,N);
D(:,T) = (Mat(:,T)<1/3).*rand(M,nnz(T))-(Mat(:,T)>=2/3).*rand(M,nnz(T));
A = reshape(xi,K,M*N);
C = (squeeze(sum(xi .* xopt, 1))).^2 + D;
Funcf = @(x)FuncfRecovery(x,B,d,BtB); % f(x) = 0.5||Bx-d||^2
FuncG = @(x,W,J)FuncGRecovery(x,W,J,A,C,K,M,N); % G(x)_ij = <A_ij,x>^2-C_ij
% set parameters and call the solver
if alpha > 0.01
pars.tau0 = 0.05+0.05*(test>4);
else
pars.tau0 = 0.01;
pars.thd = 1e-1*(test==4)+1e-2*(test~=4);
end
out = SNSCO(K,M,N,s,Funcf,FuncG,sets{test},input1,input2,pars);
fprintf(' Relative error: %7.3e \n', norm(out.x-xopt)/norm(xopt));
Matlab 版求解器 $\texttt{SNSCO}$ 的输入与输出(Python 版的输入与输出类似)说明如下,其中输入 ($\texttt{K}$, $\texttt{M}$, $\texttt{N}$, $\texttt{s}$, $\texttt{Funcf}$, $\texttt{FuncG}$, $\texttt{FeaSet}$, $\texttt{input1}$, $\texttt{input2}$) 为必需项。$\texttt{pars}$ 中的参数为可选项,但设置某些参数可能会提升求解器的性能和解的质量。 需要注意的是,$\texttt{FeaSet}$ 只能取 {'$\texttt{Ball}$', '$\texttt{Box}$', '$\texttt{Halfspace}$', '$\texttt{Hyperplane}$'} 中的一个。对于每一个集合,求解器设置了两个输入 $\texttt{input1}$ 和 $\texttt{input2}$。当不需要某个输入时,可以设置为空 $\texttt{[ ]}$。例如,当 $\texttt{FeaSet}$='$\texttt{Ball}$' 时, 可设置 $\texttt{input1}$=2 和 $\texttt{input2}$=$\texttt{[ ]}$,表示半径为 2 的球约束。当 $\texttt{FeaSet}$='$\texttt{Box}$' 时, 可设置 $\texttt{input1}$=0 和 $\texttt{input2}$ = $\texttt{Inf}$,表示非负约束。
function out = SNSCO(K,M,N,s,Funcf,FuncG,FeaSet,input1,input2,pars)
% This solver solves 0/1 constrained optimization in the following form:
%
% min_{x\in\R^K} f(x), s.t. \| G(x) \|^+_0<=s, x\in Omega
%
% where
% f(x) : \R^K --> \R
% G(x) : \R^K --> \R^{M-by-N}
% s << N
% \|Z\|^+_0 counts the number of columns with positive maximal values
% Omega is a closed and convex set
% -------------------------------------------------------------------------
% Inputs:
% K : Dimnesion of variable x (REQUIRED)
% M : Row number of G(x) (REQUIRED)
% N : Column number of G(x) (REQUIRED)
% s : An integer in [1,N), typical choice ceil(0.01*N) (REQUIRED)
% Funcf : Function handle of f(x) (REQUIRED)
% FuncG : Function handle of G(x) (REQUIRED)
% FeaSet: Feasible set for x, must be one of: (REQUIRED)
% 'Box' [lb,ub]^K
% 'Ball' {x|norm(x) <= r}
% 'Halfspace' {x|a'*x <= b}
% 'Hyperplane' {x|Ax = b}
% Default: R^K
% input1: A parameter related to FeasSet (REQUIRED)
% input2: A parameter related to FeasSet (REQUIRED)
% pars : All parameters are OPTIONAL
% pars.x0 -- Initial point (default:ones(K,1))
% pars.tau0 -- A vector of a number of \tau0 (default 1)
% e.g.,pars.tau0=logspace(log10(.5),log10(1.75),20)
% pars.tol -- Tolerance of halting condition (default 1e-6*M*N)
% pars.maxit -- Maximum number of iterations (default 2000)
% pars.disp -- Show results or not at each step (default 1)
% -------------------------------------------------------------------------
% Outputs:
% out.x: Solution x
% out.obj: Objective function value f(x)
% out.G: Function value of G(x)
% out.time: CPU time
% out.iter: Number of iterations
% out.error: Error
% out.Error: Error of every iteration
% -------------------------------------------------------------------------
% Send your comments and suggestions to <<< slzhou2021@163.com >>>
% WARNING: Accuracy may not be guaranteed!!!!!
% -------------------------------------------------------------------------
