In this part we will cover constrained optimization problems and penalty methods to solve them.
Further, we will also see some examples of algorithms that use penalty methods to solve constrained optimization problems.
Penalty Functions
Definition 1 ((Naïve) Penalty Function)
Consider the optimization problem,
(PE){minsubject tof(x)x∈S
where S⊂Rn is non-empty, closed, and f:Rn↦R is differentiable.
We introduce a penalty function to the problem to convert it to an unconstrained optimization problem,
minf(x)+χS(x),
where,
χS(x)={0,+∞,x∈Sx∈/S
is the indicator function of the set S.
Note
Since feasibility is top priority, only when achieving feasibility should we focus on minimizing the objective function.
However, this is computationally bad, since the indicator function is non-differentiable, discontinuous, and not finite (although it is convex if S is convex).
Thus, a more logical approach is to use somekind of numerical “warning” before becoming infeasible or near-infeasible, by approximating the indicator function with a numerically better behaving function.
Further, since our original unconstrained problem and the penalty function are convex (and non-decreasing), it means that our constrained problem is also convex.
Theorem 1 (Optimality of Limit Points of SUMT)
Assume global optimal solution exists in original problem.
x∈Sminf(x)
For any ν>0, let xν⋆ be an optimal solution to,
x∈Rnminf(x)+νχS(x)
Then, xˉ, the limit point of {xνk⋆}, as ν→∞, is an optimal solution to the original problem.
Proof
To prove this theorem, we will need the following lemma.
Lemma 1
With the same assumptions as in the above theorem, for all ν1,ν2>0, such that ν1≤ν2, we have that,
f(xν1⋆)≤f(xν2⋆)
For all ν>0, (PνE) is a relaxation of (PE). Thus, from the relaxation theorem,
f(xνk⋆)+νkχS(xνk⋆)≤f(x⋆)
Therefore,
0≤νkχS(xνk⋆)≤f(x⋆)−f(xνk⋆)
By the lemma and the lower bound,
0≤f(x⋆)−f(xν1⋆)
Thus, if limk→∞χS(xνk⋆)=0, since the term is squeezed between two finite values.
By continuity of χS, every limit point of {xνk⋆} is in S.
Further, let xˉ be a limit point of {xνk⋆}, i.e., there exists a subsequence such that,
This shows that f(xˉ)=f(x⋆) and since xˉ∈S, we must have in fact that f(xˉ)=f(x⋆).
Thus, xˉ is global optimal in the original problem ■
Theorem 2 (Convergence to KKT Points of SUMT)
Let f,gi(i=1,…,m),hj(j=1,…,l) be C1 functions.
Further, assume that the penalty function ψ is C1 and that ψ′(s)≥0∀s≥0.
Consider a sequence νk→+∞,
xk stationary in unconstrained problem with νkxk→x^ as k→+∞LICQ holds at x^x^ feasible in original problem⎭⎬⎫⟹x^ stationary (KKT) in (1)Note
From the proof, we obtain estimates of the Lagrange multipliers, the optimality conditions of the constrained problem gives that,
μi⋆λj⋆≈νkψ′(max{0,gi(xk)})≈νkψ′(hj(xk))Note (Computational Considerations for SUMT)
ν large ⟹f(x)+νχS(x) is ill-conditioned and difficult to minimize.
If we increase ν slowly, a good guess is that xνk⋆≈xνk−1⋆.
However, this guess can be improved.
Interior Penalty Methods
Definition 3 (Interior Penalty Functions)
Consider inequality constrained optimization,
x∈Sminf(x),S={x∈Rn∣gi(x)≤0,i=1,…,m}
Assume strictly feasible points exists xˉ∈Rn such that gi(xˉ)<0,i=1,…,m.
We say that interior penalty (barrier) methods approximates S from the inside.
Further, if a globally optimal solution to this problem is on the boundary of the feasible region, the method generates a sequence of interior points that converge to it.
Furthermore, we can approximate χS(x) from above by,
where ϕ:R−↦R+ is continuous and limsk<0,sk→0ϕ(sk)=+∞.
Typical choices are ϕ(s)=−s−1 or ϕ(s)=−log([min{1,−s}]) and the differentiable logarithmic barrier function ϕ(s)=−log(−s).
Again, since gi are convex, phi is convex and non-increasing, it means that χS is also convex.
Theorem 3 (Optimality of Limit Points of Interior Penalty Methods)
Consider the penalty problem,
minf(x)+νχS(x)
global optimal solutions to this problem is also global optimal solutions to the original problem.
Further, the convergence of stationary point also holds,
xk stationary in unconstrained problem with νkxk→x^ as k→+∞LICQ holds at x^⎭⎬⎫⟹x^ stationary (KKT) in (1)
Polynomial Methods for LPs
Definition 4 (Polynomial Interior Point Methods for Linear Programming)
Consider the LP,
minsubject to−bTyATy+s=cs≥0
and the corresponding KKT conditions,
ATy+s=cAx=bxTs=0x,s,≥0
We can apply the barrier method to the LP, taking care of the s≥0 constraint. Thus, the subproblem becomes,