Overview

Part 10 - Linear Programming: The Simplex Method

TMA947
Date: September 29, 2025
Last modified: September 29, 2025

Introduction

In this part we will cover the Simplex method, which is a very efficient method for solving Linear Programs (LPs) in practice.

Firstly, let’s recap the setting we are working with.

Recall: Linear Program in Standard Form

An LP in standard form is, $$ \begin{align*} \min \ & \mathbf{c}^T \mathbf{x} \newline \text{subject to} \ & A \mathbf{x} = \mathbf{b} \newline \ & \mathbf{x} \geq \mathbf{0} \end{align*} $$ where $A \in \mathbb{R}^{m \times n}$, $\mathbf{b} \in \mathbb{R}^m$, $\mathbf{c} \in \mathbb{R}^n$ and $\mathbf{x} \in \mathbb{R}^n$.

Note

We will also assume that $\mathrm{rank}(A) = m, m \leq n$ and $\mathbf{b} \geq \mathbf{0}$.

Recall: Basic Feasible Solution

Recall the standard form polyhedron $P = \{\mathbf{x} \in \mathbb{R}^n \mid A \mathbf{x} = \mathbf{b}, \ \mathbf{x} \geq \mathbf{0}\}$. A point $\bar{\mathbf{x}}$ is called a basic feasible solution (BFS), if,

  1. $\bar{\mathbf{x}}$ is a basic solution, i.e., $A \bar{\mathbf{x}} = \mathbf{b}$ and the columns of $A$ corresponding to the non-zero components of $\bar{\mathbf{x}}$ are linearly independent,
  2. $\bar{\mathbf{x}} \geq \mathbf{0}$.

Note

A BFS is a point in the feasible region (polyhedron) $P$.

The Simplex Method

We will now introduce the Simplex method, which consists of four steps.

Algorithm: The Simplex Method
  1. Determine if the current BFS is optimal. If it is, stop.
  2. If it is not optimal, find a search direction $\mathbf{d}$ to an adjacent BFS that decreases the objective function (adjacent and descent).
  3. Determine the step length $\theta$ to the adjacent BFS in the direction $\mathbf{d}$.
  4. Update to the new BFS $\mathbf{x} + \theta \mathbf{d}$ and go to step 1.

We will now go through each step in detail. However, we will go through them in the order (2), (1), (3), (4); you will see why shortly.

Step 2: Finding an Adjacent Descent Direction

From step (1), we know that we are currently at a BFS $\bar{\mathbf{x}} = \begin{bmatrix} \bar{\mathbf{x}}_B \newline \bar{\mathbf{x}}_N \end{bmatrix}$ with the corresponding partitioning $A = \begin{bmatrix} B & N \end{bmatrix}$. Thus, we can analogously partition our sought search direction $\mathbf{d} = \begin{bmatrix} \mathbf{d}_B \newline \mathbf{d}_N \end{bmatrix}$.

As we discussed last time, two BFSs are adjacent if all but one column in their corresponding basis matrices are the same. Thus, we can choose $\mathbf{d}_N$ such that,

$$ \mathbf{d}_N = \mathbf{e}_j = \begin{bmatrix} 0 \newline \vdots \newline 1 \newline \vdots \newline 0 \end{bmatrix} $$ where the $1$ is in the $j$-th position, $j \in \{1, \ldots, n - m\}$, i.e., we are only changing one non-basic variable.

However, $\mathbf{d}_B$ is unknown; we can find it using the equality constraints (dividing by $\theta > 0$ in the third step), $$ \begin{align*} A(\bar{\mathbf{x}} + \theta \mathbf{d}) & = \mathbf{b} \newline \underbrace{A \bar{\mathbf{x}}}_{= \mathbf{b}} + \theta A \mathbf{d} & = \mathbf{b} \newline \theta A \mathbf{d} & = \mathbf{0} \newline A \mathbf{d} & = \mathbf{0} \newline \begin{bmatrix} B & N \end{bmatrix} \begin{bmatrix} \mathbf{d}_B \newline \mathbf{d}_N \end{bmatrix} & = \mathbf{0} \newline B \mathbf{d}_B + N \mathbf{d}_N & = \mathbf{0} \newline B \mathbf{d}_B + N \mathbf{e}_j & = \mathbf{0} \newline B \mathbf{d}_B & = -N \mathbf{e}_j \newline \mathbf{d}_B & = -B^{-1} N \mathbf{e}_j \end{align*} $$

Definition: Search Direction

We consider search directions of the form, $$ \mathbf{d} = \begin{bmatrix} -B^{-1} N \mathbf{e}_j \newline \mathbf{e}_j \end{bmatrix}, \quad j \in \{1, \ldots, n - m\} $$
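
To make the construction concrete, here is a minimal NumPy sketch that builds $\mathbf{d}$ for a given partitioning and verifies that $A \mathbf{d} = \mathbf{0}$. The matrices and the index `j` are illustrative assumptions, not fixed by the text.

```python
import numpy as np

# Illustrative basis/non-basis split A = [B  N]; these numbers are assumptions.
B = np.array([[1.0, 0.0],
              [0.0, 1.0]])
N = np.array([[2.0, 1.0],
              [-1.0, 1.0]])
j = 0                                  # 0-indexed non-basic variable to increase

e_j = np.zeros(N.shape[1])
e_j[j] = 1.0
d_B = -np.linalg.solve(B, N @ e_j)     # d_B = -B^{-1} N e_j (solve, don't invert)
d = np.concatenate([d_B, e_j])         # direction ordered as (basic, non-basic)

A = np.hstack([B, N])
print(A @ d)                           # ~0: moving along d keeps A x = b satisfied
```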

Further, we want to ensure that the search direction is a descent direction, i.e., $\nabla_{\mathbf{x}} f(\bar{\mathbf{x}})^T \mathbf{d} < 0$, which for $f(\mathbf{x}) = \mathbf{c}^T \mathbf{x}$ means $\mathbf{c}^T \mathbf{d} < 0$.

Now, consider again analogously partitioning $\mathbf{c} = \begin{bmatrix} \mathbf{c}_B \newline \mathbf{c}_N \end{bmatrix}$.

Then, we have, $$ \begin{align*} \mathbf{c}^T \mathbf{d} & = \mathbf{c}_B^T (-B^{-1} N \mathbf{e}_j) + \mathbf{c}_N^T \mathbf{e}_j \newline & = (-\mathbf{c}_B^T B^{-1} N + \mathbf{c}_N^T) \mathbf{e}_j \newline \end{align*} $$ Thus, $\mathbf{c}^T \mathbf{d}$ is exactly the $j$-th component of the row vector $\mathbf{c}_N^T - \mathbf{c}_B^T B^{-1} N$, which motivates the following definition.

Definition: Reduced Cost

Let $\bar{\mathbf{x}}$ be a BFS corresponding to the partitioning $A = \begin{bmatrix} B & N \end{bmatrix}$. The reduced costs for this partitioning are defined as, $$ \tilde{\mathbf{c}_N}^T = \mathbf{c}_N^T - \mathbf{c}_B^T B^{-1} N $$ The $j$-th component of the reduced costs, $\tilde{c}_{N_j}$, is called the reduced cost of the non-basic variable $x_{N_j}$.

It is possible that there are multiple non-basic variables with negative reduced costs; in that case, which one should we choose?

In general, it suffices to choose the one with the most negative reduced cost (this pricing rule is known as Dantzig's rule), i.e., $$ j \in \underset{k : (\tilde{\mathbf{c}_N})_k < 0}{\arg \min} (\tilde{\mathbf{c}_N})_k. $$
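
As a small sketch of this pricing step (NumPy assumed; the function name `entering_index` is my own), the reduced costs can be computed with a single linear solve, and the most negative entry is then picked as the entering variable:

```python
import numpy as np

def entering_index(B, N, c_B, c_N):
    """Compute the reduced costs c_N^T - c_B^T B^{-1} N and return them together
    with the entering index j (most negative reduced cost), or None if all are >= 0."""
    y = np.linalg.solve(B.T, c_B)   # y = B^{-T} c_B, so y^T = c_B^T B^{-1}
    reduced = c_N - N.T @ y         # reduced costs of the non-basic variables
    j = int(np.argmin(reduced))
    return reduced, (j if reduced[j] < 0 else None)
```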

Let’s now define the optimality condition.

Theorem: Optimality Check

Let $\bar{\mathbf{x}}$ be a BFS corresponding to the partitioning $A = \begin{bmatrix} B & N \end{bmatrix}$, and let $\tilde{\mathbf{c}_N}^T = \mathbf{c}_N^T - \mathbf{c}_B^T B^{-1} N$ be the reduced costs for this partitioning. Then, $$ \tilde{\mathbf{c}_N}^T \geq \mathbf{0} \implies \bar{\mathbf{x}} \text{ is optimal} $$

Note

The above statement is a sufficient condition for optimality, which is nice.

Further, note that, if we do not have any degeneracy, then the above statement is a necessary and sufficient condition for optimality, i.e., $$ \tilde{\mathbf{c}_N}^T \geq \mathbf{0} \iff \bar{\mathbf{x}} \text{ is optimal (no degeneracy)} $$

Step 3: Determining the Step Length

We now have a search direction $\mathbf{d} = \begin{bmatrix} -B^{-1} N \mathbf{e}_j \newline \mathbf{e}_j \end{bmatrix}$, and we want to determine the step length $\theta$ such that $\bar{\mathbf{x}} + \theta \mathbf{d}$ is an adjacent BFS.

Let’s write out the update,

$$ \begin{align*} \bar{\mathbf{x}} + \theta \mathbf{d} & = \begin{bmatrix} \bar{\mathbf{x}}_B \newline \bar{\mathbf{x}}_N \end{bmatrix} + \theta \begin{bmatrix} -B^{-1} N \mathbf{e}_j \newline \mathbf{e}_j \end{bmatrix} \newline & = \begin{bmatrix} B^{-1} \mathbf{b} \newline \mathbf{0} \end{bmatrix} + \theta \begin{bmatrix} -B^{-1} N \mathbf{e}_j \newline \mathbf{e}_j \end{bmatrix} \newline & = \begin{bmatrix} B^{-1} \mathbf{b} - \theta B^{-1} N \mathbf{e}_j \newline \mathbf{0} + \theta \mathbf{e}_j \end{bmatrix} \newline \end{align*} $$

Note

Recall that we want $\mathbf{x} \geq \mathbf{0}$, thus we need to ensure that both $\bar{\mathbf{x}}_B - \theta B^{-1} N \mathbf{e}_j \geq \mathbf{0}$ and $\bar{\mathbf{x}}_N + \theta \mathbf{e}_j \geq \mathbf{0}$.

We know by definition of a BFS that $\bar{\mathbf{x}}_B \geq \mathbf{0}$ and $\bar{\mathbf{x}}_N = \mathbf{0}$, thus the second condition is satisfied for any $\theta \geq 0$.

The first condition, however, can cause problems: if $B^{-1} N \mathbf{e}_j \leq \mathbf{0}$ (i.e., $\mathbf{d}_B = -B^{-1} N \mathbf{e}_j \geq \mathbf{0}$), then we can increase $\theta$ indefinitely and still satisfy the condition, i.e., the LP is unbounded!

If $B^{-1} N \mathbf{e}_j \not\leq \mathbf{0}$, then there is at least one component $i$ such that $(B^{-1} N \mathbf{e}_j)_i > 0$, and thus we can only increase $\theta$ until the $i$-th component of $\bar{\mathbf{x}}_B - \theta B^{-1} N \mathbf{e}_j$ reaches zero.

Definition: Minimum Ratio Test

$$ \theta^{\star} = \underset{k : (B^{-1} N \mathbf{e}_j)_k > 0}{\min} \frac{(B^{-1} \mathbf{b})_k}{(B^{-1} N \mathbf{e}_j)_k} $$

Note

The index $k$ that attains the minimum is important: it tells us which basic variable will become non-basic in the next iteration, i.e., which column of $B$ will be replaced by the $j$-th column of $N$.
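
A minimal sketch of the minimum ratio test (NumPy assumed; `ratio_test` is a name introduced here, and the tolerance is an arbitrary choice). It returns both the step length $\theta^{\star}$ and the leaving index $k$, and signals unboundedness when $B^{-1} N \mathbf{e}_j \leq \mathbf{0}$:

```python
import numpy as np

def ratio_test(x_B, u):
    """x_B = B^{-1} b (current basic values), u = B^{-1} N e_j.
    Returns (theta, k), or (None, None) if u <= 0, i.e. the LP is unbounded."""
    pos = u > 1e-12                      # only components with u_k > 0 bound theta
    if not np.any(pos):
        return None, None
    ratios = np.full(u.shape, np.inf)
    ratios[pos] = x_B[pos] / u[pos]
    k = int(np.argmin(ratios))
    return float(ratios[k]), k
```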

Step 4: Update to the New BFS

We now have everything we need to perform the update to the new BFS.

Definition: Simplex Update

Assume that we have a search direction $\mathbf{d}$ from (2) and a step length $\theta^{\star}$ from (3), then the new BFS is, $$ \bar{\mathbf{x}}^{\text{new}} = \bar{\mathbf{x}} + \theta^{\star} \mathbf{d} $$ Equivalently, the new basic variable values are given by, $$ \bar{\mathbf{x}}_B^{\text{new}} = \tilde{B}^{-1} \mathbf{b}, $$ where $\tilde{B}$ is the new basis matrix obtained by replacing the $k$-th column of $B$ with the $j$-th column of $N$, and $k$ is the index that attains the minimum in the minimum ratio test.
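
As a small sketch of this update step under the same assumptions (NumPy; 0-indexed `k` and `j`, function name my own), the column exchange and the new basic values $\tilde{B}^{-1} \mathbf{b}$ look like this:

```python
import numpy as np

def update_basis(B, N, b, k, j):
    """Exchange the k-th column of B with the j-th column of N and
    return the new partitioning together with the new basic values."""
    B_new, N_new = B.copy(), N.copy()
    B_new[:, k] = N[:, j]
    N_new[:, j] = B[:, k]
    x_B_new = np.linalg.solve(B_new, b)   # = \tilde{B}^{-1} b
    return B_new, N_new, x_B_new
```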

Let’s now formally write out the Simplex method.

Algorithm: The Simplex Method
  1. Let $\bar{\mathbf{x}}$ be an initial BFS, corresponding to the partitioning $A = \begin{bmatrix} B & N \end{bmatrix}$.
  2. Compute the reduced costs $\tilde{\mathbf{c}_N}^T = \mathbf{c}_N^T - \mathbf{c}_B^T B^{-1} N$.
  • If $\tilde{\mathbf{c}_N}^T \geq \mathbf{0}$, stop; $\bar{\mathbf{x}}$ is optimal.
  • Else, choose $j \in \underset{k : (\tilde{\mathbf{c}_N})_k < 0}{\arg \min} (\tilde{\mathbf{c}_N})_k$.
  3. Compute the search direction $\mathbf{d} = \begin{bmatrix} -B^{-1} N \mathbf{e}_j \newline \mathbf{e}_j \end{bmatrix}$.
  • If $B^{-1} N \mathbf{e}_j \leq \mathbf{0}$, stop; the LP is unbounded.
  4. Compute the step length $\theta^{\star} = \underset{k : (B^{-1} N \mathbf{e}_j)_k > 0}{\min} \frac{(B^{-1} \mathbf{b})_k}{(B^{-1} N \mathbf{e}_j)_k}$.
  5. Update to the new BFS $\bar{\mathbf{x}}^{\text{new}} = \bar{\mathbf{x}} + \theta^{\star} \mathbf{d}$ and update the basis matrix $B$ by replacing the $k$-th column of $B$ with the $j$-th column of $N$, where $k$ is the index that attains the minimum in the minimum ratio test.
  6. Repeat from step 2.
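
Putting the four steps together, here is a compact, self-contained sketch of the method (NumPy arrays assumed for `A`, `b`, `c`). It expects a standard-form LP and a starting basis `basis` (a list of $m$ column indices of $A$ giving a BFS); the tolerances are ad hoc and anti-cycling rules are omitted.

```python
import numpy as np

def simplex(A, b, c, basis, max_iter=1000):
    """Minimize c^T x subject to A x = b, x >= 0, starting from the BFS
    defined by `basis`. Returns (x, objective value)."""
    m, n = A.shape
    basis = list(basis)
    for _ in range(max_iter):
        nonbasis = [i for i in range(n) if i not in basis]
        B, N = A[:, basis], A[:, nonbasis]
        x_B = np.linalg.solve(B, b)                 # current basic values B^{-1} b

        # Step 1: pricing / optimality check
        y = np.linalg.solve(B.T, c[basis])          # y^T = c_B^T B^{-1}
        reduced = c[nonbasis] - N.T @ y
        if np.all(reduced >= -1e-10):
            x = np.zeros(n)
            x[basis] = x_B
            return x, float(c @ x)

        # Step 2: entering variable (most negative reduced cost)
        j = int(np.argmin(reduced))
        u = np.linalg.solve(B, A[:, nonbasis[j]])   # B^{-1} N e_j

        # Step 3: minimum ratio test (detects unboundedness)
        pos = u > 1e-10
        if not np.any(pos):
            raise ValueError("LP is unbounded")
        ratios = np.full(m, np.inf)
        ratios[pos] = x_B[pos] / u[pos]
        k = int(np.argmin(ratios))

        # Step 4: column k leaves the basis, column nonbasis[j] enters
        basis[k] = nonbasis[j]
    raise RuntimeError("iteration limit reached")
```

Running this on the example at the end of this part with the slack variables as the initial basis (0-indexed `basis=[2, 3]`) should reproduce $\mathbf{x}^{\star} = \left(\frac{1}{2}, 1, 0, 0\right)$. Degenerate steps ($\theta^{\star} = 0$) are accepted as written, so a robust implementation would add an anti-cycling rule such as Bland's rule.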

We will now state a theorem regarding the convergence of the Simplex method and prove it.

Theorem: Finite Termination of the Simplex Method

If $P$ is nonempty and there is no degeneracy, then the Simplex method terminates in a finite number of iterations with an optimal solution or with the conclusion that the problem is unbounded.

Proof

If a BFS is non-degenerate, then it has exactly $m$ positive components and hence a unique associated basis matrix. If the step along the search direction is unbounded (i.e., $B^{-1} N \mathbf{e}_j \leq \mathbf{0}$), then the LP is unbounded and the method terminates. Otherwise, the current BFS is either optimal, or by non-degeneracy $\theta^{\star} > 0$.

Therefore, in each non-terminating iteration the objective function strictly decreases, since $\mathbf{c}^T \mathbf{d} = \tilde{c}_{N_j} < 0$ and $\theta^{\star} > 0$. Hence, no BFS can be revisited, and since there are only finitely many BFSs, the method must terminate in a finite number of iterations.

Note

Remember that there are at most $\binom{n}{m}$ possible basis matrices, and thus at most $\binom{n}{m}$ iterations, i.e., the worst-case number of iterations grows exponentially with the problem size.

Initial BFS and the Phase-I Problem

We have now covered the Simplex method; however, we have not yet discussed how to find an initial BFS.

If we are lucky and have an LP of the form, $$ \begin{align*} \min \ & \mathbf{c}_x^T \mathbf{x} + \mathbf{c}_y^T \mathbf{y} \newline \text{subject to} \ & A \mathbf{x} + I \mathbf{y} = \mathbf{b} \newline \ & \mathbf{x}, \mathbf{y} \geq \mathbf{0} \end{align*} $$

then we can take $(\mathbf{x}, \mathbf{y}) = (\mathbf{0}, \mathbf{b})$ as our initial BFS, with basis matrix $B = I$ (this is feasible since we assumed $\mathbf{b} \geq \mathbf{0}$).

However, in general, we will not be this lucky, but we can construct an auxiliary LP, called the Phase-I problem, to find an initial BFS.

Definition: Phase-I Problem

Given an LP in standard form, we can construct the Phase-I problem, $$ \begin{align*} \min \ & \mathbf{1}^T \mathbf{y} \newline \text{subject to} \ & A \mathbf{x} + I \mathbf{y} = \mathbf{b} \newline \ & \mathbf{x}, \mathbf{y} \geq \mathbf{0} \end{align*} $$ where $\mathbf{1} = \begin{bmatrix} 1 & 1 & \ldots & 1 \end{bmatrix}^T \in \mathbb{R}^m$ and $\mathbf{y} \in \mathbb{R}^m$ are artificial variables.

Note

The Phase-I problem always has an initial BFS, namely $(\mathbf{x}, \mathbf{y}) = (\mathbf{0}, \mathbf{b})$, with basis matrix $B = I$.

Further, the Phase-I problem is always feasible, and its objective value is bounded below by zero, so it always has an optimal solution. Two cases can occur:

  1. If the optimal value of the Phase-I problem is zero, then the original LP is feasible, and the $\mathbf{x}$-part of the Phase-I solution is a BFS for the original LP (any artificial variables remaining in the basis at value zero must first be driven out of the basis).

  2. If the optimal value of the Phase-I problem is positive, then the original LP is infeasible.
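
A sketch of how the Phase-I construction can be used in code, assuming the `simplex` sketch from earlier is in scope (the function name `phase_one` and the tolerance are my own choices):

```python
import numpy as np

def phase_one(A, b, tol=1e-8):
    """Solve the Phase-I problem min 1^T y s.t. A x + I y = b, x, y >= 0,
    using the `simplex` sketch above. Returns a feasible x for the original LP."""
    m, n = A.shape
    A1 = np.hstack([A, np.eye(m)])                   # [A  I]
    c1 = np.concatenate([np.zeros(n), np.ones(m)])   # minimise the artificial variables
    basis = list(range(n, n + m))                    # start with y = b basic (b >= 0)
    x1, val = simplex(A1, b, c1, basis)
    if val > tol:
        raise ValueError("original LP is infeasible")
    # Note: artificial variables that remain basic at value zero would still have
    # to be driven out of the basis before Phase-II; that detail is omitted here.
    return x1[:n]
```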

Example: An example using the Simplex Method

Consider the LP, $$ \begin{align*} \min \ & -x_1 - x_2 \newline \text{subject to} \ & 2x_1 + x_2 \leq 2 \newline \ & -x_1 + x_2 \leq \frac{1}{2} \newline \ & x_1, x_2 \geq 0 \end{align*} $$

Solution

We first convert the LP to standard form by introducing slack variables, $$ \begin{align*} \min \ & -x_1 - x_2 \newline \text{subject to} \ & 2x_1 + x_2 + s_1 = 2 \newline \ & -x_1 + x_2 + s_2 = \frac{1}{2} \newline \ & x_1, x_2, s_1, s_2 \geq 0 \end{align*} $$ Thus, $$ A = \begin{bmatrix} 2 & 1 & 1 & 0 \newline -1 & 1 & 0 & 1 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 2 \newline \frac{1}{2} \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} -1 \newline -1 \newline 0 \newline 0 \end{bmatrix} $$ We can see that $(\mathbf{x}, \mathbf{s}) = (\mathbf{0}, \mathbf{b})$ is an initial BFS we can use in this case. Thus, for iteration 1, we have, $$ \begin{align*} \mathbf{x}_B & = \begin{bmatrix} s_1 \newline s_2 \end{bmatrix}, \quad \mathbf{x}_N = \begin{bmatrix} x_1 \newline x_2 \end{bmatrix} \newline B & = \begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix}, \quad N = \begin{bmatrix} 2 & 1 \newline -1 & 1 \end{bmatrix} \end{align*} $$ Check that $\mathbf{x}_B = B^{-1} \mathbf{b} = \mathbf{b} \geq \mathbf{0}$, so we have a valid BFS, $$ \mathbf{x}_B = B^{-1} \mathbf{b} = \mathbf{I} \mathbf{b} = \begin{bmatrix} 2 \newline \frac{1}{2} \end{bmatrix}, \quad \mathbf{x}_N = \mathbf{0} $$ Next, we compute the reduced costs, $$ \begin{align*} \tilde{\mathbf{c}_N}^T & = \mathbf{c}_N^T - \mathbf{c}_B^T B^{-1} N \newline & = \begin{bmatrix} -1 & -1 \end{bmatrix} - \begin{bmatrix} 0 & 0 \end{bmatrix} \mathbf{I} \begin{bmatrix} 2 & 1 \newline -1 & 1 \end{bmatrix} \newline & = \begin{bmatrix} -1 & -1 \end{bmatrix} \end{align*} $$ Since $\tilde{\mathbf{c}_N}^T \not\geq \mathbf{0}$, we are not optimal, and we can choose $j = 1$, i.e., the first nonbasic variable $(\mathbf{x}_N)_1 = x_1$ is the incoming variable. Outgoing variable, $$ B^{-1} N \mathbf{e}_1 = \begin{bmatrix} 2 \newline -1 \end{bmatrix} \not\leq \mathbf{0} $$ Thus, it is not unbounded, and we can use the minimum ratio test, $$ \underset{k : (B^{-1} N \mathbf{e}_1)_k > 0}{\min} \frac{(B^{-1} \mathbf{b})_k}{(B^{-1} N \mathbf{e}_1)_k} = \frac{2}{2} = 1 $$ Thus, the minimizing argument is $k = 1$, i.e., the first basic variable, $(\mathbf{x}_B)_1 = s_1$ is the outgoing variable.

Iteration 2, $$ \begin{align*} \mathbf{x}_B & = \begin{bmatrix} x_1 \newline s_2 \end{bmatrix}, \quad \mathbf{x}_N = \begin{bmatrix} x_2 \newline s_1 \end{bmatrix} \newline B & = \begin{bmatrix} 2 & 0 \newline -1 & 1 \end{bmatrix}, \quad N = \begin{bmatrix} 1 & 1 \newline 1 & 0 \end{bmatrix} \newline \mathbf{c}_B & = \begin{bmatrix} -1 \newline 0 \end{bmatrix}, \quad \mathbf{c}_N = \begin{bmatrix} -1 \newline 0 \end{bmatrix} \end{align*} $$ Check that $\mathbf{x}_B = B^{-1} \mathbf{b} \geq \mathbf{0}$, so we have a valid BFS, $$ \mathbf{x}_B = B^{-1} \mathbf{b} = \begin{bmatrix} 1 \newline \frac{3}{2} \end{bmatrix}, \quad \mathbf{x}_N = \mathbf{0} $$ Next, we compute the reduced costs, $$ \begin{align*} \tilde{\mathbf{c}_N}^T & = \mathbf{c}_N^T - \mathbf{c}_B^T B^{-1} N \newline & = \begin{bmatrix} -1 & 0 \end{bmatrix} - \begin{bmatrix} -1 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 \newline \frac{1}{2} & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \newline 1 & 0 \end{bmatrix} \newline & = \begin{bmatrix} -1 & 0 \end{bmatrix} - \begin{bmatrix} -\frac{1}{2} & -\frac{1}{2} \end{bmatrix} \newline & = \begin{bmatrix} -\frac{1}{2} & \frac{1}{2} \end{bmatrix} \end{align*} $$ Since $\tilde{\mathbf{c}_N}^T \not\geq \mathbf{0}$, we are not optimal, and we can choose $j = 1$, i.e., the first nonbasic variable, $(\mathbf{x}_N)_1 = x_2$ is the incoming variable. Outgoing variable, $$ B^{-1} N \mathbf{e}_1 = \begin{bmatrix} \frac{1}{2} \newline \frac{3}{2} \end{bmatrix} \not\leq \mathbf{0} $$ Thus, it is not unbounded, and we can use the minimum ratio test, $$ \underset{k : (B^{-1} N \mathbf{e}_1)_k > 0}{\min} \frac{(B^{-1} \mathbf{b})_k}{(B^{-1} N \mathbf{e}_1)_k} = \min \left( \frac{1}{\frac{1}{2}}, \frac{\frac{3}{2}}{\frac{3}{2}} \right) = 1 $$ Thus, the minimizing argument is $k = 2$, i.e., the second basic variable, $(\mathbf{x}_B)_2 = s_2$ is the outgoing variable.

Iteration 3, $$ \begin{align*} \mathbf{x}_B & = \begin{bmatrix} x_1 \newline x_2 \end{bmatrix}, \quad \mathbf{x}_N = \begin{bmatrix} s_1 \newline s_2 \end{bmatrix} \newline B & = \begin{bmatrix} 2 & 1 \newline -1 & 1 \end{bmatrix}, \quad N = \begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix} \newline \mathbf{c}_B & = \begin{bmatrix} -1 \newline -1 \end{bmatrix}, \quad \mathbf{c}_N = \begin{bmatrix} 0 \newline 0 \end{bmatrix} \end{align*} $$ Check that $\mathbf{x}_B = B^{-1} \mathbf{b} \geq \mathbf{0}$, so we have a valid BFS, $$ \mathbf{x}_B = B^{-1} \mathbf{b} = \begin{bmatrix} \frac{1}{2} \newline 1 \end{bmatrix}, \quad \mathbf{x}_N = \mathbf{0} $$ Next, we compute the reduced costs, $$ \begin{align*} \tilde{\mathbf{c}_N}^T & = \mathbf{c}_N^T - \mathbf{c}_B^T B^{-1} N \newline & = \begin{bmatrix} 0 & 0 \end{bmatrix} - \begin{bmatrix} -1 & -1 \end{bmatrix} \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \newline \frac{1}{3} & \frac{2}{3} \end{bmatrix} \begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix} \newline & = \begin{bmatrix} 0 & 0 \end{bmatrix} - \begin{bmatrix} -\frac{2}{3} & -\frac{1}{3} \end{bmatrix} \newline & = \begin{bmatrix} \frac{2}{3} & \frac{1}{3} \end{bmatrix} \end{align*} $$ Since $\tilde{\mathbf{c}_N}^T \geq \mathbf{0}$, we are optimal, and we can stop.

This means that $\mathbf{x}^{\star} = \begin{bmatrix} x_1^{\star} & x_2^{\star} & s_1^{\star} & s_2^{\star} \end{bmatrix}^T = \begin{bmatrix} \frac{1}{2} & 1 & 0 & 0 \end{bmatrix}^T$ is an optimal solution to the original LP, with optimal objective value $\mathbf{c}^T \mathbf{x}^{\star} = -\frac{3}{2}$.
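
As a quick numerical cross-check of the example (assuming SciPy is available), solving the original inequality form with `scipy.optimize.linprog` should give the same point and an objective value of $-\frac{3}{2}$:

```python
from scipy.optimize import linprog

# Original (pre-standard-form) LP: min -x1 - x2 s.t. 2x1 + x2 <= 2, -x1 + x2 <= 1/2, x >= 0.
res = linprog(c=[-1, -1],
              A_ub=[[2, 1], [-1, 1]],
              b_ub=[2, 0.5],
              bounds=[(0, None), (0, None)],
              method="highs")
print(res.x, res.fun)   # expected: approximately [0.5, 1.0] and -1.5
```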