IE 534 Linear Programming Lecture Notes Fall 2011
Lizhi WangIowa State University
1 Introduction
Problem
Model
Algorithm
Solver
An example of a linear program:
max  ζ = 2x1 + 3x2          (1)
s. t. x1 + 4x2 ≤ 8          (2)
      x1 − x2 ≤ 1           (3)
      x1, x2 ≥ 0.           (4)
Here x1 and x2 are variables whose values need to be determined to maximize the linear function ζ = 2x1 + 3x2 subject to the linear constraints (2)–(4).
We can get the optimal solution to LP (1)–(4) graphically as illustrated in Figure 1. There are three steps:

1. Find the region that satisfies all constraints (shaded region in Figure 1),

2. Find the direction that the objective function is maximizing towards, and

3. Eyeball the optimal solution (for the example in Figure 1, x1* = 2.4, x2* = 1.4, and ζ* = 9).

These three steps can only be used to manually find the optimal solution to linear programs with two variables.

Figure 1: Graphical solution to LP (1)–(4)
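The eyeballing step can be mimicked in code: when an optimal solution exists it occurs where constraint lines intersect, so we can enumerate all pairwise intersections, keep the feasible ones, and pick the best. This is a quick plain-Python sketch for LP (1)-(4), added by us and not part of the original notes:

```python
from itertools import combinations

# Constraints of LP (1)-(4) written as a*x1 + b*x2 <= c,
# including nonnegativity as -x1 <= 0 and -x2 <= 0.
cons = [(1, 4, 8), (1, -1, 1), (-1, 0, 0), (0, -1, 0)]

def intersect(c1, c2):
    """Solve the 2x2 system where both constraints hold at equality (Cramer's rule)."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None  # parallel lines, no unique intersection
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

def feasible(pt):
    return all(a * pt[0] + b * pt[1] <= c + 1e-9 for a, b, c in cons)

corners = [p for c1, c2 in combinations(cons, 2)
           if (p := intersect(c1, c2)) is not None and feasible(p)]
best = max(corners, key=lambda p: 2 * p[0] + 3 * p[1])
print(best, 2 * best[0] + 3 * best[1])  # (2.4, 1.4) with zeta = 9
```

This brute-force approach only works for tiny two-variable examples, but it confirms the values read off from Figure 1.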
Copyright Lizhi Wang, 2011. All rights reserved. If you have to print this document, please consider double-sided printing to save some trees.
In general, a linear program is given as follows:
max  ζ = c1x1 + c2x2 + ... + cnxn                   (5)
s. t. a1,1x1 + a1,2x2 + ... + a1,nxn ≤ b1           (6)
      ...
      ai,1x1 + ai,2x2 + ... + ai,nxn = bi           (7)
      ...
      aj,1x1 + aj,2x2 + ... + aj,nxn ≥ bj           (8)
      ...
      am,1x1 + am,2x2 + ... + am,nxn ≤ bm           (9)
      l1 ≤ x1 ≤ u1                                  (10)
      l2 ≤ x2 ≤ u2                                  (11)
      ...
      ln ≤ xn ≤ un.                                 (12)
The variables x1, x2, ..., xn are called decision variables, and a, b, c, l, and u are given parameters.
A solution x = [x1, x2, ..., xn]ᵀ is a feasible solution if it satisfies all constraints (6)–(12). Similarly, a solution is an infeasible solution if it violates any constraint.
The set of all feasible solutions is called the feasible region or feasible set.
If li = −∞ and ui = +∞, then xi is a free or unrestricted variable.
The linear function (5) is the objective function.
ζ represents the value of the objective function, called the objective value. For a given solution x, we use ζ(x) to denote the objective value of x: ζ(x) = c1x1 + c2x2 + ... + cnxn.

The symbol max indicates that we want to maximize the objective value ζ. We may also use min to indicate the opposite.

The symbol s.t. is short for "subject to" or "such that", which starts the list of constraints that a feasible solution must satisfy.

A solution x* is optimal if it is feasible and ζ(x*) ≥ ζ(x) for any other feasible solution x.
The solution of an LP has three possibilities:

1. There exists an optimal solution, either uniquely or among infinitely many others.

2. The LP is infeasible, which means that a feasible solution does not exist.

3. The LP is unbounded, which means that for any real number K, there always exists a feasible solution x such that ζ(x) > K. In that case, the optimal objective value is said to be +∞.
In many situations, it is convenient to study a linear program in the following standard form:
max  ζ = c1x1 + c2x2 + ... + cnxn                   (13)
s. t. a1,1x1 + a1,2x2 + ... + a1,nxn ≤ b1           (14)
      ...
      am,1x1 + am,2x2 + ... + am,nxn ≤ bm           (15)
      x1, x2, ..., xn ≥ 0.                          (16)
The definition of a standard form is not unique, but may depend on context or personal preference.
The following tricks can be used to transform any linear program to the standard form:
For a minimization objective

min ζ = c1x1 + c2x2 + ... + cnxn,

we can replace it with

max −ζ = −c1x1 − c2x2 − ... − cnxn.

For a greater-than-or-equal-to constraint

aj,1x1 + aj,2x2 + ... + aj,nxn ≥ bj,

we can replace it with

−aj,1x1 − aj,2x2 − ... − aj,nxn ≤ −bj.

For an equality constraint

ai,1x1 + ai,2x2 + ... + ai,nxn = bi,

we can replace it with two less-than-or-equal-to constraints

ai,1x1 + ai,2x2 + ... + ai,nxn ≤ bi
−ai,1x1 − ai,2x2 − ... − ai,nxn ≤ −bi.

For a variable with both lower and upper bounds

lk ≤ xk ≤ uk,                                       (17)

there are four cases:

1. If lk = −∞ and uk is finite, we can rewrite (17) as xk − uk ≤ 0, and then define x̃k = uk − xk. Now, if we substitute xk with uk − x̃k, (17) can be replaced by

x̃k ≥ 0.

2. If lk is finite and uk = +∞, we can rewrite (17) as 0 ≤ xk − lk, and then define x̃k = xk − lk. Now, if we substitute xk with x̃k + lk, (17) can be replaced by

x̃k ≥ 0.

3. If both lk and uk are finite, we can rewrite (17) as 0 ≤ xk − lk ≤ uk − lk, and then define x̃k = xk − lk. Now, if we substitute xk with x̃k + lk, (17) can be replaced by

x̃k ≤ uk − lk
x̃k ≥ 0.

4. If lk = −∞ and uk = +∞, we can introduce two nonnegative variables x+k and x−k. Now, if we substitute xk with x+k − x−k, (17) can be replaced by

x+k ≥ 0
x−k ≥ 0.
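These transformations are mechanical, so they are easy to script. The sketch below is our own illustration (the function name and the tiny LP are made up for the example): it converts a min objective, a ≥ constraint, and an equality into max/≤ standard form over nonnegative variables, and checks that a feasible point of the original LP stays feasible after the conversion.

```python
def to_standard_form(c, rows):
    """Convert min c'x over mixed rows into max (-c)'x s.t. A x <= b.
    Each row is (coeffs, sense, rhs) with sense in {'<=', '>=', '='}.
    Variables are assumed nonnegative already."""
    A, b = [], []
    for coeffs, sense, rhs in rows:
        if sense in ('<=', '='):
            A.append(list(coeffs)); b.append(rhs)
        if sense in ('>=', '='):        # flip signs: a'x >= r  ->  -a'x <= -r
            A.append([-a for a in coeffs]); b.append(-rhs)
    return [-ci for ci in c], A, b      # min c'x == -max (-c)'x

# min x1 + x2  s.t.  x1 + 2x2 >= 4,  x1 - x2 = 1,  x1, x2 >= 0  (made-up data)
c, A, b = to_standard_form([1, 1], [([1, 2], '>=', 4), ([1, -1], '=', 1)])
x = (2, 1)                              # a feasible point of the original LP
assert all(sum(a * v for a, v in zip(row, x)) <= rhs for row, rhs in zip(A, b))
print(c, A, b)
```

Note how the equality row expands into two ≤ rows, exactly as described above.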
We often write the standard form LP (13)–(16) in matrix notation:

max  ζ = cᵀx
s. t. Ax ≤ b
      x ≥ 0,

where c = [c1, c2, ..., cn]ᵀ and x = [x1, x2, ..., xn]ᵀ are n×1 vectors, A is the m×n matrix with entries ai,j, and b = [b1, b2, ..., bm]ᵀ is an m×1 vector. We will use n and m to denote the number of variables and the number of constraints, respectively.
The idea of making decisions to maximize an objective function subject to certain constraints applies to more general forms of mathematical programs:

max{f(x) : G(x) ≤ 0m×1, x ∈ X},

where f(·) : Rⁿ → R, G(·) : Rⁿ → Rᵐ, and X is the feasible region.
Linear Programming has a linear objective function f(x) and linear constraints G(x) ≤ 0. For the linear program (1)–(4),

f(x) = 2x1 + 3x2,

G(x) = [ 1   4 ] [ x1 ]  −  [ 8 ]
       [ 1  −1 ] [ x2 ]     [ 1 ],

X = {(x1, x2) : x1 ≥ 0, x2 ≥ 0}.
Nonlinear Programming has a nonlinear objective function f(x) and/or nonlinear constraints G(x) ≤ 0. For example,

max  2x1² + 3x2             (18)
s. t. x1² + 4x2 ≤ 8         (19)
      x1 − x2 ≤ 1           (20)
      x1, x2 ≥ 0.           (21)
Here,

f(x) = 2x1² + 3x2,

G(x) = [ 1  0 ] [ x1² ]  +  [ 0   4 ] [ x1 ]  −  [ 8 ]
       [ 0  0 ] [ x2² ]     [ 1  −1 ] [ x2 ]     [ 1 ],

X = {(x1, x2) : x1 ≥ 0, x2 ≥ 0}.
This problem is illustrated in Figure 2.
Figure 2: Graphical solution to nonlinear program (18)–(21)
Integer Programming requires that some or all of the variables be integers. For example,
max  2x1 + 3x2                  (22)
s. t. x1 + 4x2 ≤ 8              (23)
      x1 − x2 ≤ 1               (24)
      x1, x2 ∈ {0, 1, 2, ...}.  (25)

Here,

f(x) = 2x1 + 3x2,

G(x) = [ 1   4 ] [ x1 ]  −  [ 8 ]
       [ 1  −1 ] [ x2 ]     [ 1 ],

X = {(x1, x2) : x1, x2 ∈ {0, 1, 2, ...}}.
This problem is illustrated in Figure 3.
Figure 3: Graphical solution to integer program (22)–(25)
2 Computer Solvers
There are numerous LP computer solvers, among which are GLPK (GNU Linear Programming Kit) and MATLAB. Below are examples of solving the LP (1)–(4) using GLPK and MATLAB.
2.1 GLPK for linear program
Step 0: Download a compiled version of GLPK for Windows from the course web. A good location to extract the files to is C:\\User Files. Do NOT extract them to the U: drive.
Step 1: Create a text document named example1.txt in the same folder with GLPK, type the following codes, and save the file.

var x1 >= 0;
var x2 >= 0;
maximize
zeta: 2 * x1 + 3 * x2;
subject to
c1: x1 + 4 * x2 <= 8;
c2: x1 - x2 <= 1;
end;
2.2 MATLAB for linear program

Step 1: In the MATLAB editor, type the following codes:

c = [-2, -3];
A = [1, 4; 1, -1];
b = [8, 1];
Aeq = [];
beq = [];
lx = [0, 0];
[x, zeta] = linprog(c, A, b, Aeq, beq, lx)

The default LP formulation that MATLAB assumes is min{cᵀx : Ax ≤ b, Aeq x = beq, lx ≤ x ≤ ux}, so we need to give it the negative c. Save this file somewhere, and give it a name, e.g., example1. Do NOT use linprog or quadprog as the name, because they are reserved for MATLAB system functions.
Step 2: Press F5 or hit the run button. The result will be given in the Command Window:

Optimization terminated.
x =
    2.4000
    1.4000
zeta =
   -9.0000
The optimal solution here is the same as given by GLPK, but the optimal objective value has an opposite sign since MATLAB solves a minimization problem.
For more information about MATLAB, read its help document and/or visit its website:http://www.mathworks.com.
2.3 MATLAB for quadratic program
MATLAB has a function quadprog that solves quadratic programs with linear constraints, for example

min  0.5x1² + x2² − x1x2 − 2x1 − 6x2
s. t. x1 + x2 ≤ 2
      −x1 + 2x2 ≤ 2
      2x1 + x2 ≤ 3
      x1, x2 ≥ 0.
This QP can be equivalently rewritten in the following matrix form:

min{ cᵀx + (1/2) xᵀQx : Ax ≤ b, x ≥ 0 },

where

c = [ −2 ],  Q = [  1  −1 ],  A = [  1  1 ],  b = [ 2 ]
    [ −6 ]       [ −1   2 ]       [ −1  2 ]       [ 2 ]
                                  [  2  1 ]       [ 3 ].

Here the matrix Q can be obtained as Qi,j = ∂²f(x)/(∂xi ∂xj), in which f(x) denotes the objective function.
The default QP formulation that MATLAB assumes is min{cᵀx + 0.5xᵀQx : Ax ≤ b, Aeq x = beq, lx ≤ x ≤ ux}. So, we can use the following MATLAB codes to solve the problem:
A = [1, 1; -1, 2; 2, 1];
b = [2, 2, 3];
c = [-2, -6];
Aeq = [];
beq = [];
Q = [1, -1; -1, 2];
lx = zeros(2,1);
[x, zeta] = quadprog(Q, c, A, b, Aeq, beq, lx)
The optimal solution is x1* = 0.67, x2* = 1.33, and the optimal objective value is ζ* = −8.22.
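These numbers can be sanity-checked without MATLAB. A crude feasible-grid search in plain Python (our own sketch, not part of the notes) lands on the same point, because the true optimum (2/3, 4/3) happens to sit exactly on a grid with step 1/300:

```python
def f(x1, x2):
    # QP objective: 0.5*x1^2 + x2^2 - x1*x2 - 2*x1 - 6*x2
    return 0.5 * x1 * x1 + x2 * x2 - x1 * x2 - 2 * x1 - 6 * x2

def feasible(x1, x2, tol=1e-9):
    # the three linear constraints; x1, x2 >= 0 holds by grid construction
    return (x1 + x2 <= 2 + tol and -x1 + 2 * x2 <= 2 + tol
            and 2 * x1 + x2 <= 3 + tol)

step = 300  # grid resolution: candidate points are k/step
best = min((f(i / step, j / step), i / step, j / step)
           for i in range(2 * step + 1) for j in range(2 * step + 1)
           if feasible(i / step, j / step))
print(best)  # objective about -8.2222 at x about (0.6667, 1.3333)
```

The minimum value is −74/9 ≈ −8.22, matching the quadprog result up to rounding.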
2.4 GLPK for mixed integer linear program
Integer programs that contain both continuous and integer variables are called mixed integer programs. GLPK is able to solve mixed integer linear programs. For example, to solve
min  x1 − 2x2 − 3x3                                  (26)
s. t. x1 + 7x2 − 8x3 ≤ 12                            (27)
      x1 + x2 + 3x3 ≥ 1                              (28)
      5x2 + 9x3 = 13                                 (29)
      x1 ∈ {0, 1},  x2 ∈ {0, 1, 2, 3},  x3 ≥ 0,      (30)
we can use the following GLPK codes:

var x1 binary;
var x2 integer, >= 0, <= 3;
var x3 >= 0;
minimize
zeta: x1 - 2 * x2 - 3 * x3;
subject to
c1: x1 + 7 * x2 - 8 * x3 <= 12;
c2: x1 + x2 + 3 * x3 >= 1;
c3: 5 * x2 + 9 * x3 = 13;
end;
The optimal solution is x1* = 0, x2* = 2, x3* = 0.33, and the optimal objective value is ζ* = −5.
2.5 GLPKMEX for mixed integer linear program
GLPKMEX is a MATLAB interface to GLPK, which enables one to use GLPK through MATLAB codes. A copy of GLPKMEX can be downloaded from the course web. With MATLAB's current directory being where GLPKMEX is saved, the following MATLAB codes can be used to solve the integer program (26)–(30):

c = [1, -2, -3];
A = [1, 7, -8; 1, 1, 3; 0, 5, 9];
b = [12, 1, 13];
lx = [0, 0, 0];
ux = [1, 3, inf];
ctype = ['U', 'L', 'S'];
vartype = ['B', 'I', 'C'];
s = 1;
[x, zeta] = glpk(c, A, b, lx, ux, ctype, vartype, s)
For more information about GLPKMEX, read the file glpk.m and/or visit its website:http://glpkmex.sourceforge.net.
3 Modeling
Mathematical programming is rooted in real life problems, where people always have some objectives to maximize or minimize, but these objectives are kept from going to infinity by various constraints. An important goal of this course is to develop the skills of using mathematical programming (especially linear programming) to formulate, solve, and analyze real life problems. Three examples are given below.
3.1 Arbitrage in currency exchange
In currency markets, arbitrage is trading among different currencies in order to profit from a rate discrepancy. For example, if the exchange rates between dollar and euro are 1 dollar → 0.7 euro and 1 euro → 1.5 dollar, then a profit can be made by trading from dollar to euro and then back to dollar. Obviously, arbitrage opportunities are not supposed to exist, at least in theory. The following are actual trades made on February 14 of 2002 with minor modification. How can we use an LP model to identify if an arbitrage opportunity exists in this example? Notice that an arbitrage may involve more than two currencies, e.g., Dollar → Yen → Pound → Euro → Dollar.
from \ to   Dollar     Euro       Pound      Yen
Dollar      1          1.1468     0.7003     133.30
Euro        0.8716     1          0.6097     116.12
Pound       1.4279     1.6401     1          190.45
Yen         0.00750    0.00861    0.00525    1
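Before writing the LP, this small table can simply be brute-forced: enumerate every trading cycle that starts and ends in dollars and visits up to three other currencies, and multiply the rates along it; a product above 1 is an arbitrage. This plain-Python check is our own sketch (it is not the LP model the exercise asks for):

```python
from itertools import permutations

rates = {  # rates[a][b]: units of currency b received per unit of currency a
    'Dollar': {'Dollar': 1, 'Euro': 1.1468, 'Pound': 0.7003, 'Yen': 133.30},
    'Euro':   {'Dollar': 0.8716, 'Euro': 1, 'Pound': 0.6097, 'Yen': 116.12},
    'Pound':  {'Dollar': 1.4279, 'Euro': 1.6401, 'Pound': 1, 'Yen': 190.45},
    'Yen':    {'Dollar': 0.00750, 'Euro': 0.00861, 'Pound': 0.00525, 'Yen': 1},
}

others = ['Euro', 'Pound', 'Yen']
best_cycle, best_product = None, 1.0
for n in (1, 2, 3):                     # cycles through 1 to 3 other currencies
    for mid in permutations(others, n):
        cycle = ('Dollar',) + mid + ('Dollar',)
        p = 1.0
        for a, b in zip(cycle, cycle[1:]):
            p *= rates[a][b]
        if p > best_product:
            best_cycle, best_product = cycle, p
print(best_cycle, best_product)
```

It finds that Dollar → Pound → Euro → Dollar multiplies to about 1.0011, i.e., a 0.11% riskless gain, so an arbitrage opportunity does exist in this table.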
3.2 Turning junk into treasure
A company's large machine broke down. Instead of buying a new one, which is very expensive, they find that there are some old and broken machines of the same type in the junk yard, so they decide to reassemble a working machine out of the broken ones. This type of machine has ten different components, connected one after another like a train as illustrated in Figure 4. Components at the same position from different machines are interchangeable but components at different positions are not. A 0 means that the component is broken and a 1 means that the component is still good. They plan to reassemble the good components to make a complete working machine. The numbers between components represent the disassembling and reassembling costs. We assume that these costs are the same for all old machines. For example, they can make a working machine from (a) and (b), which requires three assembling points at (#2,#3), (#4,#5), and (#8,#9), with a total cost of 5 + 11 + 13 = 29.
Figure 4: Four examples of old machines
[Component statuses #1 through #10: (a) 1,1,0,0,1,1,1,1,0,0; (b) 0,1,1,1,0,1,0,0,1,1; (c) 0,0,0,0,1,1,1,1,1,1; (d) 1,1,1,1,0,0,0,0,0,0. Costs between consecutive components: 3, 5, 4, 11, 9, 7, 10, 13, 6.]
Suppose that there are 20 old machines in their junk yard, and the functioning status of their components is given in matrix A, in which each column represents an old machine. The assembling costs are given in vector C, with Ci representing the disassembling and reassembling cost between Ai,j and Ai+1,j for all j. Build a linear programming or integer programming model to find the least costly way to make a complete working machine using these old ones. Which old machines should be used to make it? What is the least possible cost?
A =
0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0
1 0 0 0 0 1 0 0 1 0 1 0 1 0 0 0 1 0 0 0
0 0 0 0 0 0 0 1 0 0 1 1 1 1 0 0 0 0 0 1
0 1 0 1 0 1 0 1 0 1 0 1 1 1 1 0 0 0 0 1
1 0 0 1 1 0 0 0 0 1 1 1 0 0 0 0 1 0 0 1
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 1 1 0
0 0 1 1 1 0 0 0 0 0 1 0 0 1 0 0 0 1 0 1
0 0 0 0 1 1 0 1 1 0 1 1 0 0 0 0 0 0 0 0

C = [3, 5, 4, 11, 9, 7, 10, 13, 6]ᵀ.
3.3 Old McDonald had a farm
A farmer can grow wheat, corn, and beans on his 500-acre farm. The planting costs are $150/acre, $230/acre, and $260/acre, and expected yields are 2.5 tons/acre, 3 tons/acre, and 20 tons/acre, respectively. He needs 200 tons of wheat and 240 tons of corn to feed his cattle. The farmer can sell wheat and corn to a wholesaler at $170/ton and $150/ton. He can also buy wheat and corn
from the wholesaler at $238/ton and $210/ton. Beans sell at $36/ton for the first 6000 tons. Due to economic quotas on bean production, beans in excess of 6000 tons can only be sold at $10/ton. How shall the farmer allocate his 500 acres to maximize his profit? Define

xW, xC, xB: acres of wheat, corn, and beans planted
wW, wC, wB: tons of wheat, corn, and beans sold at favorable price
eB: tons of beans sold at lower price
yW, yC: tons of wheat and corn purchased,
then the profit maximization problem can be formulated as:
max  ζ = −150xW − 230xC − 260xB − 238yW + 170wW − 210yC + 150wC + 36wB + 10eB
s. t. xW + xC + xB ≤ 500
      2.5xW + yW − wW = 200
      3xC + yC − wC = 240
      20xB − wB − eB = 0
      wB ≤ 6000
      xW, xC, xB, wW, wC, wB, eB, yW, yC ≥ 0.
The optimal solution of this problem is xW* = 120, xC* = 80, xB* = 300, wW* = 100, wC* = 0, wB* = 6000, eB* = 0, yW* = 0, yC* = 0, ζ* = $118,600.
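This reported solution can be verified in plain Python (a check we added; it is not in the original notes) by plugging it into the objective and constraints:

```python
def profit(xW, xC, xB, yW, wW, yC, wC, wB, eB):
    # objective of the farmer's LP
    return (-150 * xW - 230 * xC - 260 * xB
            - 238 * yW + 170 * wW - 210 * yC + 150 * wC
            + 36 * wB + 10 * eB)

sol = dict(xW=120, xC=80, xB=300, yW=0, wW=100, yC=0, wC=0, wB=6000, eB=0)

assert sol['xW'] + sol['xC'] + sol['xB'] <= 500                # land
assert 2.5 * sol['xW'] + sol['yW'] - sol['wW'] == 200          # wheat balance
assert 3 * sol['xC'] + sol['yC'] - sol['wC'] == 240            # corn balance
assert 20 * sol['xB'] - sol['wB'] - sol['eB'] == 0             # bean balance
assert sol['wB'] <= 6000                                       # bean quota
print(profit(**sol))  # 118600
```

All constraints hold and the objective evaluates to $118,600, as stated.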
While this solution makes sense, it assumes that the yields (2.5 tons/acre, 3 tons/acre, and 20 tons/acre) are known. However, yields are greatly dependent on the weather and could increase by 20% in a good year and decrease by 20% in a bad one.
If we know in advance that next year will be a good year and the yields become 3 tons/acre, 3.6 tons/acre, and 24 tons/acre, then the optimal solution becomes xW* = 183.33, xC* = 66.67, xB* = 250, wW* = 350, wC* = 0, wB* = 6000, eB* = 0, yW* = 0, yC* = 0, ζ* = $167,667.
If we know in advance that next year will be a bad year and the yields become 2 tons/acre, 2.4 tons/acre, and 16 tons/acre, then the optimal solution becomes xW* = 100, xC* = 25, xB* = 375, wW* = 0, wC* = 0, wB* = 6000, eB* = 0, yW* = 0, yC* = 180, ζ* = $59,950.
However, without knowing the weather of next year in advance, how can we formulate an LP model to find the optimal solution that maximizes the expected profit? Suppose the probabilities that next year will be a good, average, or bad year are 0.35, 0.4, and 0.25, respectively. Notice that the farm allocation decisions (xW, xC, xB) must be made at the beginning of the year without knowing the weather, and (yW, wW, yC, wC, wB, eB) can be made after observing the weather. Therefore we can have three recourse solutions of the latter group: (yGW, wGW, yGC, wGC, wGB, eGB) for a good year, (yAW, wAW, yAC, wAC, wAB, eAB) for an average year, and (yBW, wBW, yBC, wBC, wBB, eBB) for a bad year. The expected profit can then be defined as:

E[ζ] = −150xW − 230xC − 260xB
     + 0.35 (−238yGW + 170wGW − 210yGC + 150wGC + 36wGB + 10eGB)
     + 0.4  (−238yAW + 170wAW − 210yAC + 150wAC + 36wAB + 10eAB)
     + 0.25 (−238yBW + 170wBW − 210yBC + 150wBC + 36wBB + 10eBB).
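Once the weather is known, the recourse decisions are simple: buy any shortage, sell any surplus, and sell beans at $36/ton up to the quota and $10/ton beyond. Under that assumption (a greedy recourse, which is optimal here because selling prices are below buying prices), the expected profit of any first-stage allocation can be evaluated directly. This plain-Python sketch is our own addition; it reproduces the average-year profit above:

```python
def recourse_profit(acres, yields):
    """Best second-stage profit for a given allocation once yields are known."""
    xW, xC, xB = acres
    planting = 150 * xW + 230 * xC + 260 * xB
    # wheat: meet the 200-ton requirement, sell surplus at 170, buy shortage at 238
    wheat = xW * yields[0] - 200
    wheat_cash = 170 * wheat if wheat >= 0 else 238 * wheat
    # corn: requirement 240, sell surplus at 150, buy shortage at 210
    corn = xC * yields[1] - 240
    corn_cash = 150 * corn if corn >= 0 else 210 * corn
    # beans: 36 per ton up to 6000 tons, 10 per ton beyond
    beans = xB * yields[2]
    bean_cash = 36 * min(beans, 6000) + 10 * max(beans - 6000, 0)
    return -planting + wheat_cash + corn_cash + bean_cash

scenarios = {  # probability, (wheat, corn, bean) yields in tons/acre
    'good':    (0.35, (3.0, 3.6, 24.0)),
    'average': (0.40, (2.5, 3.0, 20.0)),
    'bad':     (0.25, (2.0, 2.4, 16.0)),
}

acres = (120, 80, 300)
expected = sum(p * recourse_profit(acres, y) for p, y in scenarios.values())
print(recourse_profit(acres, scenarios['average'][1]), expected)
```

Note that (120, 80, 300) is optimal for the average year but not necessarily for the expected-profit problem; the full two-stage LP optimizes over the allocation as well.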
3.4 Minimizing a convex piecewise linear objective function
A function f : Rⁿ → R is called convex if for all x, y ∈ Rⁿ and all λ ∈ [0, 1], we have

f[λx + (1 − λ)y] ≤ λf(x) + (1 − λ)f(y).

A function f : Rⁿ → R is called concave if for all x, y ∈ Rⁿ and all λ ∈ [0, 1], we have

f[λx + (1 − λ)y] ≥ λf(x) + (1 − λ)f(y).
Some properties about convex and concave functions:
1. If f is a convex function, then −f is a concave function.

2. If f1, f2, ..., fn are all convex functions, then max{f1, f2, ..., fn} is a convex function.

3. If f1, f2, ..., fn are all concave functions, then min{f1, f2, ..., fn} is a concave function.
It is much easier to find the minimum of a convex function than a concave one. Similarly, it is much easier to find the maximum of a concave function than a convex one. Consider the following problem:
min  f(x1, x2)              (31)
s. t. x1 + 4x2 ≤ 8          (32)
      x1 − x2 ≤ 1           (33)
      x1, x2 ≥ 0,           (34)
where the objective function f(x1, x2) is a convex piecewise linear function:
f(x1, x2) = max{x1 − 3x2, −4x1 − x2, −2x1 + 5x2}.
To reformulate problem (31)–(34) as a linear program, we use the fact that max{x1 − 3x2, −4x1 − x2, −2x1 + 5x2} is equal to the smallest number z that satisfies z ≥ x1 − 3x2, z ≥ −4x1 − x2, and z ≥ −2x1 + 5x2. Therefore, problem (31)–(34) is equivalent to the following linear program:
min  z
s. t. x1 + 4x2 ≤ 8
      x1 − x2 ≤ 1
      z ≥ x1 − 3x2
      z ≥ −4x1 − x2
      z ≥ −2x1 + 5x2
      x1, x2 ≥ 0;  z free.
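A quick numeric illustration of this epigraph trick (our own, not from the notes): brute-forcing problem (31)-(34) on a grid finds the same optimal value that the LP reformulation would report as its optimal z, namely −0.2 at (1.6, 0.6), where the first and third pieces of f are both active.

```python
def f(x1, x2):
    # the convex piecewise linear objective: max of three linear pieces
    return max(x1 - 3 * x2, -4 * x1 - x2, -2 * x1 + 5 * x2)

def feasible(x1, x2, tol=1e-9):
    # constraints (32)-(33); x1, x2 >= 0 holds by grid construction
    return x1 + 4 * x2 <= 8 + tol and x1 - x2 <= 1 + tol

best = min((f(i / 100, j / 100), i / 100, j / 100)
           for i in range(801) for j in range(201)
           if feasible(i / 100, j / 100))
print(best)  # about (-0.2, 1.6, 0.6)
```

The smallest z that dominates all three pieces at the optimizer is exactly f there, which is why the z-reformulation is an equivalent LP.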
Challenge 1: Can you reformulate the problem if the objective function is redefined as

f(x1, x2) = min{x1 − 3x2, −4x1 − x2, −2x1 + 5x2}?
3.5 Dealing with binary decision variables
Consider the following nonlinear nonconvex binary program:
max x1 2x2 + 5x3 x21 + 5x22 + 17x23 7x1x2 4x1x3 + 11x2x3 (35)s. t. x1 + 4x2 2x3 4 (36)
4x1x2 + 3x23 2x1x3 x2 3 (37){x1, x2, x3} 6= {1, 0, 1} (38)x1, x2, x3 binary. (39)
This problem can be reformulated as an LP with binary decision variables by using the followingtricks.
Replace squared terms x1², x2², and x3² with x1, x2, and x3, respectively (for a binary xi, xi² = xi).

For other quadratic terms such as x1x3, replace it with a new binary variable x13 and add two new constraints 2x13 ≤ x1 + x3 ≤ 1 + x13.

Replace constraint (38) with a new constraint (1 − x1) + x2 + (1 − x3) ≥ 1.
The resulting model is the following LP with binary decision variables:
max  x1 − 2x2 + 5x3 − x1 + 5x2 + 17x3 − 7x12 − 4x13 + 11x23     (40)
s. t. x1 + 4x2 − 2x3 ≤ 4                                        (41)
      4x12 + 3x3 − 2x13 − x2 ≤ 3                                (42)
      2x12 ≤ x1 + x2 ≤ 1 + x12                                  (43)
      2x13 ≤ x1 + x3 ≤ 1 + x13                                  (44)
      2x23 ≤ x2 + x3 ≤ 1 + x23                                  (45)
      (1 − x1) + x2 + (1 − x3) ≥ 1                              (46)
      x1, x2, x3, x12, x13, x23 binary.                         (47)
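Since there are only eight binary assignments, the linearization can be verified exhaustively: for every (x1, x2, x3), setting xij = xi·xj makes the linear objective equal the nonlinear one, satisfies the linking constraints, and makes the linearized constraint match the original quadratic one. A plain-Python check (ours, not from the notes):

```python
from itertools import product

def original_obj(x1, x2, x3):
    # nonlinear objective (35)
    return (x1 - 2*x2 + 5*x3 - x1**2 + 5*x2**2 + 17*x3**2
            - 7*x1*x2 - 4*x1*x3 + 11*x2*x3)

def linear_obj(x1, x2, x3, x12, x13, x23):
    # linearized objective (40)
    return x1 - 2*x2 + 5*x3 - x1 + 5*x2 + 17*x3 - 7*x12 - 4*x13 + 11*x23

for x1, x2, x3 in product((0, 1), repeat=3):
    x12, x13, x23 = x1 * x2, x1 * x3, x2 * x3
    # objectives agree at every binary point
    assert original_obj(x1, x2, x3) == linear_obj(x1, x2, x3, x12, x13, x23)
    # the linking constraints (43)-(45) hold when xij = xi * xj
    for xi, xj, xij in ((x1, x2, x12), (x1, x3, x13), (x2, x3, x23)):
        assert 2 * xij <= xi + xj <= 1 + xij
    # linearized constraint (42) matches the quadratic constraint (37)
    assert 4*x12 + 3*x3 - 2*x13 - x2 == 4*x1*x2 + 3*x3**2 - 2*x1*x3 - x2
print("linearization agrees on all 8 binary points")
```

The same enumeration idea is also a handy way to explore Challenge 2 below.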
Challenge 2: The formulation (40)–(47) can be further simplified because some constraints are redundant. Without actually solving for the optimal solution, can you identify which constraints can be removed and still guarantee the model will give the correct optimal solution?
Challenge 3: If we have an x1x2x3 term in the objective function, how can you linearize this nonlinear term by introducing new variables and constraints?
3.6 At least k out of m constraints must hold
The following problem does not require all constraints to hold, but at least k out of m must do:

max  cᵀx
s. t. a1x ≤ b1
      a2x ≤ b2
      ...
      amx ≤ bm
      (at least k out of m constraints must hold)
      x ≥ 0.
We can reformulate this problem as a mixed integer program by introducing some binary variables y and a constant M whose value is set to be sufficiently large so that the constraint aix ≤ bi + M can be ignored because it will never get violated:

max  cᵀx
s. t. a1x ≤ b1 + M(1 − y1)
      a2x ≤ b2 + M(1 − y2)
      ...
      amx ≤ bm + M(1 − ym)
      y1 + y2 + ... + ym ≥ k
      x ≥ 0,  y binary.
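A tiny numeric illustration of what the y vector does (our own example): maximize x subject to "at least 2 of x ≤ 1, x ≤ 2, x ≤ 3 must hold". Choosing which constraints to enforce is exactly choosing y; enforcing x ≤ 2 and x ≤ 3 is the best choice and gives x = 2.

```python
from itertools import combinations

bounds = [1, 2, 3]   # the constraints x <= 1, x <= 2, x <= 3
k = 2

# For each choice of at least k constraints to enforce (the role of y),
# the largest feasible x is the smallest enforced bound.
best = max(min(chosen)
           for r in range(k, len(bounds) + 1)
           for chosen in combinations(bounds, r))
print(best)  # 2
```

A mixed integer solver applied to the big-M formulation would discover the same choice automatically.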
Challenge 4: Can you reformulate the problem if the "at least k out of m constraints must hold" requirement is changed to "exactly k out of m constraints must hold"?
4 Optimality Conditions
Different algorithms may vary in details, but most of them contain the same basic steps:
Step 1: Find a feasible solution to start from.

Step 2: Check to see if the current solution is optimal.

Step 3: If the solution is optimal then stop. Otherwise find a better solution and go to Step 2.
In order for an algorithm to be thorough, it also needs to be able to identify infeasible or unbounded problems.
We first introduce KKT (Karush–Kuhn–Tucker) conditions, which are widely used for checking the optimality of a solution to mathematical programs. We consider a general form nonlinear program

max{f(x) : G(x) ≥ 0m×1},                            (48)

where f(·) : Rⁿ → R and G(·) : Rⁿ → Rᵐ. The KKT conditions are: there exists a vector y ∈ Rᵐ such that the following constraints hold:

∇f(x) + Σ(i=1..m) yi∇Gi(x) = 0n×1                   (49)
Gi(x) ≥ 0,      i = 1, ..., m                       (50)
yi ≥ 0,         i = 1, ..., m                       (51)
yiGi(x) = 0,    i = 1, ..., m.                      (52)
To interpret KKT conditions from a more intuitive perspective, let's look at the LP (1)–(4) again and use it as an example to see why the KKT conditions (49)–(52) make sense. If we rewrite LP (1)–(4) in the form of (48), then f(x) = 2x1 + 3x2,

G(x) = [ G1(x) ]   [ 8 − x1 − 4x2 ]
       [ G2(x) ] = [ 1 − x1 + x2  ]
       [ G3(x) ]   [ x1           ]
       [ G4(x) ]   [ x2           ],

m = 4, and n = 2.
Figure 4: Figure 1 from a different angle
We look at Figure 1 from a different angle as if the constraints are walls and the objective function is towards the direction of gravity, as is shown in Figure 4. Now, what will happen if we put a small ping-pong ball inside the two-dimensional feasible region and let it go? Since this environment is exactly analogous to LP (1)–(4), we can imagine that the ball will end up nowhere but the optimal solution to the LP.
If you still remember physics, in order for the ball to stop moving, all forces that apply to it must cancel out. The possible forces are: gravity in the direction of ∇f(x), and a force from constraint wall i in the direction of ∇Gi(x) for all i = 1, ..., m. The magnitude of gravity depends on the mass of the ball, and let's assume it's 1; for i = 1, ..., m, let the magnitude of the force from wall i be denoted by yi. Now, from the physics perspective, the conditions for the ball to stop are:
All forces cancel out:               ∇f(x) + Σ(i=1..m) yi∇Gi(x) = 0n×1   (53)
Ball must stay inside the walls:     Gi(x) ≥ 0,     i = 1, ..., m        (54)
Force magnitudes are nonnegative:    yi ≥ 0,        i = 1, ..., m        (55)
If the ball does not touch a wall,
the force from that wall is zero:    yiGi(x) = 0,   i = 1, ..., m.       (56)
Notice that conditions (53)–(56) are exactly the same as the KKT conditions (49)–(52), which gives an interpretation of the KKT conditions from the physics perspective.
KKT conditions (49)–(52) can be written as

∇f(x) + Σ(i=1..m) yi∇Gi(x) = 0n×1                   (57)
0 ≤ G(x) ⊥ y ≥ 0.                                   (58)
Here ⊥ indicates the complementarity of two vectors. If vectors a ∈ Rⁿ and b ∈ Rⁿ are of the same dimension, then 0 ≤ a ⊥ b ≥ 0 means that: (i) a ≥ 0, (ii) b ≥ 0, and (iii) aᵀb = 0. Two vectors that satisfy condition (iii) are said to be complementary or perpendicular to each other.
In general, for nonlinear programs, KKT conditions may not be necessary or sufficient optimality conditions. For example, solution (x1 = 1, x2 = 0) does not satisfy KKT conditions, but it is optimal to

max  x1
s. t. x2 + (x1 − 1)³ ≤ 0
      x2 ≥ 0.

Solution (x1 = 0.5, x2 = 0.25) satisfies KKT conditions, but it is not optimal to

min  x2
s. t. x1² − x1 + x2 ≥ 0
      0 ≤ x1 ≤ 2.
For linear programs, KKT are both necessary and sufficient conditions of optimality, which means that a solution to an LP is optimal if and only if it satisfies KKT conditions. We now derive the KKT conditions for the standard form linear program

max{cᵀx : Ax ≤ b, x ≥ 0},                           (59)

which can be written in the form of (48) as

max{ f(x) = cᵀx : G(x) = [−Am×n; In×n] x + [bm×1; 0n×1] ≥ 0(m+n)×1 },
then the KKT conditions for a standard form linear program become: there exist two vectors y ∈ Rᵐ and μ ∈ Rⁿ, corresponding to Ax ≤ b and x ≥ 0, respectively, such that the following constraints are satisfied:

c + [−Aᵀ  I] [y; μ] = 0                             (60)

0 ≤ [b − Ax; x] ⊥ [y; μ] ≥ 0.                       (61)
Equation (60) is equivalent to c − Aᵀy + μ = 0, or μ = Aᵀy − c, with which we can substitute μ in (61). Now, (60)–(61) are simplified to:

0 ≤ [b − Ax; x] ⊥ [y; Aᵀy − c] ≥ 0.                 (62)

Condition (62) is necessary and sufficient for the optimality of the standard form LP (59), thus it is also called the optimality condition.
As an exercise, let's prove the optimality of the solution (x1 = 2.4, x2 = 1.4) to LP (1)–(4) using KKT conditions. We plug in the numbers of x, f(x) and G(x); then KKT conditions (62) become: there exists a vector y such that the following constraints hold:

0 ≤ [ 8 − (2.4 + 4 × 1.4) ]     [ y1           ]
    [ 1 − (2.4 − 1.4)     ]  ⊥  [ y2           ]  ≥ 0.
    [ 2.4                 ]     [ y1 + y2 − 2  ]
    [ 1.4                 ]     [ 4y1 − y2 − 3 ]

It is not hard to find that y = [1  1]ᵀ satisfies the above condition, which confirms the optimality of (x1 = 2.4, x2 = 1.4) to LP (1)–(4).
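Condition (62) is easy to check mechanically. The plain-Python sketch below (ours, not from the notes) verifies it for LP (1)-(4) with x = (2.4, 1.4) and y = (1, 1):

```python
A = [[1, 4], [1, -1]]   # constraint matrix of LP (1)-(4)
b = [8, 1]
c = [2, 3]
x = [2.4, 1.4]
y = [1, 1]

# b - Ax (primal slacks) and A'y - c (dual slacks, the mu vector)
slack = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
reduced = [sum(A[i][j] * y[i] for i in range(2)) - c[j] for j in range(2)]

left = slack + x        # must be nonnegative
right = y + reduced     # must be nonnegative
assert all(v >= -1e-9 for v in left)
assert all(v >= -1e-9 for v in right)
assert abs(sum(l * r for l, r in zip(left, right))) < 1e-9  # complementarity
print("(2.4, 1.4) satisfies optimality condition (62)")
```

Every product in the complementarity sum vanishes because each pair has at least one zero member, exactly as in the display above.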
As another exercise, let's find the optimality condition of a nonstandard form LP:

min{bᵀλ : Aᵀλ ≥ c, λ ≥ 0}.                          (63)

LP (63) can be equivalently rewritten in the standard form as follows:

max{−bᵀλ : −Aᵀλ ≤ −c, λ ≥ 0}.                       (64)

Let's relate (λ, −Aᵀ, −c, −b) in (64) to (x, A, b, c) in (59), respectively. We also introduce a new variable η to relate to y in (62). Then the optimality condition for (63) and (64) is:

0 ≤ [−c + Aᵀλ; λ] ⊥ [η; b − Aη] ≥ 0.                (65)

It is interesting to notice that optimality conditions (62) and (65) are equivalent to each other in the sense that if (x, y) is a feasible solution to (62) then (λ = y, η = x) is also a feasible solution to (65), and vice versa. For this reason, we can say that LPs (59) and (63) actually share a same optimality condition if we simply change the notation λ to y in (63):

min{bᵀy : Aᵀy ≥ c, y ≥ 0}.                          (66)

Once we find a pair of feasible solutions (x, y) to the optimality condition (62), then x and y are the optimal solutions to LPs (59) and (66), respectively.
In the optimality condition (62):

0 ≤ [b − Ax; x] ⊥ [y; Aᵀy − c] ≥ 0,

the left hand side is simply saying that x must be feasible to (59), and the right hand side is saying that y must be feasible to (66). These two problems both have many feasible solutions, but what makes the optimal solutions (x, y) optimal is the fact that they also satisfy the complementarity conditions: 0 ≤ b − Ax ⊥ y ≥ 0 and 0 ≤ x ⊥ Aᵀy − c ≥ 0.
5 Duality
We are ready to introduce perhaps the most important concept in linear programming: duality. Two linear programs, like (59) and (66), that share a same optimality condition in a complementary manner are called dual to each other: the decision variable y in (66) is the magnitude of force of the constraint wall for (59), and vice versa. Either one of the LPs can be called the primal problem
and the other is the dual problem. Decision variables of the primal (dual) problem are called primal (dual) variables.
Every linear program has a dual problem, which can be found by reformulating the primal problem in the same form as either (59) or (66), and then simplifying its corresponding dual problem. For example, to find the dual problem to
min{cᵀx : Ax ≥ b},                                  (67)

we first introduce two nonnegative variables x+ and x− to replace the free variable x with x = x+ − x−, and then rewrite (67) in the same form as (66):

min{ [cᵀ  −cᵀ] [x+; x−] : [A  −A] [x+; x−] ≥ b,  [x+; x−] ≥ 0 },      (68)

whose dual problem is clearly

max{ bᵀy : [Aᵀ; −Aᵀ] y ≤ [c; −c],  y ≥ 0 },         (69)

which can be simplified as

max{ bᵀy : Aᵀy = c,  y ≥ 0 }.                       (70)
Now, we have found that the dual to (67) is (70).
The following table summarizes the primal–dual relations for general form LPs:

max  c1x1 + c2x2 + c3x3                min  b1y1 + b2y2 + b3y3

constraints                            variables
a1,1x1 + a1,2x2 + a1,3x3 ≤ b1          y1 ≥ 0
a2,1x1 + a2,2x2 + a2,3x3 = b2          y2 free
a3,1x1 + a3,2x2 + a3,3x3 ≥ b3          y3 ≤ 0

variables                              constraints
x1 ≥ 0                                 a1,1y1 + a2,1y2 + a3,1y3 ≥ c1
x2 ≤ 0                                 a1,2y1 + a2,2y2 + a3,2y3 ≤ c2
x3 free                                a1,3y1 + a2,3y2 + a3,3y3 = c3
We can use this table to find the dual of a general form LP directly. For example,

(P21): min  x1 + 2x2 + 3x3
       s. t. x1 − 3x2 = 5
             2x1 − x2 + 3x3 ≥ 6
             x3 ≤ 4
             x1 ≥ 0
             x2 ≤ 0
             x3 free.

(D21): max  5y1 + 6y2 + 4y3
       s. t. y1 free
             y2 ≥ 0
             y3 ≤ 0
             y1 + 2y2 ≤ 1
             −3y1 − y2 ≥ 2
             3y2 + y3 = 3.
This table can also be used to check the optimality of a solution to a nonstandard form LP. If (i) x satisfies the constraints of the primal problem, (ii) there exists a y that satisfies the constraints of the dual, and (iii) the corresponding constraints and variables are complementary to each other, then (x, y) is optimal. For example, we can check the optimality of (x1* = 5, x2* = 0, x3* = −4/3) and (y1* = −1, y2* = 1, y3* = 0) by observing that (i) x* is primal feasible, (ii) y* is dual feasible, and (iii) (x1 − 3x2 − 5)y1 = (2x1 − x2 + 3x3 − 6)y2 = (x3 − 4)y3 = x1(y1 + 2y2 − 1) = x2(−3y1 − y2 − 2) = x3(3y2 + y3 − 3) = 0.
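This check can also be scripted. The plain-Python sketch below is ours; it takes x* = (5, 0, -4/3) and y* = (-1, 1, 0), the sign choices under which all three conditions hold, and verifies feasibility, complementarity, and that both objectives coincide:

```python
x1, x2, x3 = 5, 0, -4 / 3     # primal solution of (P21)
y1, y2, y3 = -1, 1, 0         # dual solution of (D21)

# (i) primal feasibility
assert x1 - 3 * x2 == 5
assert 2 * x1 - x2 + 3 * x3 >= 6 - 1e-9
assert x3 <= 4 and x1 >= 0 and x2 <= 0

# (ii) dual feasibility
assert y2 >= 0 and y3 <= 0
assert y1 + 2 * y2 <= 1 + 1e-9
assert -3 * y1 - y2 >= 2 - 1e-9
assert abs(3 * y2 + y3 - 3) < 1e-9

# (iii) complementary slackness: each constraint/variable product is zero
assert abs((x1 - 3 * x2 - 5) * y1) < 1e-9
assert abs((2 * x1 - x2 + 3 * x3 - 6) * y2) < 1e-9
assert abs((x3 - 4) * y3) < 1e-9
assert abs(x1 * (y1 + 2 * y2 - 1)) < 1e-9
assert abs(x2 * (-3 * y1 - y2 - 2)) < 1e-9
assert abs(x3 * (3 * y2 + y3 - 3)) < 1e-9

primal_obj = x1 + 2 * x2 + 3 * x3
dual_obj = 5 * y1 + 6 * y2 + 4 * y3
print(primal_obj, dual_obj)  # both equal 1
```

Equal primal and dual objectives here foreshadow the strong duality theorem below.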
The following are some important theorems on duality.
Complementary Slackness: Solutions x and y are optimal to max{cᵀx : Ax ≤ b, x ≥ 0} and min{bᵀy : Aᵀy ≥ c, y ≥ 0}, respectively, if and only if condition (62) is met:

0 ≤ [b − Ax; x] ⊥ [y; Aᵀy − c] ≥ 0.
Weak Duality: If x and y are feasible solutions to max{cᵀx : Ax ≤ b, x ≥ 0} and min{bᵀy : Aᵀy ≥ c, y ≥ 0}, respectively, then cᵀx ≤ bᵀy.

Proof. Since x and y are feasible, we have

0 ≤ (b − Ax)ᵀy = bᵀy − (Ax)ᵀy = bᵀy − yᵀAx,

and

0 ≤ (Aᵀy − c)ᵀx = yᵀAx − cᵀx.

Therefore, cᵀx ≤ yᵀAx ≤ bᵀy.
Strong Duality: If x and y are optimal solutions to max{cᵀx : Ax ≤ b, x ≥ 0} and min{bᵀy : Aᵀy ≥ c, y ≥ 0}, respectively, then cᵀx = bᵀy.

Proof. Since x and y are optimal, by complementary slackness, we have

0 = (b − Ax)ᵀy = bᵀy − (Ax)ᵀy = bᵀy − yᵀAx,

and

0 = (Aᵀy − c)ᵀx = yᵀAx − cᵀx.

Therefore, cᵀx = yᵀAx = bᵀy.
Primal–Dual Possibility Table: Recall that a linear program has three possibilities: finitely optimal, infeasible, or unbounded. The primal–dual pair has nine combinations, but only four of them are possible.

Primal \ Dual       Finitely optimal   Unbounded    Infeasible
Finitely optimal    Possible           Impossible   Impossible
Unbounded           Impossible         Impossible   Possible
Infeasible          Impossible         Possible     Possible
Two examples of primal unbounded and dual infeasible, which are also examples of primal infeasible and dual unbounded when the roles of the two problems are swapped:

max  x1 + x2                     min  y
s. t. x1 ≤ 1                     s. t. y ≥ 1
      x1, x2 ≥ 0.                      0 · y ≥ 1
                                       y ≥ 0.
19
max  3x1 + 4x2                   min  y1 + 3y2
s. t. −x1 + x2 ≤ 1               s. t. −y1 − 2y2 ≥ 3
      −2x1 − x2 ≤ 3                    y1 − y2 ≥ 4
      x1, x2 ≥ 0.                      y1, y2 ≥ 0.
Two examples where both primal and dual are infeasible:

min  x1 + 2x2                    max  y1 + 2y2
s. t. x1 + x2 = 1                s. t. y1 + y2 = 1
      x1 + x2 = 2.                     y1 + y2 = 2.

max  2x1 − x2                    min  −y1 − 2y2
s. t. x1 − x2 ≤ −1               s. t. y1 − y2 ≥ 2
      −x1 + x2 ≤ −2                    −y1 + y2 ≥ −1
      x1, x2 ≥ 0.                      y1, y2 ≥ 0.
For an LP to be unbounded, there must exist two things: (i) a feasible solution x0, satisfying Ax0 ≤ b, x0 ≥ 0, and (ii) a direction Δx, which leads the objective value cᵀ(x0 + θΔx) towards infinity without violating any constraints as the step size θ approaches infinity. This direction is called an extreme ray. If Δx satisfies cᵀΔx > 0, AΔx ≤ 0, Δx ≥ 0, then Δx is called an extreme ray to the LP maxx{cᵀx : Ax ≤ b, x ≥ 0}. In the dual space, if Δy satisfies bᵀΔy < 0, AᵀΔy ≥ 0, Δy ≥ 0, then Δy is called an extreme ray to the LP miny{bᵀy : Aᵀy ≥ c, y ≥ 0}.
Farkas' Lemma: Let A ∈ Rm×n and b ∈ Rm×1 be a matrix and a vector, respectively. Then exactly one of the following two alternatives holds:
(a) There exists some x ≥ 0 such that Ax ≤ b.
(b) There exists some y ≥ 0 such that Aᵀy ≥ 0 and bᵀy < 0.

Proof. (a) true ⇒ (b) false: If (a) is true, then for any y ≥ 0 such that Aᵀy ≥ 0, we have bᵀy ≥ (Ax)ᵀy = xᵀAᵀy ≥ 0, which means that (b) is false.

(a) false ⇒ (b) true: Consider max{0 : Ax ≤ b, x ≥ 0} and its dual min{bᵀy : Aᵀy ≥ 0, y ≥ 0}. If (a) is false, then max{0 : Ax ≤ b, x ≥ 0} is infeasible. Its dual is either unbounded or infeasible. It is easy to see that y = 0 is a feasible solution to the dual, so it must be unbounded, which means that there must exist some y ≥ 0 such that Aᵀy ≥ 0 and bᵀy < 0. Therefore, (b) is true.
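The lemma says that whenever the system Ax ≤ b, x ≥ 0 is infeasible, a vector y certifies it. A tiny plain-Python illustration (ours, with made-up one-dimensional instances): for A = [[1]], b = [1] alternative (a) holds with x = 0; for A = [[1]], b = [−1] the system x ≤ −1, x ≥ 0 is infeasible, and y = [1] is a certificate since Aᵀy = 1 ≥ 0 and bᵀy = −1 < 0.

```python
def holds_a(x, A, b):
    """Alternative (a): x >= 0 and Ax <= b."""
    return all(xi >= 0 for xi in x) and all(
        sum(a * xi for a, xi in zip(row, x)) <= bi for row, bi in zip(A, b))

def holds_b(y, A, b):
    """Alternative (b): y >= 0, A'y >= 0 and b'y < 0."""
    m, n = len(A), len(A[0])
    return (all(yi >= 0 for yi in y)
            and all(sum(A[i][j] * y[i] for i in range(m)) >= 0
                    for j in range(n))
            and sum(bi * yi for bi, yi in zip(b, y)) < 0)

assert holds_a([0], [[1]], [1])          # instance 1: (a) holds at x = 0
assert not holds_a([0], [[1]], [-1])     # instance 2: x = 0 fails...
assert holds_b([1], [[1]], [-1])         # ...and y = [1] certifies infeasibility
print("Farkas certificates check out")
```

The code only verifies given certificates; producing one in general is itself an LP, as the duality-based proof shows.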
A Variation of Farkas' Lemma: Let A ∈ Rm×n and b ∈ Rm×1 be a matrix and a vector, respectively. Then exactly one of the following two alternatives holds:
(a) There exists some x ≥ 0 such that Ax = b.
(b) There exists some y such that Aᵀy ≥ 0 and bᵀy < 0.

Proof. (a) true ⇒ (b) false: If (a) is true, then for any y such that Aᵀy ≥ 0, we have bᵀy = (Ax)ᵀy = xᵀAᵀy ≥ 0, which means that (b) is false.

(a) false ⇒ (b) true: Consider max{0 : Ax = b, x ≥ 0} and its dual min{bᵀy : Aᵀy ≥ 0}. If (a) is false, then max{0 : Ax = b, x ≥ 0} is infeasible. According to the Primal–Dual Possibility Table, its dual is either unbounded or infeasible. It is easy to see that y = 0 is a feasible solution to the dual, so it must be unbounded, which means that there must exist some y such that Aᵀy ≥ 0 and bᵀy < 0. Therefore, (b) is true.
Extended Primal–Dual Possibility Table: Farkas' Lemma enables us to define the primal–dual possibility table in a more revealing manner.

We use ∃x0 or ∃y0 to indicate that a feasible solution exists, and ∄Δx or ∄Δy to indicate that an extreme ray does not exist. Farkas' Lemma basically says that ∃x0 ⇔ ∄Δy, ∃y0 ⇔ ∄Δx, ∄x0 ⇔ ∃Δy, and ∄y0 ⇔ ∃Δx. Therefore, we have the following extended primal–dual possibility table:

primal                                  dual
optimal      ∃x0 + ∄Δx      ∄Δy + ∃y0      optimal
unbounded    ∃x0 + ∃Δx      ∄Δy + ∄y0      infeasible
infeasible   ∄x0 + ∃Δx      ∃Δy + ∄y0      infeasible
infeasible   ∄x0 + ∄Δx      ∃Δy + ∃y0      unbounded
6 Simplex Algorithm
Now that we know what conditions a solution needs to satisfy to be optimal, how do we actually find such a solution? In this section, we will learn about an algorithm called simplex that finds an optimal solution if the LP has one, or determines that the LP is infeasible or unbounded if that is the case. Simplex is perhaps the most important linear programming algorithm, and we are going to study it in great detail.
The basic idea of simplex is based on the observation that the optimal solution to an LP, ifit exists, occurs at a corner of the feasible region. This can be verified with the LP examples wehave seen in the lecture notes or homework examples. Based on this observation, we can find theoptimal solution by (i) starting from a feasible corner point, and (ii) moving to a better cornerpoint until the current one is already optimal. If we cannot find a starting point, then the LP isinfeasible; if we can optimize the objective value to infinity, then the LP is unbounded.
While the idea may sound simple and intuitive, we need to rigorously establish its theoreticalcorrectness.
6.1 What exactly is a corner point?
"Corner point" is a nickname for a well-known concept called basic solution, so let's use "basic solution" instead of "corner point" from now on. A solution x⁰ is a basic solution if it is uniquely determined by its active constraints at equality. An inequality constraint (Ax)_i ≤ b_i is active at x⁰ if it holds at equality: (Ax⁰)_i = b_i. An equality constraint (Ax)_i = b_i is also considered active as long as the equality holds.
To explain the definition of a basic solution, suppose the feasible region is defined by Ax ≤ b, which already includes any nonnegativity constraints x ≥ 0. For any solution x⁰, let I(x⁰) be the index set of its active constraints: I(x⁰) = {i : (Ax⁰)_i = b_i}. By definition, x⁰ satisfies these constraints at equality: (Ax⁰)_{I(x⁰)} = b_{I(x⁰)}. If x⁰ is the only solution to this system of equations, then it is a basic solution. However, if there exists another solution x¹ that also satisfies (Ax¹)_{I(x⁰)} = b_{I(x⁰)}, then x⁰ is not a basic solution.

Recall from linear algebra that a necessary condition for a linear system of equations Ax = b to have a unique solution x ∈ R^n is that there exist n linearly independent rows in matrix A.
Given a finite number of vectors V_1, V_2, ..., V_K ∈ R^n, we say that they are linearly dependent if there exist real numbers a_1, a_2, ..., a_K such that Σ_{i=1}^K |a_i| > 0 and Σ_{i=1}^K a_i V_i = 0. Otherwise, they are called linearly independent.
As an example, we look at LP (1)–(4) and Figure 1 again:

(1): max ζ = 2x1 + 3x2
(2): s.t. x1 + 4x2 ≤ 8
(3):      x1 − x2 ≤ 1
(4):      x1, x2 ≥ 0.

We know intuitively that (x1 = 1, x2 = 0) is a basic solution (because it is a corner point). Now let's check with the definition. There are two active constraints at this point: x1 − x2 ≤ 1 and x2 ≥ 0, and (x1 = 1, x2 = 0) is uniquely determined by these two constraints at equality; therefore, it is a basic solution. On the other hand, (x1 = 2, x2 = 1.5) is not a basic solution, because there is only one active constraint at that point, x1 + 4x2 ≤ 8, and (x1 = 2, x2 = 1.5) is obviously not the only solution determined by that constraint at equality.

Also notice that a basic solution is not required by definition to be feasible. Point (x1 = 8, x2 = 0) is not a feasible solution, but it is a basic solution because it is uniquely determined by two active constraints at equality: x1 + 4x2 ≤ 8 and x2 ≥ 0.
6.2 Is the optimal solution always a basic solution?
Unfortunately, the answer is no. First, some LPs may not even have a basic solution. For example, in max{0 : x1, x2 free} we are maximizing the constant 0 over the entire two-dimensional space with no constraints, and there is no basic solution. Secondly, even if an LP has basic solutions, there may also exist an optimal solution that is not a basic solution. For example, (x1 = 1, x2 = 0) is an optimal solution to min{x2 : 0 ≤ x1 ≤ 2, x2 ≥ 0}, but it is not a basic solution.

With that being said, there is some good news that still validates the idea of simplex: (i) If we write an LP in the standard form max{c^T x : Ax ≤ b, x ≥ 0}, there always exists a basic solution. As a matter of fact, the origin (x = 0) is uniquely determined by the active constraints x ≥ 0 at equality, thus it is a basic solution. (ii) Suppose an LP has at least one basic solution and one optimal solution. It can be proved that if the optimal solution is unique, then it must be a basic solution, and if there are infinitely many optimal solutions, then there exists one that is a basic solution.
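The definition can be checked mechanically: collect the active rows of Ax ≤ b at a candidate point and test whether they have full column rank. A small sketch (the helper `is_basic` is my own, not from the notes) applied to LP (1)–(4):

```python
import numpy as np

# LP (1)-(4) written as Ax <= b, including the nonnegativity constraints.
A = np.array([[1.0, 4.0],    # x1 + 4x2 <= 8
              [1.0, -1.0],   # x1 -  x2 <= 1
              [-1.0, 0.0],   # -x1      <= 0
              [0.0, -1.0]])  # -x2      <= 0
b = np.array([8.0, 1.0, 0.0, 0.0])

def is_basic(point, tol=1e-9):
    """A point is a basic solution iff its active constraints
    determine it uniquely, i.e. the active rows have rank n."""
    point = np.asarray(point, float)
    active = np.abs(A @ point - b) <= tol
    return np.linalg.matrix_rank(A[active]) == len(point)

print(is_basic([1, 0]))      # True: two active rows of rank 2
print(is_basic([2, 1.5]))    # False: only one active constraint
print(is_basic([8, 0]))      # True, although the point is infeasible
```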
6.3 How to find basic solutions?
In the simplex context, it is oftentimes more convenient to consider a nonstandard form LP

max{ζ = c^T x : Ax + w = b, x ≥ 0, w ≥ 0},  (71)

which is equivalently reformulated from the standard form LP max{c^T x : Ax ≤ b, x ≥ 0} by simply introducing a new variable w ∈ R^m, called the slack variable, to make the inequality constraint Ax ≤ b an equality one, Ax + w = b.

We now write (71) in the following matrix form:

max ζ = c̄^T x̄  (72)
s.t. Āx̄ = b  (73)
x̄ ≥ 0,  (74)

where c̄ = [c; 0_{m×1}], Ā = [A, I_{m×m}], and x̄ = [x; w] ∈ R^{n+m}.
Since the dimension of x̄ is (n+m)×1, we need n+m linearly independent active constraints to uniquely determine a basic solution. We already have m rows in Āx̄ = b, so we need at least n of the constraints in x̄ ≥ 0 to hold at equality.

Define N as the indices of constraints in x̄ ≥ 0 that are set to hold at equality, and B as the indices of the other constraints in x̄ ≥ 0. Such a pair (B, N) is an exclusive and exhaustive partition of the set {1, 2, ..., n+m}. For example, if we set x̄_i = 0 for all i = 1, ..., n and x̄_i ≥ 0 for all i = n+1, ..., n+m, then the partition is (B = {n+1, ..., n+m}, N = {1, ..., n}).
The above conditions are only necessary for a basic solution. To find the sufficient condition for a basic solution, we rewrite (72)–(74) using the definition of N and B:

max ζ = c_B^T x_B + c_N^T x_N  (75)
s.t. A_B x_B + A_N x_N = b  (76)
x_B ≥ 0, x_N ≥ 0,  (77)

where A_B is the collection of columns in Ā whose indices are in the set B, and x_N and c_N are the collections of elements in x̄ and c̄, respectively, whose indices are in the set N. Here, the value x_N = 0 is uniquely determined. To make sure that the value of x_B is also uniquely determined, we should guarantee that the equation A_B x_B + A_N x_N = A_B x_B = b has a unique solution, which requires that the matrix A_B ∈ R^{m×m} be invertible. This can be achieved by choosing m linearly independent columns in Ā as the set B.
Now we have the necessary and sufficient conditions for a basic partition:

The m elements in the set B should be chosen such that A_B is invertible, and  (78)
the n elements in the set N are then determined by N = {1, ..., n+m}\B.  (79)

This partition uniquely determines a basic solution (x_B = A_B^{-1} b, x_N = 0).
We define a partition (B, N) that satisfies (78) and (79) as a basic partition; B and N in a basic partition are called the basis and nonbasis, respectively; and the variables x_B and x_N are called basic variables and nonbasic variables, respectively.
The relationship between a basic partition and a basic solution is that a basic partition (B, N) uniquely determines a basic solution, while for any basic solution x, there exists (uniquely or not) a basic partition that determines this basic solution x. One example of different basic partitions all uniquely determining the same basic solution is the following. Suppose

A = [1 2 1 0; 3 6 0 1] and b = [1; 3];

then both (B = {1, 3}, N = {2, 4}) and (B = {1, 4}, N = {2, 3}) uniquely determine the same basic solution x = [1 0 0 0]^T.

Since a basic partition (B, N) uniquely determines one basic solution, the number of basic solutions is bounded by the number of ways (B, N) can be selected, which is no more than (n+m)!/(n!m!).
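The correspondence can be checked by brute force: enumerate all m-subsets of columns, keep the invertible ones as bases per (78), and solve for x_B. A sketch for the 2×4 example above:

```python
from itertools import combinations
import numpy as np

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 6.0, 0.0, 1.0]])
b = np.array([1.0, 3.0])
n_plus_m = A.shape[1]

basic_solutions = {}                     # basis (1-indexed) -> basic solution
for cols in combinations(range(n_plus_m), 2):
    A_B = A[:, cols]
    if abs(np.linalg.det(A_B)) < 1e-9:   # (78) fails: A_B not invertible
        continue
    x = np.zeros(n_plus_m)
    x[list(cols)] = np.linalg.solve(A_B, b)
    basic_solutions[tuple(i + 1 for i in cols)] = np.round(x, 9)

for basis, x in basic_solutions.items():
    print(basis, x)
# Both B = (1, 3) and B = (1, 4) yield the same basic solution [1 0 0 0].
```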
The set of partitions of {1, 2, ..., n+m} can be divided into the following regions:

A+B+C+D: all possible partitions;
B+C+D: basic partitions, which satisfy (78) and (79);
C+D: feasible basic partitions, with x_B = A_B^{-1} b ≥ 0;
D: optimal basic partitions.
It is relatively easy to enter region B. The following subsections discuss how to enter region C and then region D.
6.4 How to find an initial feasible basic solution to start from? What if the LPis infeasible?
The way the simplex algorithm proceeds is to start from a feasible basic solution, move from one feasible basic solution to another, better one, and finally stop at a feasible basic solution that is optimal to the LP. (Let's use "fbs" to abbreviate "feasible basic solution".) We are now ready to discuss how to find an initial fbs to start from.
It is not hard to observe that (B = {n+1, ..., n+m}, N = {1, ..., n}) is a basic partition for (71), which corresponds to

B = {n+1, ..., n+m}, N = {1, ..., n},
c_B = 0_{m×1}, c_N = c,
A_B = I_{m×m}, A_N = A,
x_B = b, x_N = 0_{n×1}.

The partition (B, N) is indeed a basic partition because A_B = I_{m×m} is invertible. If b ≥ 0, then (x = 0, w = b) is also a feasible basic solution.
However, if b_i < 0 for some i, then x̄ ≥ 0 is violated, and the basic solution is not feasible. To obtain an fbs in such a case, we need to use a little trick called the big-M method. Define K = {i : b_i < 0, i = 1, ..., m}, and let k = |K|. Now we consider a new problem

max{ ζ = c^T x − M Σ_{i=1}^k t_i : Ax + w + Ht = b; x, w, t ≥ 0 }.  (80)

Here M is an extremely large finite constant, the vector t ∈ R^{k×1} is a new variable called the artificial variable, and the matrix H ∈ R^{m×k} is defined as

H_{i,j} = −1, if i = K(j); 0, otherwise.
For example, consider the following LP as an instance of (71):

max ζ = 5x1 + 4x2 + 3x3
s.t. 2x1 + 3x2 + x3 + w4 = 5
−4x1 − x2 − 2x3 + w5 = −11
−3x1 − 4x2 − 2x3 + w6 = −8
x1, x2, x3, w4, w5, w6 ≥ 0.

Then (80) corresponds to the following formulation (with M = 1000):

max ζ = 5x1 + 4x2 + 3x3 − 1000t7 − 1000t8
s.t. 2x1 + 3x2 + x3 + w4 = 5
−4x1 − x2 − 2x3 + w5 − t7 = −11
−3x1 − 4x2 − 2x3 + w6 − t8 = −8
x1, x2, x3, w4, w5, w6, t7, t8 ≥ 0.
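Building K, H, and the big-M data from b is mechanical. A small sketch (my own illustration) for a right-hand side with two negative entries:

```python
import numpy as np

b = np.array([5.0, -11.0, -8.0])
m = len(b)
K = [i for i in range(m) if b[i] < 0]   # rows needing an artificial variable
k = len(K)

H = np.zeros((m, k))
for j, i in enumerate(K):               # H[i, j] = -1 iff i = K(j)
    H[i, j] = -1.0

print(K)   # [1, 2] (0-indexed rows with b_i < 0)
print(H)
```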
In LP (80), since there are k artificial variables but no additional constraints, the dimensions of its partition should be |N| = n + k and |B| = m. The way we initialize this partition is to add all the artificial variables to B and then move the slack indices that correspond to b_i < 0 from B to N. It is not hard to see that the basic partition (N = {1, ..., n} ∪ {n + K}, B = {1, ..., n+m+k}\N) uniquely determines an fbs (x_N = 0, x_B = A_B^{-1} b = |b|). We can use this fbs as a starting point to solve (80) by following the rest of the simplex steps. In the format of (72)–(74), (80) has

c̄ = [c; 0_{m×1}; −M·1_{k×1}], Ā = [A, I_{m×m}, H_{m×k}], and x̄ = [x; w; t] ∈ R^{n+m+k}.

The relation between optimal solutions to (71) and (80) is given in the following propositions.
Proposition 1. Solution (x*, w*) is optimal to (71) if and only if there exists a finitely large M such that (x*, w*, t = 0) is optimal to (80).

Proof. (⇒): Prove by construction. Let y* be the optimal solution to the dual of (71), which is min{ζ = b^T y : A^T y ≥ c, y ≥ 0}. Set M = max_{i=1,...,m} y*_i, and we know that y* is also feasible to the dual of (80), which is min{ζ = b^T y : A^T y ≥ c, H^T y ≥ −M·1_{k×1}; y ≥ 0}. Since (x*, w*, t = 0) and y* are respectively feasible to (80) and its dual with the same objective value, they are also respectively optimal.

(⇐): Prove by contradiction. Suppose (x*, w*, t = 0) is optimal to (80) but (x*, w*) is not optimal to (71); then there must exist a feasible solution (x̃, w̃) to (71) with c^T x̃ > c^T x*. However, this implies that (x̃, w̃, t = 0) is a better solution than (x*, w*, t = 0) to (80), because (x̃, w̃, t = 0) is feasible to (80) and c^T x̃ > c^T x*. This contradicts the assumption that (x*, w*, t = 0) is an optimal solution to (80).
In the simplex algorithm, there is a way to make sure that M is sufficiently large. So if we solve (80) with the simplex algorithm and get an optimal solution (x*, w*, t*) with t*_i > 0 for some i, then (71) is infeasible.

Proposition 2. If, for a sufficiently large M, (80) possesses an optimal solution (x*, w*, t*) with t*_i > 0 for some i, then (71) is infeasible.

Proof. By Proposition 1, (71) does not have an optimal solution. Therefore, it suffices to show that (71) is not unbounded. Let y* be the optimal dual solution to (80); then it is feasible to the dual of (71), which means that (71) is not unbounded.
If (80) is unbounded, then (71) could be either infeasible or unbounded, and we need to solve the following LP to verify:

min{ ζ = Σ_{i=1}^k t_i : Ax + w + Ht = b; x, w, t ≥ 0 }.  (81)

If t = 0 is an optimal solution to (81), then (71) is unbounded; otherwise (71) is infeasible.
The possibilities for (80) and their implications for (71) are summarized as follows:

- (80) is optimal with (x*, w*, t* = 0) ⇒ (71) is optimal with (x*, w*).
- (80) is optimal with (x*, w*, t*) and t*_i > 0 for some i ⇒ (71) is infeasible.
- (80) is unbounded and t = 0 is optimal to (81) ⇒ (71) is unbounded.
- (80) is unbounded and t = 0 is not optimal to (81) ⇒ (71) is infeasible.
LP (80) cannot be infeasible, because (x_N = 0, x_B = |b|) is a feasible solution to (80).
The following diagram provides an overview of the Simplex algorithm. Here we refer to thethree LPs (71), (80), and (81) as LP0, LP1, and LP2, respectively.
[Flowchart (overview): if b ≥ 0, apply simplex to LP0 directly, looping through the optimality check, the unboundedness check, and the improvement step; if b has negative elements, solve LP1 by simplex. If LP1 is optimal with t = 0, then LP0 is optimal; if LP1 is optimal with t ≠ 0, then LP0 is infeasible; if LP1 is unbounded, solve LP2: t = 0 optimal to LP2 means LP0 is unbounded, t ≠ 0 means LP0 is infeasible.]

LP0 may be optimal, unbounded, or infeasible.
LP1 may be optimal or unbounded, never infeasible.
LP2 must be optimal, never unbounded or infeasible.
6.5 How to tell if the current fbs is optimal or not?
One way to check the optimality of a solution is to check the optimality condition (62). In the simplex algorithm, there is a more convenient way to check optimality, by reformulating (75)–(77) as:

max ζ = c_B^T A_B^{-1} b + (c_N^T − c_B^T A_B^{-1} A_N) x_N  (82)
s.t. x_B = A_B^{-1} b − A_B^{-1} A_N x_N  (83)
x_B ≥ 0, x_N ≥ 0.  (84)

Equations (82) and (83) are called a dictionary:

ζ = c_B^T A_B^{-1} b + (c_N^T − c_B^T A_B^{-1} A_N) x_N
x_B = A_B^{-1} b − A_B^{-1} A_N x_N.
The term (c_N^T − c_B^T A_B^{-1} A_N) is called the reduced cost.
Proposition 3. In (82), for a given feasible basic partition (B, N), if we have

(c_N^T − c_B^T A_B^{-1} A_N) ≤ 0,

then the fbs (x_B = A_B^{-1} b, x_N = 0) is optimal to (82)–(84).

Proof. Prove by contradiction. Suppose (x_B = A_B^{-1} b, x_N = 0) is not optimal to (82)–(84), and there exists another fbs x̃ with ζ(x̃) > ζ(x). This implies that

ζ(x̃) − ζ(x) = (c_N^T − c_B^T A_B^{-1} A_N) x̃_N − (c_N^T − c_B^T A_B^{-1} A_N) x_N = (c_N^T − c_B^T A_B^{-1} A_N) x̃_N > 0,

which is impossible since (c_N^T − c_B^T A_B^{-1} A_N) ≤ 0 and x̃_N ≥ 0.
Notice that a nonpositive reduced cost is a sufficient but not necessary condition for the optimality of an fbs. For example, consider the following LP:

max ζ = x1 + 2x2 + 3x3
s.t. x1 − 3x3 + w4 = 0
7x1 + 2x2 + 5x3 + w5 = 1
x1, x2, x3, w4, w5 ≥ 0.

The feasible basic partition (B¹ = {1, 2}, N¹ = {3, 4, 5}) determines the following dictionary:

ζ = 1 − 20x3 + 6w4 − w5
x1 = 3x3 − w4
x2 = 0.5 − 13x3 + 3.5w4 − 0.5w5,

which gives the fbs (x1 = 0, x2 = 0.5, x3 = 0). Although the reduced cost does contain a positive term, this fbs is actually optimal. To see this, consider another feasible basic partition (B² = {2, 4}, N² = {1, 3, 5}), which determines the following dictionary:

ζ = 1 − 6x1 − 2x3 − w5
x2 = 0.5 − 3.5x1 − 2.5x3 − 0.5w5
w4 = −x1 + 3x3.

It gives the same fbs (x1 = 0, x2 = 0.5, x3 = 0). Since it has a nonpositive reduced cost, we know this fbs is optimal.
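The reduced costs for the two partitions above can be recomputed directly from the formula (c_N^T − c_B^T A_B^{-1} A_N). A sketch (0-indexed columns):

```python
import numpy as np

# max x1 + 2x2 + 3x3 with slacks w4, w5 (columns 4 and 5, 1-indexed).
c = np.array([1.0, 2.0, 3.0, 0.0, 0.0])
A = np.array([[1.0, 0.0, -3.0, 1.0, 0.0],
              [7.0, 2.0, 5.0, 0.0, 1.0]])
b = np.array([0.0, 1.0])

def reduced_cost(B, N):
    """Reduced cost c_N^T - c_B^T A_B^{-1} A_N for a basic partition."""
    AB_inv = np.linalg.inv(A[:, B])
    return c[N] - c[B] @ AB_inv @ A[:, N]

# Partition B1 = {1,2}: a positive entry, yet the fbs is optimal.
print(reduced_cost([0, 1], [2, 3, 4]))   # [-20.   6.  -1.]
# Partition B2 = {2,4}: same fbs, nonpositive reduced cost certifies optimality.
print(reduced_cost([1, 3], [0, 2, 4]))   # [-6. -2. -1.]
```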
6.6 How to find a better fbs if the current one is not optimal?
Because of the close relation between an fbs and a feasible basic partition (B, N), the search for a better fbs (and eventually the optimal one) is equivalent to the search for a better feasible basic partition. The way simplex updates a feasible basic partition is by switching one pair of indices between the current basis B and nonbasis N at a time. If the current basic partition is (B⁰, N⁰), then we select an i from B⁰ and a j from N⁰, and update the basic partition as (B¹ = B⁰\{i} ∪ {j}, N¹ = N⁰\{j} ∪ {i}). The variable x_j is called the entering variable, because it will enter the basis. Similarly, x_i is called the leaving variable. Geometrically, such an update means a move from an fbs to an adjacent one. Of course, we need to make sure that the new fbs is no worse than the current one in terms of the objective value.

Now let's assume that we have obtained a feasible basic partition (B, N) which is not optimal; then we will have to update the basic partition by selecting an entering variable and a leaving variable. The rule for selecting the entering and leaving variables is called a pivoting rule. There are various pivoting rules, one of which, called Bland's rule, is introduced below:
Entering variable x_j: choose j = min{ j ∈ N : (c_N^T − c_B^T A_B^{-1} A_N)_j > 0 }.  (85)

Leaving variable x_i: choose i = min{ i ∈ argmax_{i∈B} (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i }.  (86)

After the entering and leaving variables are chosen, we get an updated partition. Bland's rule ensures that the new partition is a better feasible basic partition. The calculation for determining the leaving variable is called the ratio test, because we are trying to find the largest ratio (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i. The i that achieves the largest ratio is said to be the winner of the ratio test, and x_i will be the leaving variable. If there is a tie in the ratio test, the smallest winner i will be selected.
The rationale for Bland's rule is explained in the following example:

max 5x1 + 4x2 + 3x3  (87)
s.t. 2x1 + 3x2 + x3 ≤ 5  (88)
4x1 + x2 + 2x3 ≤ 11  (89)
3x1 + 4x2 + 2x3 ≤ 8  (90)
x1, x2, x3 ≥ 0.  (91)

We start by introducing new variables w4, w5, w6 to reformulate (88)–(90) into equality constraints:

max 5x1 + 4x2 + 3x3  (92)
s.t. 2x1 + 3x2 + x3 + w4 = 5  (93)
4x1 + x2 + 2x3 + w5 = 11  (94)
3x1 + 4x2 + 2x3 + w6 = 8  (95)
x1, x2, x3, w4, w5, w6 ≥ 0.  (96)
Since the right-hand-side values are all positive, the first fbs is easy to find: x1 = x2 = x3 = 0, w4 = 5, w5 = 11, w6 = 8. In the context of (82)–(84),
B = {4, 5, 6}, N = {1, 2, 3},
c_B = [0, 0, 0]^T, c_N = [5, 4, 3]^T,
A_B = I_{3×3}, A_N = [2 3 1; 4 1 2; 3 4 2],
x_B = [5, 11, 8]^T, x_N = [0, 0, 0]^T.
The dictionary is

ζ = 5x1 + 4x2 + 3x3
w4 = 5 − 2x1 − 3x2 − x3
w5 = 11 − 4x1 − x2 − 2x3
w6 = 8 − 3x1 − 4x2 − 2x3,

and in matrix form, it is

ζ = 0 + [5 4 3][x1 x2 x3]^T
[w4; w5; w6] = [5; 11; 8] − [2 3 1; 4 1 2; 3 4 2][x1; x2; x3].
Since (c_N^T − c_B^T A_B^{-1} A_N) = [5 4 3], the current fbs is not optimal, because we can improve the objective value by increasing the values of x1, x2, x3 from zero to positive. According to Bland's rule (85), we choose j = 1, thus x1 will enter the basis. But by how much can x1 increase from zero? This is limited by the equation

[w4; w5; w6] = [5; 11; 8] − [2 3 1; 4 1 2; 3 4 2][x1; x2; x3]

in the dictionary, because changing x1 will affect the values of w4, w5, w6 through the column [2; 4; 3], and these values should stay nonnegative. Therefore, the new value of x1 will be set to the largest number such that the constraints w4, w5, w6 ≥ 0 are still satisfied. This is the reason for Bland's rule (86). We choose the smallest i such that

i ∈ argmax_{i∈B} (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i = argmax{2/5, 4/11, 3/8},

so i = 4. The basic partition is updated as B = {1, 5, 6}, N = {4, 2, 3}. If we repeat this procedure,
which is called an iteration in the simplex algorithm, we get the following dictionaries.

Second iteration:

ζ = 12.5 − 2.5w4 − 3.5x2 + 0.5x3
x1 = 2.5 − 0.5w4 − 1.5x2 − 0.5x3
w5 = 1 + 2w4 + 5x2
w6 = 0.5 + 1.5w4 + 0.5x2 − 0.5x3

Third iteration:

ζ = 13 − w4 − 3x2 − w6
x1 = 2 − 2w4 − 2x2 + w6
w5 = 1 + 2w4 + 5x2
x3 = 1 + 3w4 + x2 − 2w6

After the third iteration, (c_N^T − c_B^T A_B^{-1} A_N) = [−1 −3 −1] < 0, thus we know that the current basic partition B = {1, 5, 3}, N = {4, 2, 6} is optimal, and so is the current fbs x*1 = 2, x*2 = 0, x*3 = 1, w*4 = 0, w*5 = 1, w*6 = 0.
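The three iterations above can be reproduced with a small revised-simplex sketch. This is my own implementation, not the notes' pseudocode, and it uses the conventional minimum-ratio form of the ratio test, which is equivalent to the maximum-ratio form in (86):

```python
import numpy as np

def simplex(A, b, c, tol=1e-9):
    """Solve max c@x s.t. A@x <= b, x >= 0 (assuming b >= 0),
    pivoting over basic partitions (B, N) with Bland's rule."""
    m, n = A.shape
    A_bar = np.hstack([A, np.eye(m)])        # [A, I]: slack variables w
    c_bar = np.concatenate([c, np.zeros(m)])
    B = list(range(n, n + m))                # start with all slacks basic
    N = list(range(n))
    while True:
        AB_inv = np.linalg.inv(A_bar[:, B])
        xB = AB_inv @ b
        y = c_bar[B] @ AB_inv                # simplex multipliers
        reduced = c_bar[N] - y @ A_bar[:, N]
        enter = [j for j, r in zip(N, reduced) if r > tol]
        if not enter:                        # all reduced costs <= 0: optimal
            x = np.zeros(n + m)
            x[B] = xB
            return x[:n], c @ x[:n]
        j = min(enter)                       # Bland: smallest entering index
        d = AB_inv @ A_bar[:, j]
        ratios = [(xB[k] / d[k], B[k]) for k in range(m) if d[k] > tol]
        if not ratios:
            raise ValueError("LP is unbounded")
        t = min(r for r, _ in ratios)
        i = min(idx for r, idx in ratios if abs(r - t) <= tol)  # smallest winner
        B[B.index(i)] = j
        N[N.index(j)] = i

A = np.array([[2.0, 3.0, 1.0], [4.0, 1.0, 2.0], [3.0, 4.0, 2.0]])
b = np.array([5.0, 11.0, 8.0])
c = np.array([5.0, 4.0, 3.0])
x, zeta = simplex(A, b, c)
print(x, zeta)   # x is approximately [2, 0, 1] with zeta = 13
```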
6.7 How to identify an unbounded LP?
We can identify an unbounded LP when we use Bland's rule (86) to determine the leaving variable and the maximum ratio max_{i∈B} (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i is not positive.

The possibilities of the ratio (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i are summarized in the following table, where +1 and −1 represent generic finite positive and negative numbers, respectively.

case | (A_B^{-1} b)_i | (A_B^{-1} A_N)_{i,j} | ratio     | updated x_j | note
1    | 0              | 0                    | undefined | +∞          | unbounded
2    | 0              | +1                   | +∞        | 0           | degenerate
3    | 0              | −1                   | −∞        | +∞          | unbounded
4    | +1             | 0                    | 0         | +∞          | unbounded
5    | +1             | +1                   | +1        | +1          | new basic partition
6    | +1             | −1                   | −1        | +∞          | unbounded

There are six cases in this table. The second and third columns come from the dictionary

ζ = c_B^T A_B^{-1} b + (c_N^T − c_B^T A_B^{-1} A_N) x_N
x_B = A_B^{-1} b − A_B^{-1} A_N x_N;

the fourth column is the ratio test result in Bland's rule (86); the fifth column calculates the largest value of x_j such that the constraint x_B ≥ 0 still holds; and the sixth column explains the implication if the corresponding x_j becomes the entering variable. If case 1, 3, 4, or 6 is the winner of the ratio test, then we claim the LP is unbounded.
6.8 Will the algorithm ever terminate?
If appropriate pivoting rules are used, the algorithm will terminate finitely. First of all, as long asthe numbers of variables and constraints are finite, there is only a finite number of basic solutions,so the algorithm does not need to visit infinitely many fbss to reach the optimal one.
Proposition 4. If the dimensions of the matrix A in (71) are finite, then the number of basic solutions to (71) is finite.

Proof. Since any basic solution can be determined by a basic partition (B, N), the number of basic solutions is bounded by the number of ways (B, N) can be selected, which is no more than (n+m)!/(n!m!). Therefore, the number of basic solutions is finite.
Secondly, it can be proved (refer to the text for the proof) that if Bland's rule is used, the algorithm will terminate in finitely many iterations. Under some other pivoting rules (for example, the largest coefficient rule: choose j = min{ j ∈ argmax_{j∈N} (c_N^T − c_B^T A_B^{-1} A_N)_j > 0 } for the entering variable, and choose i = min{ i ∈ argmax_{i∈B} (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i } for the leaving variable), however, the algorithm could get stuck in a so-called cycle and never terminate. A cycle occurs when the algorithm moves to a basic partition which has been visited before. For example, (B⁰, N⁰) → (B¹, N¹) → (B², N²) → ... → (Bᵏ, Nᵏ) → (B¹, N¹) → (B², N²) → .... A cycling example using the largest coefficient rule can be found in the text on page 31.
The reason a cycle occurs is degeneracy, which occurs when the updated basic partition determines the same fbs as the previous basic partition does. For example, in this problem

max 10x1 − 57x2 − 9x3 − 24x4
s.t. 0.5x1 − 5.5x2 − 2.5x3 + 9x4 + w5 = 0
0.5x1 − 1.5x2 − 0.5x3 + x4 + w6 = 0
x1 + w7 = 1
x1, x2, x3, x4, w5, w6, w7 ≥ 0,

A = [0.5 −5.5 −2.5 9 1 0 0; 0.5 −1.5 −0.5 1 0 1 0; 1 0 0 0 0 0 1],
,the following basic partitions all determine a same fbs x1 = 0, x2 = 0, x3 = 0, x4 = 0, w5 = 0, w6 =0, w7 = 1:
(B1 = {5, 6, 7},N 1 = {1, 2, 3, 4})(B2 = {1, 6, 7},N 2 = {5, 2, 3, 4})(B3 = {1, 2, 7},N 3 = {5, 6, 3, 4})(B4 = {3, 2, 7},N 4 = {5, 6, 1, 4})(B5 = {3, 4, 7},N 5 = {5, 6, 1, 2})(B6 = {5, 4, 7},N 6 = {3, 6, 1, 2}).
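That all six partitions determine the same (degenerate) fbs can be verified by solving A_B x_B = b for each listed basis. A sketch:

```python
import numpy as np

A = np.array([[0.5, -5.5, -2.5, 9.0, 1.0, 0.0, 0.0],
              [0.5, -1.5, -0.5, 1.0, 0.0, 1.0, 0.0],
              [1.0,  0.0,  0.0, 0.0, 0.0, 0.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])

bases = [[5, 6, 7], [1, 6, 7], [1, 2, 7], [3, 2, 7], [3, 4, 7], [5, 4, 7]]
solutions = []
for B in bases:
    cols = [i - 1 for i in B]                 # convert to 0-indexed columns
    x = np.zeros(7)
    x[cols] = np.linalg.solve(A[:, cols], b)
    solutions.append(x)
    print(B, x)
# Every basis determines the same degenerate basic solution:
# x1 = x2 = x3 = x4 = w5 = w6 = 0, w7 = 1.
```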
From another perspective, a solution x to (72)–(74) is degenerate if there are more than n+m active constraints at x. This implies that not only x_N = 0, but also there exists some i ∈ B such that x_i = 0. In the dictionary, we can tell that degeneracy occurs if (A_B^{-1} b)_i = 0 for some i ∈ B. In the above example, the dictionary is

ζ = 0 + [10 −57 −9 −24][x1 x2 x3 x4]^T
[w5; w6; w7] = [0; 0; 1] − [0.5 −5.5 −2.5 9; 0.5 −1.5 −0.5 1; 1 0 0 0][x1; x2; x3; x4].

The entering variable is clearly x1, which should move from nonbasis to basis in the updated basic partition. However, there are zeros in the constant vector on the right-hand side of the second equation in the dictionary, which indicates that the current solution x1 = 0, x2 = 0, x3 = 0, x4 = 0, w5 = 0, w6 = 0, w7 = 1 is a degenerate one. As a result, x1 has to stay at zero in the basis, since otherwise w5 would become negative due to the constraint w5 = 0 − 0.5x1 + 5.5x2 + 2.5x3 − 9x4 ≥ 0 (with the other nonbasic variables held at zero, w5 = −0.5x1).
Using Bland's rule, we may also move from one basic partition to another that determines the same fbs, but we will never return to a basic partition that has been visited before. Since there is only a finite number of basic partitions, we will eventually move away from the degenerate fbs towards the optimal basic partition.
6.9 How efficient is the simplex algorithm?
Simplex updates the fbs iteratively to find the optimal solution, thus its efficiency depends on
how many iterations are needed. The total number of basic solutions is bounded by(n+mm
)=
(n+m)!n!m! . If m = n, then this number becomes
(2nn
). It can be proved that 12n2
2n (
2nn
) 22n,
which means that the number of basic solutions increases exponentially with the size of the problem (number of variables and constraints). Some pivoting rules could cause cycling and neverterminate; even for existing noncycling rules, people have found instances (refer to Section 4.4 ofVanderbei for an example) where the simplex algorithm will have to visit every single basic solutionbefore it finds the optimal one. However, it is an open question whether there exists a pivoting rulethat will never require an exponential number of iterations.
There does exist an algorithm called the ellipsoid method that can solve linear programs in polynomial time (it never requires exponentially many iterations). Theoretically, the ellipsoid method has a better worst-case efficiency than the simplex algorithm; however, the practical performance of the simplex algorithm is much better than that of the ellipsoid method, and that is one of the reasons the simplex algorithm is more widely used.
6.10 Summary
The simplex algorithm is summarized with the following steps, as illustrated in Figure 5.
Step 1 Reformulation: Write the LP in standard form max{ζ = c^T x : Ax ≤ b, x ≥ 0}, reformulate it as max{ζ = c^T x : Ax + w = b, x ≥ 0, w ≥ 0}, and then go to Step 2.
Step 2 Initialization:
(2a) If b ≥ 0, then reformulate the problem as

max{ζ = c̄^T x̄ : Āx̄ = b, x̄ ≥ 0},

where

c̄ = [c; 0_{m×1}], Ā = [A, I_{m×m}], and x̄ = [x; w] ∈ R^{n+m}.

The initial basic partition is (B = {n+1, ..., n+m}, N = {1, ..., n}), and the initial fbs is (x_B = A_B^{-1} b = b, x_N = 0). Go to Step 3.
(2b) If b_i < 0 for some i, then use the big-M method to find the initial basic partition and the initial fbs. Define K = {i : b_i < 0, i = 1, ..., m}, and let k = |K|. Reformulate the problem as

max{ζ = c̄^T x̄ : Āx̄ = b, x̄ ≥ 0},

where

c̄ = [c; 0_{m×1}; −M·1_{k×1}], Ā = [A, I_{m×m}, H], and x̄ = [x; w; t] ∈ R^{(n+m+k)×1}.

When compared with other finite constants finitely many times, M is always assumed to be bigger. The matrix H ∈ R^{m×k} is defined as

H_{i,j} = −1, if i = K(j); 0, otherwise.

The initial basic partition is (N = {1, ..., n} ∪ {n+K(1), ..., n+K(k)}, B = {1, ..., n+m+k}\N), and the initial fbs is (x_B = A_B^{-1} b = |b|, x_N = 0). Go to Step 3.
Step 3 Optimality check:

(3a) If (c_N^T − c_B^T A_B^{-1} A_N)_i > 0 for some i ∈ N, then go to Step 4.

(3b) If (c_N^T − c_B^T A_B^{-1} A_N) ≤ 0 and the big-M method was not used in Step 2, then stop; the optimal solution is (x*_B = A_B^{-1} b, x*_N = 0), with ζ* = c_B^T A_B^{-1} b.

(3c) If (c_N^T − c_B^T A_B^{-1} A_N) ≤ 0, the big-M method was used in Step 2, and x*_i = 0 for all i = n+m+1, ..., n+m+k, then stop; the optimal solution is (x*_B = A_B^{-1} b, x*_N = 0), with ζ* = c_B^T A_B^{-1} b.

(3d) If (c_N^T − c_B^T A_B^{-1} A_N) ≤ 0, the big-M method was used in Step 2, and x*_i > 0 for some i ∈ {n+m+1, ..., n+m+k}, then stop; the LP is infeasible.
Step 4 Improvement: Choose

j = min{ j ∈ N : (c_N^T − c_B^T A_B^{-1} A_N)_j > 0 },

and

i = min{ i ∈ argmax_{i∈B} (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i }.

(4a) If max_{i∈B} (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i is +∞ or finite positive, then update the basic partition and the fbs,

(B = B\{i} ∪ {j}, N = N\{j} ∪ {i}),
(x_B = A_B^{-1} b, x_N = 0),

and go back to Step 3.

(4b) If max_{i∈B} (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i is −∞, 0, finite negative, or undefined (0/0), and the big-M method was not used in Step 2, then stop; the LP is unbounded.

(4c) If max_{i∈B} (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i is −∞, 0, finite negative, or undefined (0/0), and the big-M method was used in Step 2, then solve the feasibility problem (81). If t = 0 is an optimal solution to (81), then the original LP is unbounded.

(4d) If max_{i∈B} (A_B^{-1} A_N)_{i,j} / (A_B^{-1} b)_i is −∞, 0, finite negative, or undefined (0/0), and the big-M method was used in Step 2, then solve the feasibility problem (81). If t = 0 is not an optimal solution to (81), then the original LP is infeasible.
Figure 5: The simplex algorithm diagram
Let's look at an example of using the simplex algorithm to solve an LP.
max ζ = 10x1 + 12x2 + 12x3  (97)
s.t. x1 + x2 + x3 ≥ 1
x1 + 2x2 + 2x3 ≤ 20
2x1 + x2 + 2x3 ≤ 20
2x1 + 2x2 + x3 ≤ 20
x1, x2, x3 ≥ 0.
As illustrated in Figure 6, the feasible region is the polytope with corner points {A=(1,0,0), B=(0,1,0), E=(0,0,1), F=(10,0,0), C=(0,10,0), G=(0,0,10), D=(4,4,4)}.
Step 1 (the ≥ constraint is first rewritten as −x1 − x2 − x3 ≤ −1):

max ζ = [10 12 12 0 0 0 0][x1 x2 x3 w4 w5 w6 w7]^T
s.t. [−1 −1 −1 1 0 0 0; 1 2 2 0 1 0 0; 2 1 2 0 0 1 0; 2 2 1 0 0 0 1][x1; x2; x3; w4; w5; w6; w7] = [−1; 20; 20; 20]
[x1 x2 x3 w4 w5 w6 w7]^T ≥ 0.
Figure 6: Graphical representation of LP (97)
Step 2(b)

max{ζ = c̄^T x̄ : Āx̄ = b, x̄ ≥ 0},

where c̄ = [10 12 12 0 0 0 0 −M]^T,

Ā = [−1 −1 −1 1 0 0 0 −1; 1 2 2 0 1 0 0 0; 2 1 2 0 0 1 0 0; 2 2 1 0 0 0 1 0],

x̄ = [x1 x2 x3 w4 w5 w6 w7 t8]^T, and b = [−1; 20; 20; 20]. Initialize (B = {5, 6, 7, 8}, N = {1, 2, 3, 4}).

The current basic solution (x_{5,6,7,8} = A_B^{-1} b = [20 20 20 1]^T, x_{1,2,3,4} = 0) corresponds to the yellow corner point O in Figure 6. Since the big-M method is being used and t8 = x̄8 = 1 > 0, this basic solution is not feasible to the original LP. But the big-M method will lead to a feasible basic solution.
Step 3(a)

c_N^T − c_B^T A_B^{-1} A_N = [10+M, 12+M, 12+M, −M]. Go to Step 4.

Step 4(a)

The dictionary is

ζ = −M + [10+M, 12+M, 12+M, −M][x1 x2 x3 w4]^T
[w5; w6; w7; t8] = [20; 20; 20; 1] − [1 2 2 0; 2 1 2 0; 2 2 1 0; 1 1 1 −1][x1; x2; x3; w4].

The entering variable is x1, and the leaving variable is t8. Set (B = {5, 6, 7, 1}, N = {8, 2, 3, 4}). The current fbs (x_{5,6,7,1} = A_B^{-1} b = [19 18 18 1]^T, x_{8,2,3,4} = 0) corresponds to the blue corner point A in Figure 6. The big-M method has found an fbs.
Step 3(a)

c_N^T − c_B^T A_B^{-1} A_N = [−M−10, 2, 2, 10]. Go to Step 4.

Step 4(a)

The dictionary is

ζ = 10 + [−M−10, 2, 2, 10][t8 x2 x3 w4]^T
[w5; w6; w7; x1] = [19; 18; 18; 1] − [−1 1 1 1; −2 −1 0 2; −2 0 −1 2; 1 1 1 −1][t8; x2; x3; w4].

The entering variable is x2, and the leaving variable is x1. Set (B = {5, 6, 7, 2}, N = {8, 1, 3, 4}). The current fbs (x_{5,6,7,2} = A_B^{-1} b = [18 19 18 1]^T, x_{8,1,3,4} = 0) corresponds to the blue corner point B in Figure 6.
Step 3(a)

c_N^T − c_B^T A_B^{-1} A_N = [−M−12, −2, 0, 12]. Go to Step 4.

Step 4(a)

The dictionary is

ζ = 12 + [−M−12, −2, 0, 12][t8 x1 x3 w4]^T
[w5; w6; w7; x2] = [18; 19; 18; 1] − [−2 −1 0 2; −1 1 1 1; −2 0 −1 2; 1 1 1 −1][t8; x1; x3; w4].

The entering variable is w4, and the leaving variable is w5. Set (B = {4, 6, 7, 2}, N = {8, 1, 3, 5}). The current fbs (x_{4,6,7,2} = A_B^{-1} b = [9 10 0 10]^T, x_{8,1,3,5} = 0) corresponds to the blue corner point C in Figure 6.
Step 3(a)

c_N^T − c_B^T A_B^{-1} A_N = [−M, 4, 0, −6]. Go to Step 4.

Step 4(a)

The dictionary is

ζ = 120 + [−M, 4, 0, −6][t8 x1 x3 w5]^T
[w4; w6; w7; x2] = [9; 10; 0; 10] − [−1 −0.5 0 0.5; 0 1.5 1 −0.5; 0 1 −1 −1; 0 0.5 1 0.5][t8; x1; x3; w5].

The entering variable is x1, and the leaving variable is w7. Set (B = {4, 6, 1, 2}, N = {8, 7, 3, 5}). The current fbs (x_{4,6,1,2} = A_B^{-1} b = [9 10 0 10]^T, x_{8,7,3,5} = 0) corresponds to the blue corner point C in Figure 6. Because of degeneracy, the simplex algorithm has updated the basic partition, but it corresponds to the same fbs.
Step 3(a)

c_N^T − c_B^T A_B^{-1} A_N = [−M, −4, 4, −2]. Go to Step 4.

Step 4(a)

The dictionary is

ζ = 120 + [−M, −4, 4, −2][t8 w7 x3 w5]^T
[w4; w6; x1; x2] = [9; 10; 0; 10] − [−1 0.5 −0.5 0; 0 −1.5 2.5 1; 0 1 −1 −1; 0 −0.5 1.5 1][t8; w7; x3; w5].

The entering variable is x3, and the leaving variable is w6. Set (B = {4, 3, 1, 2}, N = {8, 7, 6, 5}). The current fbs (x_{4,3,1,2} = A_B^{-1} b = [11 4 4 4]^T, x_{8,7,6,5} = 0) corresponds to the red corner point D in Figure 6. It will be verified in the next step that this fbs is optimal.
Step 3(c)

c_N^T − c_B^T A_B^{-1} A_N = [−M, −1.6, −1.6, −3.6] ≤ 0, the big-M method was used in Step 2, and x̄8 = t8 = 0, so stop; the optimal solution is (x*_{1,2,3,4} = A_B^{-1} b = [4 4 4 11]^T, x*_{5,6,7,8} = 0), with ζ* = c_B^T A_B^{-1} b = 136.
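The result can be cross-checked by brute force over all basic solutions of the equality system from Step 1 (without the artificial variable): since the LP is feasible and bounded, the best feasible basic solution must be the optimum.

```python
from itertools import combinations
import numpy as np

# Equality form of LP (97): [A, I] x = b with the >= row negated.
A = np.array([[-1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 0.0],
              [ 1.0,  2.0,  2.0, 0.0, 1.0, 0.0, 0.0],
              [ 2.0,  1.0,  2.0, 0.0, 0.0, 1.0, 0.0],
              [ 2.0,  2.0,  1.0, 0.0, 0.0, 0.0, 1.0]])
b = np.array([-1.0, 20.0, 20.0, 20.0])
c = np.array([10.0, 12.0, 12.0, 0.0, 0.0, 0.0, 0.0])

best_val, best_x = -np.inf, None
for cols in combinations(range(7), 4):
    A_B = A[:, cols]
    if abs(np.linalg.det(A_B)) < 1e-9:
        continue                      # not a basic partition
    x = np.zeros(7)
    x[list(cols)] = np.linalg.solve(A_B, b)
    if (x < -1e-9).any():
        continue                      # basic but not feasible
    if c @ x > best_val:
        best_val, best_x = c @ x, x

print(best_val, best_x[:3])   # 136 at (x1, x2, x3) = (4, 4, 4)
```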
7 Dual Simplex
Let us look at the simplex algorithm from the duality perspective. First, we transform the primal max{c^T x : Ax ≤ b; x ≥ 0} and dual min{b^T y : A^T y ≥ c; y ≥ 0} into max{c^T x : Ax + w = b; x, w ≥ 0} and min{b^T y : A^T y − z = c; y, z ≥ 0}, respectively. Then, the optimality condition becomes

0 ≤ [x; w] ⊥ [z; y] ≥ 0.

We can further transform these problems into max{ζ = c̄^T x̄ : Āx̄ = b, x̄ ≥ 0} and max{−ξ = −b̄^T z̄ : Ãz̄ = c̃, z̄ ≥ 0}, where

c̄ = [c; 0_{m×1}], Ā = [A, I_{m×m}], x̄ = [x; w], b̄ = [0_{n×1}; b], Ã = [I_{n×n}, −A^T], z̄ = [z; y], c̃ = −c.

For an optimal basic partition (N, B), the optimality condition becomes

[x_N = 0; x_B ≥ 0] ⊥ [z_N ≥ 0; z_B = 0].

Notice that the nonbasic dual variables z_N ≥ 0 and basic dual variables z_B = 0 may look counterintuitive, but that is because the nonbasis N and basis B are originally defined for the primal problem, so we need to pay special attention when using the same basic partition on the dual.

The dictionary can also be defined for the dual problem using a basic partition (N, B):

−ξ = −b̄_N^T Ã_N^{-1} c̃ + (b̄_N^T Ã_N^{-1} Ã_B − b̄_B^T) z_B  (98)
z_N = Ã_N^{-1} c̃ − Ã_N^{-1} Ã_B z_B.  (99)
Sometimes it is more convenient to apply the simplex algorithm to the dual LP instead of the primal. For example, suppose we have a primal LP as follows:

max ζ = [−1, −1] [x1, x2]^T
s. t. [−2, −1; −2, 4; −1, 3] [x1; x2] ≤ [4; −8; −7]
[x1, x2]^T ≥ 0.

When we apply the simplex algorithm, we need to use the big-M method, because b = [4, −8, −7]^T has negative elements. However, if we look at the dual

min ξ = [4, −8, −7] [y3, y4, y5]^T
s. t. [−2, −2, −1; −1, 4, 3] [y3; y4; y5] ≥ [−1; −1]
[y3, y4, y5]^T ≥ 0,

whose standard form LP is

max ξ = [−4, 8, 7] [y3, y4, y5]^T
s. t. [2, 2, 1; 1, −4, −3] [y3; y4; y5] ≤ [1; 1]
[y3, y4, y5]^T ≥ 0,

we can easily find an initial fbs (y = 0) to start the simplex algorithm. Now let us apply the simplex algorithm to the dual and at the same time keep track of what happens to the primal.
Apply simplex to the dual:

max ξ = [0, 0, −4, 8, 7] [z1; z2; y3; y4; y5]
s. t. [−1, 0, −2, −2, −1; 0, −1, −1, 4, 3] [z1; z2; y3; y4; y5] = [−1; −1]
[z1, z2, y3, y4, y5]^T ≥ 0,

i.e., max{ξ = −b̃^T z̃ : Ã z̃ = c, z̃ ≥ 0}, where

b̃ = [0, 0, 4, −8, −7]^T, Ã = [−1, 0, −2, −2, −1; 0, −1, −1, 4, 3], z̃ = [z1, z2, y3, y4, y5]^T.

What happens to the primal:

max ζ = [−1, −1, 0, 0, 0] [x1; x2; w3; w4; w5]
s. t. [−2, −1, 1, 0, 0; −2, 4, 0, 1, 0; −1, 3, 0, 0, 1] [x1; x2; w3; w4; w5] = [4; −8; −7]
[x1, x2, w3, w4, w5]^T ≥ 0,

i.e., max{ζ = c̄^T x̄ : Ā x̄ = b, x̄ ≥ 0}, where

c̄ = [−1, −1, 0, 0, 0]^T, Ā = [−2, −1, 1, 0, 0; −2, 4, 0, 1, 0; −1, 3, 0, 0, 1], x̄ = [x1, x2, w3, w4, w5]^T.

Dual dictionary:
ξ = −b̃_N^T Ã_N^{-1} c − (b̃_B^T − b̃_N^T Ã_N^{-1} Ã_B) z_B
z_N = Ã_N^{-1} c − Ã_N^{-1} Ã_B z_B

Primal dictionary:
ζ = c̄_B^T Ā_B^{-1} b + (c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N) x_N
x_B = Ā_B^{-1} b − Ā_B^{-1} Ā_N x_N

In the iterations below, z3, z4, z5 denote y3, y4, y5, the last three components of z̃.

Iteration 1: B = {3, 4, 5}, N = {1, 2}.

Dual:
ξ = 0 + [−4, 8, 7] [z3; z4; z5]
[z1; z2] = [1; 1] − [2, 2, 1; 1, −4, −3] [z3; z4; z5]

Primal:
ζ = 0 + [−1, −1] [x1; x2]
[x3; x4; x5] = [4; −8; −7] − [−2, −1; −2, 4; −1, 3] [x1; x2]

Iteration 2: B = {3, 1, 5}, N = {4, 2}.

Dual:
ξ = 4 + [−12, −4, 3] [z3; z1; z5]
[z4; z2] = [0.5; 3] − [1, 0.5, 0.5; 5, 2, −1] [z3; z1; z5]

Primal:
ζ = −4 + [−0.5, −3] [x4; x2]
[x3; x1; x5] = [12; 4; −3] − [−1, −5; −0.5, −2; −0.5, 1] [x4; x2]

Iteration 3: B = {3, 1, 4}, N = {5, 2}.

Dual:
ξ = 7 + [−18, −7, −6] [z3; z1; z4]
[z5; z2] = [1; 4] − [2, 1, 2; 7, 3, 2] [z3; z1; z4]

Primal:
ζ = −7 + [−1, −4] [x5; x2]
[x3; x1; x4] = [18; 7; 6] − [−2, −7; −1, −3; −2, −2] [x5; x2]
From the above table, we have some observations: (1) If b has a negative element but c ≤ 0, we can avoid the big-M method by applying the simplex algorithm to the dual. (2) When we apply the simplex algorithm to the dual, the reduced costs (c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N) in the corresponding primal dictionaries are always nonpositive, but the basic variables x_B = Ā_B^{-1} b are infeasible until the last iteration. (3) The primal and dual dictionaries have a negative-transpose relationship:
[−b̃_N^T Ã_N^{-1} c, −(b̃_B^T − b̃_N^T Ã_N^{-1} Ã_B); Ã_N^{-1} c, −Ã_N^{-1} Ã_B] = −[c̄_B^T Ā_B^{-1} b, (c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N); Ā_B^{-1} b, −Ā_B^{-1} Ā_N]^T.
Based on the above observations, we can design a special rule for selecting the entering and leaving variables so that we go through the iterations on the primal side (the right-hand-side column of the above table) without writing down the dual iterations. An algorithm that applies this idea is called the dual simplex algorithm.

Although point (3) has been observed throughout all iterations of the example, its theoretical correctness is not obvious. The following defines the problems, and the observation is then stated as a proposition.
Primal:
max ζ = c^T x
s. t. Ax ≤ b
x ≥ 0

Dual:
min ξ = b^T y
s. t. A^T y ≥ c
y ≥ 0

Standard forms:

max ζ = c^T x
s. t. Ax + w = b
x, w ≥ 0

max ξ = −b^T y
s. t. A^T y − z = c
y, z ≥ 0

Define (primal):
c̄ = [c; 0_{m×1}] =: [c̄_N; c̄_B]
Ā = [A, I_{m×m}] =: [Ā_N, Ā_B]
x̄ = [x; w] =: [x_N; x_B]

Define (dual):
b̃ = [0_{n×1}; b] =: [b̃_N; b̃_B]
Ã = [−I_{n×n}, A^T] =: [Ã_N, Ã_B]
z̃ = [z; y] =: [z_N; z_B]

max{ζ = c̄^T x̄ : Ā x̄ = b, x̄ ≥ 0}:
max ζ = c̄_N^T x_N + c̄_B^T x_B
s. t. Ā_N x_N + Ā_B x_B = b
x̄ ≥ 0

max{ξ = −b̃^T z̃ : Ã z̃ = c, z̃ ≥ 0}:
max ξ = −b̃_N^T z_N − b̃_B^T z_B
s. t. Ã_N z_N + Ã_B z_B = c
z̃ ≥ 0

Primal dictionary:
ζ = c̄_B^T Ā_B^{-1} b + (c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N) x_N
x_B = Ā_B^{-1} b − Ā_B^{-1} Ā_N x_N

Dual dictionary:
ξ = −b̃_N^T Ã_N^{-1} c − (b̃_B^T − b̃_N^T Ã_N^{-1} Ã_B) z_B
z_N = Ã_N^{-1} c − Ã_N^{-1} Ã_B z_B
Proposition 5. The primal and dual dictionaries derived above have a negative-transpose relationship:

[c̄_B^T Ā_B^{-1} b, (c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N); Ā_B^{-1} b, −Ā_B^{-1} Ā_N] = −[−b̃_N^T Ã_N^{-1} c, −(b̃_B^T − b̃_N^T Ã_N^{-1} Ã_B); Ã_N^{-1} c, −Ã_N^{-1} Ã_B]^T.
Proof. Bottom right: −Ā_B^{-1} Ā_N = (Ã_N^{-1} Ã_B)^T. First, we have

Ā Ã^T = [A, I][−I, A^T]^T = −A + A = 0 and Ā Ã^T = [Ā_N, Ā_B][Ã_N, Ã_B]^T = Ā_N Ã_N^T + Ā_B Ã_B^T = 0.

Multiplying the last equation by Ā_B^{-1} on the left and (Ã_N^T)^{-1} on the right, we have

Ā_B^{-1}[Ā_N Ã_N^T + Ā_B Ã_B^T](Ã_N^T)^{-1} = Ā_B^{-1} Ā_N + Ã_B^T (Ã_N^T)^{-1} = 0,

i.e., Ā_B^{-1} Ā_N = −(Ã_N^{-1} Ã_B)^T.

Bottom left: Ā_B^{-1} b = (b̃_B^T − b̃_N^T Ã_N^{-1} Ã_B)^T. Since Ā b̃ = [A, I][0; b] = b, we have Ā_N b̃_N + Ā_B b̃_B = b, so Ā_B^{-1} b = b̃_B + Ā_B^{-1} Ā_N b̃_N = b̃_B − (Ã_N^{-1} Ã_B)^T b̃_N = (b̃_B^T − b̃_N^T Ã_N^{-1} Ã_B)^T.

Top right: (c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N) = −(Ã_N^{-1} c)^T. Since Ã c̄ = [−I, A^T][c; 0] = −c, we have Ã_N c̄_N + Ã_B c̄_B = −c, so Ã_N^{-1} c = −c̄_N − Ã_N^{-1} Ã_B c̄_B, and hence −(Ã_N^{-1} c)^T = c̄_N^T + c̄_B^T (Ã_N^{-1} Ã_B)^T = c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N.

Top left: c̄_B^T Ā_B^{-1} b = b̃_N^T Ã_N^{-1} c. Since b̃^T c̄ = [0; b]^T [c; 0] = 0, we have −b̃_N^T c̄_N = b̃_B^T c̄_B, so b̃_N^T Ã_N^{-1} c = b̃_N^T (−c̄_N − Ã_N^{-1} Ã_B c̄_B) = b̃_B^T c̄_B − b̃_N^T Ã_N^{-1} Ã_B c̄_B = (b̃_B^T − b̃_N^T Ã_N^{-1} Ã_B) c̄_B = (Ā_B^{-1} b)^T c̄_B = c̄_B^T Ā_B^{-1} b.
From the above proposition, we can see that in the primal dictionary

ζ = c̄_B^T Ā_B^{-1} b + (c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N) x_N
x_B = Ā_B^{-1} b − Ā_B^{-1} Ā_N x_N,

the optimality condition in the simplex algorithm, (c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N) ≤ 0, is actually dual feasibility, z_N = Ã_N^{-1} c ≥ 0. When the optimal dictionary is obtained, we get not only the primal optimal solution (x_B = Ā_B^{-1} b, x_N = 0) but also the dual optimal solution (z_N = Ã_N^{-1} c = −(c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N)^T, z_B = 0). Complementary slackness is satisfied since

[x_B ≥ 0; x_N = 0] ⊥ [z_B = 0; z_N ≥ 0].
An alternative to the big-M method is the following. Consider

(P0): max{c^T x : Ax ≤ b, x ≥ 0}

and its dual

(D0): min{b^T y : A^T y ≥ c, y ≥ 0}.

Step 1 If b ≥ 0, then apply the simplex algorithm on (P0) with the initial feasible basic solution x = 0. Otherwise go to Step 2.

Step 2 If c ≤ 0, then apply the simplex algorithm on (D0) with the initial feasible basic solution y = 0. Otherwise go to Step 3.

Step 3 Construct the following LP:

(P1): max{0^T x : Ax ≤ b, x ≥ 0}

and its dual

(D1): min{b^T y : A^T y ≥ 0, y ≥ 0}.

Apply the simplex algorithm on (D1) with the initial feasible basic solution y = 0. If (D1) has an optimal solution, go to Step 4; if (D1) is unbounded, go to Step 5.

Step 4 Let y1 and x1 be optimal solutions to (D1) and (P1), respectively (x1 can be read off the final dictionary of (D1), and any feasible solution to (P1) is optimal for it). Apply the simplex algorithm on (P0) with x1 as the initial feasible basic solution.

Step 5 If (D1) is unbounded, then (P1) is infeasible, and thus (P0) is also infeasible.
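The five steps can be sketched with an off-the-shelf solver standing in for the individual simplex runs (a sketch under our own naming; in the notes, each branch would be executed with the simplex algorithm itself):

```python
import numpy as np
from scipy.optimize import linprog

def solve_without_big_m(A, b, c):
    """Decide how to start the simplex algorithm for max{c^T x : Ax <= b, x >= 0}
    following Steps 1-5 above (sketch; the function name is ours)."""
    A, b, c = np.asarray(A, float), np.asarray(b, float), np.asarray(c, float)
    m, n = A.shape
    if (b >= 0).all():
        start = "Step 1: start (P0) from x = 0"
    elif (c <= 0).all():
        start = "Step 2: start (D0) from y = 0"
    else:
        # Step 3: solve (D1): min b^T y s.t. A^T y >= 0, y >= 0 (y = 0 is feasible).
        d1 = linprog(b, A_ub=-A.T, b_ub=np.zeros(n),
                     bounds=[(0, None)] * m, method="highs")
        if d1.status == 3:  # (D1) unbounded => (P1), and hence (P0), infeasible
            return "Step 5: (P0) infeasible", None
        start = "Step 4: start (P0) from a feasible basic solution of (P1)"
    res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * n, method="highs")
    return start, res
```

If (D1) is bounded, its optimum is 0 (attained at y = 0), and the recovered primal feasible point lets the main run on (P0) begin without artificial variables.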
For example, we have the following (P0) with b1 < 0 and c2 > 0:

max ζ = [−14, 6] [x1, x2]^T
s. t. [7, 16; 7, 9; 13, 16] [x1; x2] ≤ [−4; 7; 8]
[x1, x2]^T ≥ 0.

We construct (D1) as

min [−4, 7, 8] [y3, y4, y5]^T
s. t. [7, 7, 13; 16, 9, 16] [y3; y4; y5] ≥ [0; 0]
[y3, y4, y5]^T ≥ 0.

We solve (D1), and it turns out to be unbounded (for instance, the objective decreases without bound along the ray y = (t, 0, 0)), which means that (P0) is infeasible.
8 Sensitivity Analysis
Suppose we have found an optimal solution x* to an LP max{c^T x : Ax ≤ b, x ≥ 0} using the simplex algorithm, and then we change some of the parameters (A, b, c) and have a new LP. We have the following questions: Is the optimal basic partition for the original LP still optimal to the new LP? If the answer is yes, then the optimal solution for the new LP can be easily obtained. If the answer is no, do we need to solve the new LP from the very beginning, or can we use the original LP's optimal solution to find the new LP's optimal solution faster? This type of post-optimality analysis is called sensitivity analysis. We focus our discussion on the following four cases.
8.1 Changing the objective coefficients: c → c + θΔc

Changing the objective coefficients will not affect primal feasibility, but it could affect dual feasibility. The new dual dictionary is

ξ = −b̃_N^T Ã_N^{-1} (c + θΔc) − (b̃_B^T − b̃_N^T Ã_N^{-1} Ã_B) z_B
z_N = Ã_N^{-1} (c + θΔc) − Ã_N^{-1} Ã_B z_B.

The range of θ for which the original LP's optimal basic partition remains optimal to the new LP can be obtained by solving the following inequality:

Ã_N^{-1} (c + θΔc) ≥ 0.

If Ã_N^{-1} (c + θΔc) ≥ 0 holds, then the original LP's optimal basic partition is still optimal to the new LP.

Otherwise, the new dictionary does not have dual feasibility, but primal feasibility remains. We can find the optimal solution for the new LP by simply applying simplex iterations to the new primal dictionary. Since we do not need to start over from Step 1 of the simplex algorithm, we can find the optimal solution to the new LP faster.
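Because z_N(θ) = Ã_N^{-1}(c + θΔc) is affine in θ, the allowable range is a componentwise ratio test. A small sketch (the helper name and the 0-based index convention are our own), computed from the equivalent primal reduced-cost condition (c̄ + θΔc̄)_N^T − (c̄ + θΔc̄)_B^T Ā_B^{-1} Ā_N ≤ 0:

```python
import numpy as np

def objective_range(A, c, dc, basis):
    """Range of theta for which `basis` stays optimal when c -> c + theta*dc.
    `basis` lists the 0-based columns of [A, I] in the optimal basis
    (helper name and index convention are ours)."""
    A, c, dc = np.asarray(A, float), np.asarray(c, float), np.asarray(dc, float)
    m, n = A.shape
    Abar = np.hstack([A, np.eye(m)])
    cbar = np.concatenate([c, np.zeros(m)])
    dcbar = np.concatenate([dc, np.zeros(m)])
    N = [j for j in range(n + m) if j not in basis]
    T = np.linalg.solve(Abar[:, basis], Abar[:, N])   # A_B^{-1} A_N
    r0 = cbar[N] - cbar[basis] @ T                    # reduced costs at theta = 0
    r1 = dcbar[N] - dcbar[basis] @ T                  # their rate of change in theta
    # Keep r0 + theta * r1 <= 0 componentwise.
    lo = max((-a / d for a, d in zip(r0, r1) if d < 0), default=-np.inf)
    hi = min((-a / d for a, d in zip(r0, r1) if d > 0), default=np.inf)
    return lo, hi
```

On the example that follows (0-based: x1 and x2 are columns 0–1, the slacks are columns 2–4, so B = {3, 1, 4} becomes [2, 0, 3]), this reproduces the interval θ ≥ −1.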
For example, consider the following LP:

max ζ = [−1, −1] [x1, x2]^T
s. t. [−2, −1; −2, 4; −1, 3] [x1; x2] ≤ [4; −8; −7]
[x1, x2]^T ≥ 0.

The optimal basic partition is (B = {3, 1, 4}, N = {5, 2}). Suppose we change c = [−1, −1]^T to c + θΔc with Δc = [−1, 0]^T, and we want to find the range of θ that keeps the partition optimal. The dual dictionary is:

ξ = 7 + 7θ + [−18, −7, −6] [z3; z1; z4]
[z5; z2] = [1 + θ; 4 + 3θ] − [2, 1, 2; 7, 3, 2] [z3; z1; z4].

For the partition to remain optimal, we need 1 + θ ≥ 0 and 4 + 3θ ≥ 0, thus θ ≥ −1.

Suppose θ = −1.1; then the primal dictionary becomes:

ζ = 0.7 + [0.1, −0.7] [x5; x2]
[x3; x1; x4] = [18; 7; 6] − [−2, −7; −1, −3; −2, −2] [x5; x2].

Now x5 is the entering variable, but its column has no positive entry, so no basic variable can leave. We can then conclude that changing c from [−1, −1]^T to [0.1, −1]^T would make the LP unbounded.
8.2 Changing the right-hand side: b → b + θΔb

Changing the right-hand side will not affect dual feasibility, but it could affect primal feasibility. The new primal dictionary is

ζ = c̄_B^T Ā_B^{-1} (b + θΔb) + (c̄_N^T − c̄_B^T Ā_B^{-1} Ā_N) x_N
x_B = Ā_B^{-1} (b + θΔb) − Ā_B^{-1} Ā_N x_N.

The range of θ for which the original LP's optimal basic partition remains optimal to the new LP can be obtained by solving the following inequality:

Ā_B^{-1} (b + θΔb) ≥ 0.

If Ā_B^{-1} (b + θΔb) ≥ 0 holds, then the original LP's optimal basic partition is still optimal to the new LP. The new optimal primal solution becomes (x_B = Ā_B^{-1} (b + θΔb), x_N = 0), but the optimal dual solution stays the same. From strong duality, the new optimal objective value is ζ* = (b + θΔb)^T y*, where y* is the optimal dual solution to the original problem. As long as the basic partition stays optimal, y* gives the rate at which changes in b change the objective value. Therefore, y* is also known as the shadow price.

Otherwise, if Ā_B^{-1} (b + θΔb) ≥ 0 does not hold, the new dictionary does not have primal feasibility, but dual feasibility remains. We can find the optimal solution for the new LP by applying simplex iterations to the dual dictionary. Since we do not need to start over from Step 1 of the simplex algorithm, we can find the optimal solution to the new LP faster.
Suppose we change b = [4, −8, −7]^T to b + θΔb with Δb = [−6, −5, −2]^T, and we want to find the range of θ that keeps the partition optimal. The primal dictionary is:

ζ = −7 − 2θ + [−1, −4] [x5; x2]
[x3; x1; x4] = [18 − 2θ; 7 + 2θ; 6 − θ] − [−2, −7; −1, −3; −2, −2] [x5; x2].

For the partition to remain optimal, we need 18 − 2θ ≥ 0, 7 + 2θ ≥ 0, and 6 − θ ≥ 0, thus −3.5 ≤ θ ≤ 6.
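The right-hand-side range is the same kind of ratio test, now on Ā_B^{-1}(b + θΔb) ≥ 0. A sketch (helper name and 0-based index convention are ours):

```python
import numpy as np

def rhs_range(A, b, db, basis):
    """Range of theta for which `basis` stays optimal when b -> b + theta*db.
    `basis` lists the 0-based columns of [A, I] in the optimal basis
    (helper name and index convention are ours)."""
    A = np.asarray(A, float)
    Abar = np.hstack([A, np.eye(A.shape[0])])
    x0 = np.linalg.solve(Abar[:, basis], np.asarray(b, float))   # A_B^{-1} b
    x1 = np.linalg.solve(Abar[:, basis], np.asarray(db, float))  # A_B^{-1} db
    # Keep x0 + theta * x1 >= 0 componentwise.
    lo = max((-p / q for p, q in zip(x0, x1) if q > 0), default=-np.inf)
    hi = min((-p / q for p, q in zip(x0, x1) if q < 0), default=np.inf)
    return lo, hi
```

On the example above (basis [2, 0, 3] in 0-based column indices), this reproduces −3.5 ≤ θ ≤ 6.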
8.3 Adding k new constraints: A → Â = [A; a_{m+1}^T; ...; a_{m+k}^T] and b → b̂ = [b; b_{m+1}; ...; b_{m+k}]

Corresponding to the k new constraints, there are k new slack variables in the primal problem and k new dual variables in the dual. If we add all k new slack variables into the basis, B̂ = B ∪ {n + m + 1, ..., n + m + k}, and look at the new dual dictionary

ξ = −b̃_N^T Ã_N^{-1} c − (b̃_{B̂}^T − b̃_N^T Ã_N^{-1} Ã_{B̂}) z_{B̂}
z_N = Ã_N^{-1} c − Ã_N^{-1} Ã_{B̂} z_{B̂},

the dual feasibility will not be affected, but primal feasibility could be affected.

If the original optimal solution x* satisfies the new constraints,

[a_{m+1}^T; ...; a_{m+k}^T] x* ≤ [b_{m+1}; ...; b_{m+k}],

then x* is still optimal.

Otherwise, we can find the optimal solution for the new LP by applying simplex iterations to the new dual dictionary. Since we do not need to start over from Step 1 of the simplex algorithm, we can find the optimal solution to the new LP faster.
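The first check is a plain substitution of x* into the new rows; a one-line sketch (helper name is ours):

```python
import numpy as np

def still_optimal_after_rows(x_opt, new_rows, new_rhs, tol=1e-9):
    """True iff the old optimum x_opt satisfies the k new constraints
    new_rows @ x <= new_rhs, in which case it stays optimal (names are ours)."""
    lhs = np.asarray(new_rows, float) @ np.asarray(x_opt, float)
    return bool((lhs <= np.asarray(new_rhs, float) + tol).all())
```

For the example below, `still_optimal_after_rows([7, 0], [[1, 1]], [6])` is `False`, so the dual dictionary has to be repaired with further (dual simplex) iterations.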
Suppose we add a new constraint x1 + x2 ≤ 6. The optimal solution to the original LP is x1* = 7, x2* = 0, which violates the new constraint. Therefore, the original optimal basic partition is no longer optimal. We update the dual dictionary as follows:

ξ = 7 + [−18, −7, −6, 1] [z3; z1; z4; z6]
[z5; z2] = [1; 4] − [2, 1, 2, −1; 7, 3, 2, −4] [z3; z1; z4; z6].

The coefficient of z6 in the ξ row is positive, and increasing z6 keeps z5 and z2 nonnegative, so the dual LP is unbounded; thus the primal is infeasible with the new constraint.
8.4 Adding k new variables: x → x̂ = [x; x_{n+1}; ...; x_{n+k}], A → Â = [A, a_{n+1}, ..., a_{n+k}], and c → ĉ = [c; c_{n+1}; ...; c_{n+k}]

If we add all k new variables into the nonbasis, N̂ = N ∪ {n + m + 1, ..., n + m + k}, the primal feasibility will not be affected, but dual feasibility could be affected. The new primal dictionary is

ζ = c̄_B^T Ā_B^{-1} b + (ĉ_{N̂}^T − c̄_B^T Ā_B^{-1} Ā_{N̂}) x_{N̂}
x_B = Ā_B^{-1} b − Ā_B^{-1} Ā_{N̂} x_{N̂}.

If the original dual optimal solution y* satisfies the new constraints,

[a_{n+1}^T; ...; a_{n+k}^T] y* ≥ [c_{n+1}; ...; c_{n+k}],

then (x*, y*) is still optimal.

Otherwise, we can find the optimal solution for the new LP by simply applying simplex iterations to the new primal dictionary. Since we do not need to start over from Step 1 of the simplex algorithm, we can find the optimal solution to the new LP faster.
Suppose we add a new variable x3, expand A to Â = [−2, −1, 3; −2, 4, 2; −1, 3, 1], and change c to ĉ = [−1, −1, −1]^T. Since the dual optimal solution y* = [0, 0, 1]^T satisfies the constraint [3, 2, 1] y* ≥ −1, we conclude that the optimal solution to the original LP remains optimal with x3* = 0.
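Pricing new variables with the old dual solution can likewise be sketched (helper name is ours): a new column a_j with cost c_j leaves the old solution optimal iff a_j^T y* ≥ c_j, i.e., its reduced cost is nonpositive.

```python
import numpy as np

def still_optimal_after_columns(y_opt, new_cols, new_costs, tol=1e-9):
    """True iff every new variable prices out: a_j^T y >= c_j for each new
    column a_j (the columns of new_cols) with cost c_j (names are ours)."""
    lhs = np.asarray(new_cols, float).T @ np.asarray(y_opt, float)
    return bool((lhs >= np.asarray(new_costs, float) - tol).all())
```

For the example above, `still_optimal_after_columns([0, 0, 1], [[3], [2], [1]], [-1])` returns `True`, matching the conclusion that x3* = 0 stays optimal.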
9 Decomposition
Special techniques can be applied to solve some large-scale LP problems with special structures, which would otherwise be much slower or even impossible to solve. Three examples of these techniques, namely Benders decomposition, Dantzig-Wolfe decomposition, and column generation, are briefly introduced here.
9.1 Benders decomposition
Consider the following LP problem:

max ζ = [c0^T, c1^T, c2^T, ..., ck^T] [x0; x1; x2; ...; xk] (100)
s. t. [A0, 0, 0, ..., 0; A1, D1, 0, ..., 0; A2, 0, D2, ..., 0; ...; Ak, 0, 0, ..., Dk] [x0; x1; x2; ...; xk] ≤ [b0; b1; b2; ...; bk]
[x0, x1, x2, ..., xk]^T ≥ 0,

where ci ∈ R^{ni×1}, bi ∈ R^{mi×1}, and xi ∈ R^{ni×1} for i = 0, 1, 2, ..., k, A0 ∈ R^{m0×n0}, and Ai ∈ R^{mi×n0} and Di ∈ R^{mi×ni} for i = 1, 2, ..., k. If k and the mi's and ni's are large, this problem could be extremely hard to solve due to its dimension. However, we can take advantage of its special structure and solve this large problem by iteratively solving much smaller problems, as follows.
Step 1 Solve the following master problem:

max ζ = c0^T x0 + z1 + z2 + ... + zk (101)
s. t. A0 x0 ≤ b0
constraints (106) and (108) from Step 2
x0 ≥ 0; z1, z2, ..., zk free.

If the master problem is infeasible, then we stop and conclude that the original LP (100) is infeasible. Otherwise, let (x0*, z1*, z2*, ..., zk*) be the optimal solution to (101), which could possibly include +∞ values. For example, in the first few iterations, there may be no constraints to keep some of the zi's from approaching +∞. In that case, we simply let those zi's be +∞ and only maximize over the constrained variables. Here zi represents an optimistic estimate of ci^T xi* in (100).
Step 2 For i = 1, 2, ..., k, solve the following subproblems:

max ζi = ci^T xi (102)
s. t. Di xi ≤ bi − Ai x0*
xi ≥ 0.

These subproblems can be solved in parallel to save computation time. The results of these subproblems fall into one of the following three cases.
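Classifying each subproblem's outcome can be sketched with a generic solver standing in for the simplex runs (function and variable names are ours; the status codes follow SciPy's linprog conventions):

```python
import numpy as np
from scipy.optimize import linprog

def solve_subproblem(c_i, D_i, b_i, A_i, x0):
    """Solve subproblem (102): max c_i^T x_i s.t. D_i x_i <= b_i - A_i x0,
    x_i >= 0, and report which case applies (sketch; names are ours)."""
    c_i = np.asarray(c_i, float)
    rhs = np.asarray(b_i, float) - np.asarray(A_i, float) @ np.asarray(x0, float)
    res = linprog(-c_i, A_ub=D_i, b_ub=rhs,
                  bounds=[(0, None)] * len(c_i), method="highs")
    if res.status == 2:
        return "infeasible", None            # Case I: derive a feasibility cut
    if res.status == 3:
        return "unbounded", None             # another of the three cases
    return "optimal", (-res.fun, res.x)      # compare -res.fun against z_i*
```

Because the k calls are independent once x0* is fixed, they can be dispatched in parallel, as noted above.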
(Case I:) If (102) is infeasible, then there must exist an extreme ray for the dual of (102),

min ξi = (bi − Ai x0*)^T yi (103)
s. t. Di^T yi ≥ ci
yi ≥ 0.
We can obtain an extreme ray ȳi of (103) by solving the following problem:

min (bi − Ai x0*)^T yi (104)
s. t. Di^T yi ≥ 0
0_{mi×1} ≤ yi ≤ 1_{mi×1}.

We should send a message back to the master problem that x0 should not make the subproblems infeasible. The ideal message would be

(bi − Ai x0)^T ȳi(x0) ≥ 0, (105)

where ȳi(x0) is the optimal solutio