  • IE 534 Linear Programming Lecture Notes Fall 2011

    Lizhi Wang
    Iowa State University

    1 Introduction

    Problem → Model → Algorithm → Solver

    An example of a linear program:

    \begin{align}
    \max\quad & \zeta = 2x_1 + 3x_2 \tag{1}\\
    \text{subject to}\quad & x_1 + 4x_2 \le 8 \tag{2}\\
    & x_1 - x_2 \le 1 \tag{3}\\
    & x_1, x_2 \ge 0. \tag{4}
    \end{align}

    Here $x_1$ and $x_2$ are variables whose values need to be determined to maximize the linear function $\zeta = 2x_1 + 3x_2$ subject to the linear constraints (2)-(4).

    We can get the optimal solution to LP (1)-(4) graphically, as illustrated in Figure 1. There are three steps:

    1. Find the region that satisfies all constraints (shaded region in Figure 1),

    2. Find the direction in which the objective function increases, and

    3. Eyeball the optimal solution (for the example in Figure 1, $x_1^* = 2.4$, $x_2^* = 1.4$, and $\zeta^* = 9$).

    These three steps can only be used to find the optimal solution by hand for linear programs with two variables.

    Figure 1: Graphical solution to LP (1)-(4)

    Copyright Lizhi Wang, 2011. All rights reserved. If you have to print this document, please consider double-sided printing to save some trees.
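    The graphical procedure can be mimicked numerically for two-variable problems. The sketch below is not part of the original notes: it assumes the feasible region is bounded, so that an optimal solution lies at a corner point, and simply enumerates pairwise intersections of the constraint boundaries of LP (1)-(4). The helper names (`intersection`, `feasible`, `objective`) are illustrative.

```python
from itertools import combinations

# Each constraint of LP (1)-(4) written as a*x1 + b*x2 <= r.
constraints = [
    (1.0, 4.0, 8.0),    # (2)  x1 + 4*x2 <= 8
    (1.0, -1.0, 1.0),   # (3)  x1 - x2 <= 1
    (-1.0, 0.0, 0.0),   # (4)  x1 >= 0, written as -x1 <= 0
    (0.0, -1.0, 0.0),   # (4)  x2 >= 0, written as -x2 <= 0
]


def objective(x1, x2):
    """Objective function (1)."""
    return 2.0 * x1 + 3.0 * x2


def feasible(x1, x2, tol=1e-9):
    """True if the point satisfies every constraint up to a small tolerance."""
    return all(a * x1 + b * x2 <= r + tol for a, b, r in constraints)


def intersection(c1, c2):
    """Intersection of two boundary lines by Cramer's rule, or None if parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)


# Corner points are the feasible intersections of two boundary lines.
corners = []
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and feasible(*point):
        corners.append(point)

best = max(corners, key=lambda p: objective(*p))
print(best, objective(*best))
```

    Enumerating corner points grows quickly with the number of variables and constraints, which is one reason to use an LP solver for anything beyond toy problems.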

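  • In general, a linear program is given as follows:

    \begin{align}
    \max_x\quad & \zeta = c_1x_1 + c_2x_2 + \cdots + c_nx_n \tag{5}\\
    \text{s.t.}\quad & a_{1,1}x_1 + a_{1,2}x_2 + \cdots + a_{1,n}x_n \le b_1 \tag{6}\\
    & \qquad\vdots \notag\\
    & a_{i,1}x_1 + a_{i,2}x_2 + \cdots + a_{i,n}x_n = b_i \tag{7}\\
    & \qquad\vdots \notag\\
    & a_{j,1}x_1 + a_{j,2}x_2 + \cdots + a_{j,n}x_n \ge b_j \tag{8}\\
    & \qquad\vdots \notag\\
    & a_{m,1}x_1 + a_{m,2}x_2 + \cdots + a_{m,n}x_n \le b_m \tag{9}\\
    & l_1 \le x_1 \le u_1 \tag{10}\\
    & l_2 \le x_2 \le u_2 \tag{11}\\
    & \qquad\vdots \notag\\
    & l_n \le x_n \le u_n. \tag{12}
    \end{align}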

    The variables $x_1, x_2, \ldots, x_n$ are called decision variables, and $a$, $b$, $c$, $l$, and $u$ are given parameters.

    A solution $x = [x_1, x_2, \ldots, x_n]^\top$ is a feasible solution if it satisfies all constraints (6)-(12). Similarly, a solution is an infeasible solution if it violates any constraint.

    The set of all feasible solutions is called the feasible region or feasible set.

    If $l_i = -\infty$ and $u_i = +\infty$, then $x_i$ is a free or unrestricted variable.

    The linear function (5) is the objective function.

    $\zeta$ represents the value of the objective function, called the objective value. For a given solution $x$, we use $\zeta(x)$ to denote the objective value of $x$: $\zeta(x) = c_1x_1 + c_2x_2 + \cdots + c_nx_n$.

    The symbol "max" indicates that we want to maximize the objective value $\zeta$. We may also use "min" to indicate the opposite.

    The symbol "s.t." is short for "subject to" or "such that", which starts the list of constraints that a feasible solution must satisfy.

    A solution $x^*$ is optimal if it is feasible and $\zeta(x^*) \ge \zeta(x)$ for any other feasible solution $x$.

    There are three possibilities for the solution of an LP:

    1. There exists an optimal solution, either unique or one among infinitely many.

    2. The LP is infeasible, which means that a feasible solution does not exist.

    3. The LP is unbounded, which means that for any real number $K$, there always exists a feasible solution $x$ such that $\zeta(x) > K$. In that case, the optimal objective value is said to be $+\infty$.

  • In many situations, it is convenient to study a linear program in the following standard form:

    \begin{align}
    \max_x\quad & \zeta = c_1x_1 + c_2x_2 + \cdots + c_nx_n \tag{13}\\
    \text{s.t.}\quad & a_{1,1}x_1 + a_{1,2}x_2 + \cdots + a_{1,n}x_n \le b_1 \tag{14}\\
    & \qquad\vdots \notag\\
    & a_{m,1}x_1 + a_{m,2}x_2 + \cdots + a_{m,n}x_n \le b_m \tag{15}\\
    & x_1, x_2, \ldots, x_n \ge 0. \tag{16}
    \end{align}

    The definition of a standard form is not unique, but may depend on context or personal preference.

    The following tricks can be used to transform any linear program to the standard form:

    For a minimization objective

    \[ \min\ \zeta = c_1x_1 + c_2x_2 + \cdots + c_nx_n, \]

    we can replace it with

    \[ \max\ -\zeta = -c_1x_1 - c_2x_2 - \cdots - c_nx_n. \]

    For a greater-than-or-equal-to constraint

    \[ a_{j,1}x_1 + a_{j,2}x_2 + \cdots + a_{j,n}x_n \ge b_j, \]

    we can replace it with

    \[ -a_{j,1}x_1 - a_{j,2}x_2 - \cdots - a_{j,n}x_n \le -b_j. \]

    For an equality constraint

    \[ a_{i,1}x_1 + a_{i,2}x_2 + \cdots + a_{i,n}x_n = b_i, \]

    we can replace it with two less-than-or-equal-to constraints

    \begin{align*}
    a_{i,1}x_1 + a_{i,2}x_2 + \cdots + a_{i,n}x_n &\le b_i\\
    -a_{i,1}x_1 - a_{i,2}x_2 - \cdots - a_{i,n}x_n &\le -b_i.
    \end{align*}

    For a variable with both lower and upper bounds

    \[ l_k \le x_k \le u_k, \tag{17} \]

    there are four cases:

    1. If $l_k = -\infty$ and $u_k$ is finite, we can rewrite (17) as $x_k - u_k \le 0$, and then define $\bar{x}_k = u_k - x_k$. Now, if we substitute $x_k$ with $u_k - \bar{x}_k$, (17) can be replaced by

    \[ \bar{x}_k \ge 0. \]

    2. If $l_k$ is finite and $u_k = +\infty$, we can rewrite (17) as $0 \le x_k - l_k$, and then define $\bar{x}_k = x_k - l_k$. Now, if we substitute $x_k$ with $\bar{x}_k + l_k$, (17) can be replaced by

    \[ \bar{x}_k \ge 0. \]

    3. If both $l_k$ and $u_k$ are finite, we can rewrite (17) as $0 \le x_k - l_k \le u_k - l_k$, and then define $\bar{x}_k = x_k - l_k$. Now, if we substitute $x_k$ with $\bar{x}_k + l_k$, (17) can be replaced by

    \begin{align*}
    \bar{x}_k &\le u_k - l_k\\
    \bar{x}_k &\ge 0.
    \end{align*}

    4. If $l_k = -\infty$ and $u_k = +\infty$, we can introduce two variables $x_k^+$ and $x_k^-$. Now, if we substitute $x_k$ with $x_k^+ - x_k^-$, (17) can be replaced by

    \begin{align*}
    x_k^+ &\ge 0\\
    x_k^- &\ge 0.
    \end{align*}
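    The objective and constraint tricks above can be sketched in a few lines. The function below is an illustration rather than part of the original notes; the name `to_standard_form` and the tuple layout of `rows` are assumptions, and variable bounds (the four cases for (17)) are deliberately left out.

```python
def to_standard_form(sense, c, rows):
    """Rewrite an LP as max c_max.x subject to A x <= b.

    sense: "max" or "min".
    rows:  list of (coefficients, relation, rhs) with relation "<=", ">=" or "=".
    Variable bounds are not handled in this sketch.
    """
    # A minimization objective is negated to become a maximization.
    c_max = list(c) if sense == "max" else [-v for v in c]
    A, b = [], []
    for coeffs, relation, rhs in rows:
        if relation in ("<=", "="):
            # Keep the row as is (an equality contributes its "<=" half here).
            A.append(list(coeffs))
            b.append(rhs)
        if relation in (">=", "="):
            # Multiply by -1 to turn ">=" into "<=" (the other half of an equality).
            A.append([-v for v in coeffs])
            b.append(-rhs)
    return c_max, A, b


# Hypothetical example: min 3*x1 - x2
# s.t. x1 + x2 >= 2, x1 - x2 = 0, x1 + 2*x2 <= 6.
c_max, A, b = to_standard_form(
    "min",
    [3, -1],
    [([1, 1], ">=", 2), ([1, -1], "=", 0), ([1, 2], "<=", 6)],
)
print(c_max, A, b)
```

    Each equality row becomes two inequality rows, so the standard-form constraint matrix can have more rows than the original model.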

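    We often write the standard form LP (13)-(16) in matrix notation:

    \begin{align*}
    \max\quad & \zeta = c^\top x\\
    \text{s.t.}\quad & Ax \le b\\
    & x \ge 0,
    \end{align*}

    where

    \[
    c_{n\times 1} = \begin{bmatrix} c_1\\ c_2\\ \vdots\\ c_n \end{bmatrix},\quad
    x_{n\times 1} = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix},\quad
    A_{m\times n} = \begin{bmatrix}
    a_{1,1} & a_{1,2} & \cdots & a_{1,n}\\
    a_{2,1} & a_{2,2} & \cdots & a_{2,n}\\
    \vdots & \vdots & \ddots & \vdots\\
    a_{m,1} & a_{m,2} & \cdots & a_{m,n}
    \end{bmatrix},\quad
    b_{m\times 1} = \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{bmatrix}.
    \]

    We will use $n$ and $m$ to denote the number of variables and the number of constraints, respectively.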

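    The idea of making decisions to maximize an objective function subject to certain constraints applies to more general forms of mathematical programs:

    \[ \max\{f(x) : G(x) \le 0_{m\times 1},\ x \in X\}, \]

    where $f(\cdot) : \mathbb{R}^n \mapsto \mathbb{R}$, $G(\cdot) : \mathbb{R}^n \mapsto \mathbb{R}^m$, and $X$ is the feasible region.

    Linear Programming has a linear objective function $f(x)$ and linear constraints $G(x) \le 0$. For the linear program (1)-(4),

    \begin{align*}
    f(x) &= 2x_1 + 3x_2,\\
    G(x) &= \begin{bmatrix} 1 & 4\\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1\\ x_2 \end{bmatrix} - \begin{bmatrix} 8\\ 1 \end{bmatrix},\\
    X &= \{(x_1, x_2) : x_1 \ge 0,\ x_2 \ge 0\}.
    \end{align*}

    Nonlinear Programming has a nonlinear objective function $f(x)$ and/or nonlinear constraints $G(x) \le 0$. For example,

    \begin{align}
    \max\quad & 2x_1^2 + 3x_2 \tag{18}\\
    \text{s.t.}\quad & x_1^2 + 4x_2 \le 8 \tag{19}\\
    & x_1 - x_2 \le 1 \tag{20}\\
    & x_1, x_2 \ge 0. \tag{21}
    \end{align}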

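  • Here,

    \begin{align*}
    f(x) &= 2x_1^2 + 3x_2,\\
    G(x) &= \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_1^2\\ x_2^2 \end{bmatrix} + \begin{bmatrix} 0 & 4\\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1\\ x_2 \end{bmatrix} - \begin{bmatrix} 8\\ 1 \end{bmatrix},\\
    X &= \{(x_1, x_2) : x_1 \ge 0,\ x_2 \ge 0\}.
    \end{align*}

    This problem is illustrated in Figure 2.

    Figure 2: Graphical solution to nonlinear program (18)-(21)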

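    Integer Programming requires that some or all of the variables be integers. For example,

    \begin{align}
    \max\quad & 2x_1 + 3x_2 \tag{22}\\
    \text{s.t.}\quad & x_1 + 4x_2 \le 8 \tag{23}\\
    & x_1 - x_2 \le 1 \tag{24}\\
    & x_1, x_2 \in \{0, 1, 2, \ldots\}. \tag{25}
    \end{align}

    Here,

    \begin{align*}
    f(x) &= 2x_1 + 3x_2,\\
    G(x) &= \begin{bmatrix} 1 & 4\\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1\\ x_2 \end{bmatrix} - \begin{bmatrix} 8\\ 1 \end{bmatrix},\\
    X &= \{(x_1, x_2) : x_1, x_2 \in \{0, 1, 2, \ldots\}\}.
    \end{align*}

    This problem is illustrated in Figure 3.

    Figure 3: Graphical solution to integer program (22)-(25)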

    2 Computer Solvers

    There are numerous LP computer solvers, among which are GLPK (GNU Linear Programming Kit) and MATLAB. Below are examples of solving the LP (1)-(4) using GLPK and MATLAB.

    2.1 GLPK for linear program

    Step 0: Download a compiled version of GLPK for Windows from the course web. A good location to extract the files to is C:\\User Files. Do NOT extract them to the U: drive.

    Step 1: Create a text document named example1.txt in the same folder as GLPK, type the following code, and save the file.

        var x1 >= 0;
        var x2 >= 0;
        maximize
        zeta: 2 * x1 + 3 * x2;
        subject to
        c1: x1 + 4 * x2 <= 8;
        c2: x1 - x2 <= 1;
        end;

  • 2.2 MATLAB for linear program

    Step 1: Create a new MATLAB script file and type the following code.

        c = [2, 3];
        A = [1, 4; 1, -1];
        b = [8, 1];
        Aeq = [];
        beq = [];
        lx = [0, 0];
        [x, zeta] = linprog(-c, A, b, Aeq, beq, lx)

    The default LP formulation that MATLAB assumes is $\min\{c^\top x : Ax \le b,\ A_{eq}x = b_{eq},\ l_x \le x \le u_x\}$, so we need to give it the negative $c$. Save this file somewhere and give it a name, e.g., example1. Do NOT use linprog or quadprog as the name, because they are reserved for MATLAB system functions.

    Step 2: Press F5 or hit the run button. The result will be given in the Command Window:

        Optimization terminated.
        x =
            2.4000
            1.4000
        zeta =
           -9.0000

    The optimal solution here is the same as given by GLPK, but the optimal objective value has the opposite sign since MATLAB solves a minimization problem.

    For more information about MATLAB, read its help documentation and/or visit its website: http://www.mathworks.com.

    2.3 MATLAB for quadratic program

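    MATLAB has a function quadprog that solves quadratic programs with linear constraints, for example

    \begin{align*}
    \min\quad & 0.5x_1^2 + x_2^2 - x_1x_2 - 2x_1 - 6x_2\\
    \text{s.t.}\quad & x_1 + x_2 \le 2\\
    & -x_1 + 2x_2 \le 2\\
    & 2x_1 + x_2 \le 3\\
    & x_1, x_2 \ge 0.
    \end{align*}

    This QP can be equivalently rewritten in the following matrix form:

    \[ \min\left\{c^\top x + \tfrac{1}{2}x^\top Qx : Ax \le b,\ x \ge 0\right\}, \]

    where

    \[
    c = \begin{bmatrix} -2\\ -6 \end{bmatrix},\quad
    Q = \begin{bmatrix} 1 & -1\\ -1 & 2 \end{bmatrix},\quad
    A = \begin{bmatrix} 1 & 1\\ -1 & 2\\ 2 & 1 \end{bmatrix},\quad
    b = \begin{bmatrix} 2\\ 2\\ 3 \end{bmatrix}.
    \]

    Here the matrix $Q$ can be obtained as $Q_{i,j} = \dfrac{\partial^2 f(x)}{\partial x_i \partial x_j}$, in which $f(x)$ denotes the objective function.

    The default QP formulation that MATLAB assumes is $\min\{c^\top x + 0.5x^\top Qx : Ax \le b,\ A_{eq}x = b_{eq},\ l_x \le x \le u_x\}$. So, we can use the following MATLAB code to solve the problem: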

        A = [1, 1; -1, 2; 2, 1];
        b = [2, 2, 3];
        c = [-2, -6];
        Aeq = [];
        beq = [];
        Q = [1, -1; -1, 2];
        lx = zeros(2,1);
        [x, zeta] = quadprog(Q, c, A, b, Aeq, beq, lx)

    The optimal solution is $x_1^* = 0.67$, $x_2^* = 1.33$, and the optimal objective value is $\zeta^* = -8.22$.
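    The reported QP solution can be sanity-checked without a solver. The sketch below is an illustration, not the notes' method: it assumes the rounded values 0.67 and 1.33 stand for 2/3 and 4/3, evaluates the objective there, and compares it against a coarse grid search over the feasible region. The function names are hypothetical.

```python
def qp_objective(x1, x2):
    """Objective of the QP: 0.5*x1^2 + x2^2 - x1*x2 - 2*x1 - 6*x2."""
    return 0.5 * x1 ** 2 + x2 ** 2 - x1 * x2 - 2 * x1 - 6 * x2


def qp_feasible(x1, x2, tol=1e-9):
    """Linear constraints and nonnegativity of the QP, with a small tolerance."""
    return (
        x1 + x2 <= 2 + tol
        and -x1 + 2 * x2 <= 2 + tol
        and 2 * x1 + x2 <= 3 + tol
        and x1 >= -tol
        and x2 >= -tol
    )


# Assumption: the rounded solution (0.67, 1.33) is (2/3, 4/3).
x_star = (2 / 3, 4 / 3)
value_star = qp_objective(*x_star)

# Coarse grid over [0, 2] x [0, 2]; x1 + x2 <= 2 with x >= 0 keeps every
# feasible point inside this box. This is a check, not a solution method.
grid = [i / 100 for i in range(201)]
grid_best = min(
    qp_objective(u, v) for u in grid for v in grid if qp_feasible(u, v)
)

print(round(value_star, 2), round(grid_best, 2))
```

    No feasible grid point should beat the objective value at the reported solution, because the objective is convex ($Q$ is positive definite) and the solution satisfies the optimality conditions.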

    2.4 GLPK for mixed integer linear program

    Integer programs that contain both continuous and integer variables are called mixed integer programs. GLPK is able to solve mixed integer linear programs. For example, to solve

    \begin{align}
    \min\quad & x_1 - 2x_2 - 3x_3 \tag{26}\\
    \text{s.t.}\quad & x_1 + 7x_2 - 8x_3 \le 12 \tag{27}\\
    & -x_1 + x_2 + 3x_3 \ge 1 \tag{28}\\
    & 5x_2 + 9x_3 = 13 \tag{29}\\
    & x_1 \in \{0, 1\},\ x_2 \in \{0, 1, 2, 3\},\ x_3 \ge 0, \tag{30}
    \end{align}

    we can use the following GLPK code:

        var x1 binary;
        var x2 integer, >= 0, <= 3;
        var x3 >= 0;
        minimize
        zeta: x1 - 2 * x2 - 3 * x3;
        subject to
        c1: x1 + 7 * x2 - 8 * x3 <= 12;
        c2: -x1 + x2 + 3 * x3 >= 1;
        c3: 5 * x2 + 9 * x3 = 13;
        end;

    The optimal solution is $x_1^* = 0$, $x_2^* = 2$, $x_3^* = 0.33$, and the optimal objective value is $\zeta^* = -5$.
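    Because (26)-(30) has only eight combinations of the integer variables, the result can be cross-checked by enumeration. The following sketch is an illustration, not how GLPK works internally; it uses the equality (29) to fix $x_3$ for each integer choice and assumes the constraint directions $\le$ in (27) and $\ge$ in (28), which match the GLPKMEX data in Section 2.5.

```python
from fractions import Fraction

best = None  # (objective value, x1, x2, x3) of the best feasible point so far
for x1 in (0, 1):                        # x1 is binary
    for x2 in (0, 1, 2, 3):              # x2 is an integer between 0 and 3
        x3 = Fraction(13 - 5 * x2, 9)    # equality (29) determines x3
        if x3 < 0:
            continue                     # violates x3 >= 0
        if x1 + 7 * x2 - 8 * x3 > 12:
            continue                     # violates (27)
        if -x1 + x2 + 3 * x3 < 1:
            continue                     # violates (28)
        value = x1 - 2 * x2 - 3 * x3     # objective (26), to be minimized
        if best is None or value < best[0]:
            best = (value, x1, x2, x3)

print(best)
```

    Exact fractions avoid rounding issues when comparing against the right-hand sides.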

    2.5 GLPKMEX for mixed integer linear program

    GLPKMEX is a MATLAB interface to GLPK, which enables one to use GLPK through MATLAB code. A copy of GLPKMEX can be downloaded from the course web. With MATLAB's current directory being where GLPKMEX is saved, the following MATLAB code can be used to solve the integer program (26)-(30):

        c = [1, -2, -3];
        A = [1, 7, -8; -1, 1, 3; 0, 5, 9];
        b = [12, 1, 13];
        lx = [0, 0, 0];
        ux = [1, 3, inf];
        ctype = ['U', 'L', 'S'];
        vartype = ['B', 'I', 'C'];
        s = 1;
        [x, zeta] = glpk(c, A, b, lx, ux, ctype, vartype, s)

    For more information about GLPKMEX, read the file glpk.m and/or visit its website: http://glpkmex.sourceforge.net.

    3 Modeling

    Mathematical programming is rooted in real-life problems where people always have some objectives to maximize or minimize, but these objectives are kept from going to infinity by various constraints. An important goal of this course is to develop the skills of using mathematical programming (especially linear programming) to formulate, solve, and analyze real-life problems. Several examples are given below.

    3.1 Arbitrage in currency exchange

    In currency markets, arbitrage is trading among different currencies in order to profit from a rate discrepancy. For example, if the exchange rates between dollar and euro are 1 dollar → 0.7 euro and 1 euro → 1.5 dollar, then a profit can be made by trading from dollar to euro and then back to dollar. Obviously, arbitrage opportunities are not supposed to exist, at least in theory. The following are actual trades made on February 14, 2002, with minor modification. How can we use an LP model to identify whether an arbitrage opportunity exists in this example? Notice that an arbitrage may involve more than two currencies, e.g., Dollar → Yen → Pound → Euro → Dollar.

    from \ to    Dollar     Euro       Pound      Yen
    Dollar       1          1.1468     0.7003     133.30
    Euro         0.8716     1          0.6097     116.12
    Pound        1.4279     1.6401     1          190.45
    Yen          0.00750    0.00861    0.00525    1
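    Before building an LP model, the rate table can be explored by brute force. The sketch below is not the LP formulation the exercise asks for; it simply multiplies the rates along every cycle that starts and ends at the dollar and visits each other currency at most once. The dictionary layout and the function name `cycle_gain` are illustrative choices.

```python
from itertools import permutations

# rate[a][b] = units of currency b received for one unit of currency a.
rate = {
    "Dollar": {"Dollar": 1.0, "Euro": 1.1468, "Pound": 0.7003, "Yen": 133.30},
    "Euro": {"Dollar": 0.8716, "Euro": 1.0, "Pound": 0.6097, "Yen": 116.12},
    "Pound": {"Dollar": 1.4279, "Euro": 1.6401, "Pound": 1.0, "Yen": 190.45},
    "Yen": {"Dollar": 0.00750, "Euro": 0.00861, "Pound": 0.00525, "Yen": 1.0},
}


def cycle_gain(cycle):
    """Dollars obtained per dollar invested when trading along the cycle."""
    gain = 1.0
    for src, dst in zip(cycle, cycle[1:]):
        gain *= rate[src][dst]
    return gain


# Every cycle Dollar -> ... -> Dollar through one, two, or three other currencies.
others = ["Euro", "Pound", "Yen"]
cycles = []
for k in range(1, len(others) + 1):
    for middle in permutations(others, k):
        cycles.append(("Dollar",) + middle + ("Dollar",))

best_cycle = max(cycles, key=cycle_gain)
best_gain = cycle_gain(best_cycle)
print(best_cycle, best_gain)
```

    A gain above 1 signals an arbitrage opportunity. Enumeration only works for a handful of currencies, and it ignores how much can be traded at each rate, which is where an LP model becomes useful.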

  • 3.2 Turning junk into treasure

    A company's large machine broke down. Instead of buying a new one, which is very expensive, they find that there are some old and broken machines of the same type in the junk yard, so they decide to reassemble a working machine out of the broken ones. This type of machine has ten different components, connected one after another like a train as illustrated in Figure 4. Components at the same position from different machines are interchangeable but components at different positions are not. A "0" means that the component is broken and a "1" means that the component is still good. They plan to reassemble the good components to make a complete working machine. The numbers between components represent the disassembling and reassembling costs. We assume that these costs are the same for all old machines. For example, they can make a working machine from (a) and (b), which requires three assembling points at (#2,#3), (#4,#5), and (#8,#9), with a total cost of 5 + 11 + 13 = 29.

    Position   (a)   (b)   (c)   (d)   Cost between this position and the next
    #1          1     0     0     1     3
    #2          1     1     0     1     5
    #3          0     1     0     1     4
    #4          0     1     0     1     11
    #5          1     0     1     0     9
    #6          1     1     1     0     7
    #7          1     0     1     0     10
    #8          1     0     1     0     13
    #9          0     1     1     0     6
    #10         0     1     1     0

    Figure 4: Four examples of old machines

    Suppose that there are 20 old machines in their junk yard, and the functioning status of their components is given in matrix $A$, in which each column represents an old machine. The assembling costs are given in vector $C$, with $C_i$ representing the disassembling and reassembling cost between $A_{i,j}$ and $A_{i+1,j}$ for all $j$. Build a linear programming or integer programming model to find the least costly way to make a complete working machine using these old ones. Which old machines should be used to make it? What is the least possible cost?

    \[
    A = \begin{bmatrix}
    0&0&0&1&0&0&0&0&1&1&0&0&0&0&0&0&0&0&1&1\\
    0&0&1&0&0&0&0&0&0&1&0&0&0&0&0&1&0&0&0&0\\
    1&0&0&0&0&1&0&0&1&0&1&0&1&0&0&0&1&0&0&0\\
    0&0&0&0&0&0&0&1&0&0&1&1&1&1&0&0&0&0&0&1\\
    0&1&0&1&0&1&0&1&0&1&0&1&1&1&1&0&0&0&0&1\\
    1&0&0&1&1&0&0&0&0&1&1&1&0&0&0&0&1&0&0&1\\
    0&0&0&0&0&0&1&1&0&0&0&0&0&0&0&0&0&0&0&0\\
    1&1&0&0&0&0&1&1&0&0&1&0&0&0&0&0&0&1&1&0\\
    0&0&1&1&1&0&0&0&0&0&1&0&0&1&0&0&0&1&0&1\\
    0&0&0&0&1&1&0&1&1&0&1&1&0&0&0&0&0&0&0&0
    \end{bmatrix},\quad
    C = \begin{bmatrix} 3\\ 5\\ 4\\ 11\\ 9\\ 7\\ 10\\ 13\\ 6 \end{bmatrix}.
    \]

    3.3 Old McDonald had a farm

    A farmer can grow wheat, corn, and beans on his 500-acre farm. The planting costs are $150/acre, $230/acre, and $260/acre, and expected yields are 2.5 tons/acre, 3 tons/acre, and 20 tons/acre, respectively. He needs 200 tons of wheat and 240 tons of corn to feed his cattle. The farmer can sell wheat and corn to a wholesaler at $170/ton and $150/ton. He can also buy wheat and corn from the wholesaler at $238/ton and $210/ton. Beans sell at $36/ton for the first 6000 tons. Due to economic quotas on bean production, beans in excess of 6000 tons can only be sold at $10/ton. How shall the farmer allocate his 500 acres to maximize his profit? Define

    x_W, x_C, x_B: acres of wheat, corn, and beans planted
    w_W, w_C, w_B: tons of wheat, corn, and beans sold at favorable price
    e_B: tons of beans sold at lower price
    y_W, y_C: tons of wheat and corn purchased,

    then the profit maximization problem can be formulated as:

    \begin{align*}
    \max\quad & \zeta = -150x_W - 230x_C - 260x_B - 238y_W + 170w_W - 210y_C + 150w_C + 36w_B + 10e_B\\
    \text{s.t.}\quad & x_W + x_C + x_B \le 500\\
    & 2.5x_W + y_W - w_W = 200\\
    & 3x_C + y_C - w_C = 240\\
    & 20x_B - w_B - e_B = 0\\
    & w_B \le 6000\\
    & x_W, x_C, x_B, w_W, w_C, w_B, e_B, y_W, y_C \ge 0.
    \end{align*}

    The optimal solution of this problem is $x_W^* = 120$, $x_C^* = 80$, $x_B^* = 300$, $w_W^* = 100$, $w_C^* = 0$, $w_B^* = 6000$, $e_B^* = 0$, $y_W^* = 0$, $y_C^* = 0$, $\zeta^* = \$118{,}600$.

    While this solution makes sense, it assumes that the yields (2.5 tons/acre, 3 tons/acre, and 20 tons/acre) are known. However, yields are greatly dependent on the weather and could increase by 20% in a good year and decrease by 20% in a bad one.

    If we know in advance that next year will be a good year and the yields become 3 tons/acre, 3.6 tons/acre, and 24 tons/acre, then the optimal solution becomes $x_W^* = 183.33$, $x_C^* = 66.67$, $x_B^* = 250$, $w_W^* = 350$, $w_C^* = 0$, $w_B^* = 6000$, $e_B^* = 0$, $y_W^* = 0$, $y_C^* = 0$, $\zeta^* = \$167{,}667$.

    If we know in advance that next year will be a bad year and the yields become 2 tons/acre, 2.4 tons/acre, and 16 tons/acre, then the optimal solution becomes $x_W^* = 100$, $x_C^* = 25$, $x_B^* = 375$, $w_W^* = 0$, $w_C^* = 0$, $w_B^* = 6000$, $e_B^* = 0$, $y_W^* = 0$, $y_C^* = 180$, $\zeta^* = \$59{,}950$.
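    The three reported solutions can be checked against the model by plugging them into the constraints and the profit function. The sketch below is an illustration that verifies feasibility and profit only (it does not prove optimality). It assumes the rounded acreages 183.33 and 66.67 stand for 550/3 and 200/3, and the tuple ordering of the variables is an arbitrary choice.

```python
def profit(s):
    """Profit of a solution s = (xW, xC, xB, wW, wC, wB, eB, yW, yC)."""
    xW, xC, xB, wW, wC, wB, eB, yW, yC = s
    return (-150 * xW - 230 * xC - 260 * xB - 238 * yW + 170 * wW
            - 210 * yC + 150 * wC + 36 * wB + 10 * eB)


def feasible(s, yields, tol=1e-6):
    """Check the farm constraints for yields = (wheat, corn, beans) in tons/acre."""
    xW, xC, xB, wW, wC, wB, eB, yW, yC = s
    yield_w, yield_c, yield_b = yields
    return (
        xW + xC + xB <= 500 + tol                          # land
        and abs(yield_w * xW + yW - wW - 200) <= tol       # wheat balance
        and abs(yield_c * xC + yC - wC - 240) <= tol       # corn balance
        and abs(yield_b * xB - wB - eB) <= tol             # bean balance
        and wB <= 6000 + tol                               # quota on favorable price
        and all(v >= -tol for v in s)                      # nonnegativity
    )


scenarios = {
    "average": ((2.5, 3.0, 20.0), (120, 80, 300, 100, 0, 6000, 0, 0, 0)),
    # Assumption: 183.33 and 66.67 in the notes are 550/3 and 200/3.
    "good": ((3.0, 3.6, 24.0), (550 / 3, 200 / 3, 250, 350, 0, 6000, 0, 0, 0)),
    "bad": ((2.0, 2.4, 16.0), (100, 25, 375, 0, 0, 6000, 0, 0, 180)),
}

results = {
    name: (feasible(sol, yields), round(profit(sol)))
    for name, (yields, sol) in scenarios.items()
}
print(results)
```

    In all three scenarios the bean acreage is chosen so that exactly 6000 tons are produced, which is the quota for the favorable price.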

    However, without knowing the weather of next year in advance, how can we formulate an LP model to find the optimal solution that maximizes the expected profit? Suppose the probabilities that next year will be a good, average, or bad year are 0.35, 0.4, and 0.25, respectively. Notice that the farm allocation decisions $(x_W, x_C, x_B)$ must be made at the beginning of the year without knowing the weather, and $(y_W, w_W, y_C, w_C, w_B, e_B)$ can be made after observing the weather. Therefore we can have three recourse solutions of the latter group: $(y_W^G, w_W^G, y_C^G, w_C^G, w_B^G, e_B^G)$ for a good year, $(y_W^A, w_W^A, y_C^A, w_C^A, w_B^A, e_B^A)$ for an average year, and $(y_W^B, w_W^B, y_C^B, w_C^B, w_B^B, e_B^B)$ for a bad year. The expected profit can then be defined as:

    \begin{align*}
    E[\zeta] = {} & -150x_W - 230x_C - 260x_B\\
    & + 0.35\,(-238y_W^G + 170w_W^G - 210y_C^G + 150w_C^G + 36w_B^G + 10e_B^G)\\
    & + 0.4\,(-238y_W^A + 170w_W^A - 210y_C^A + 150w_C^A + 36w_B^A + 10e_B^A)\\
    & + 0.25\,(-238y_W^B + 170w_W^B - 210y_C^B + 150w_C^B + 36w_B^B + 10e_B^B).
    \end{align*}

  • 3.4 Minimizing a convex piecewise linear objective function

    A function $f : \mathbb{R}^n \mapsto \mathbb{R}$ is called convex if for all $x, y \in \mathbb{R}^n$ and all $\lambda \in [0, 1]$, we have

    \[ f[\lambda x + (1 - \lambda)y] \le \lambda f(x) + (1 - \lambda)f(y). \]

    A function $f : \mathbb{R}^n \mapsto \mathbb{R}$ is called concave if for all $x, y \in \mathbb{R}^n$ and all $\lambda \in [0, 1]$, we have

    \[ f[\lambda x + (1 - \lambda)y] \ge \lambda f(x) + (1 - \lambda)f(y). \]

    Some properties of convex and concave functions:

    1. If $f$ is a convex function, then $-f$ is a concave function.

    2. If $f_1, f_2, \ldots, f_n$ are all convex functions, then $\max\{f_1, f_2, \ldots, f_n\}$ is a convex function.

    3. If $f_1, f_2, \ldots, f_n$ are all concave functions, then $\min\{f_1, f_2, \ldots, f_n\}$ is a concave function.

    It is much easier to find the minimum of a convex function than that of a concave one. Similarly, it is much easier to find the maximum of a concave function than that of a convex one. Consider the following problem:

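    \begin{align}
    \min\quad & f(x_1, x_2) \tag{31}\\
    \text{s.t.}\quad & x_1 + 4x_2 \le 8 \tag{32}\\
    & x_1 - x_2 \le 1 \tag{33}\\
    & x_1, x_2 \ge 0, \tag{34}
    \end{align}

    where the objective function $f(x_1, x_2)$ is a convex piecewise linear function:

    \[ f(x_1, x_2) = \max\{x_1 - 3x_2,\ -4x_1 - x_2,\ -2x_1 + 5x_2\}. \]

    To reformulate problem (31)-(34) as a linear program, we use the fact that $\max\{x_1 - 3x_2,\ -4x_1 - x_2,\ -2x_1 + 5x_2\}$ is equal to the smallest number $z$ that satisfies $z \ge x_1 - 3x_2$, $z \ge -4x_1 - x_2$, and $z \ge -2x_1 + 5x_2$. Therefore, problem (31)-(34) is equivalent to the following linear program:

    \begin{align*}
    \min\quad & z\\
    \text{s.t.}\quad & x_1 + 4x_2 \le 8\\
    & x_1 - x_2 \le 1\\
    & z \ge x_1 - 3x_2\\
    & z \ge -4x_1 - x_2\\
    & z \ge -2x_1 + 5x_2\\
    & x_1, x_2 \ge 0;\ z \text{ free}.
    \end{align*}

    Challenge 1: Can you reformulate the problem if the objective function is redefined as

    \[ f(x_1, x_2) = \min\{x_1 - 3x_2,\ -4x_1 - x_2,\ -2x_1 + 5x_2\}? \]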

  • 3.5 Dealing with binary decision variables

    Consider the following nonlinear nonconvex binary program:

    \begin{align}
    \max\quad & x_1 - 2x_2 + 5x_3 - x_1^2 + 5x_2^2 + 17x_3^2 - 7x_1x_2 - 4x_1x_3 + 11x_2x_3 \tag{35}\\
    \text{s.t.}\quad & x_1 + 4x_2 - 2x_3 \le 4 \tag{36}\\
    & 4x_1x_2 + 3x_3^2 - 2x_1x_3 - x_2 \le 3 \tag{37}\\
    & \{x_1, x_2, x_3\} \ne \{1, 0, 1\} \tag{38}\\
    & x_1, x_2, x_3 \text{ binary}. \tag{39}
    \end{align}

    This problem can be reformulated as an LP with binary decision variables by using the following tricks.

    Replace squared terms $x_1^2$, $x_2^2$, and $x_3^2$ with $x_1$, $x_2$, and $x_3$, respectively.

    For other quadratic terms such as $x_1x_3$, replace each with a new binary variable $x_{13}$ and add two new constraints $2x_{13} \le x_1 + x_3 \le 1 + x_{13}$.

    Replace constraint (38) with a new constraint $(1 - x_1) + x_2 + (1 - x_3) \ge 1$.

    The resulting model is the following LP with binary decision variables:

    \begin{align}
    \max\quad & x_1 - 2x_2 + 5x_3 - x_1 + 5x_2 + 17x_3 - 7x_{12} - 4x_{13} + 11x_{23} \tag{40}\\
    \text{s.t.}\quad & x_1 + 4x_2 - 2x_3 \le 4 \tag{41}\\
    & 4x_{12} + 3x_3 - 2x_{13} - x_2 \le 3 \tag{42}\\
    & 2x_{12} \le x_1 + x_2 \le 1 + x_{12} \tag{43}\\
    & 2x_{13} \le x_1 + x_3 \le 1 + x_{13} \tag{44}\\
    & 2x_{23} \le x_2 + x_3 \le 1 + x_{23} \tag{45}\\
    & (1 - x_1) + x_2 + (1 - x_3) \ge 1 \tag{46}\\
    & x_1, x_2, x_3, x_{12}, x_{13}, x_{23} \text{ binary}. \tag{47}
    \end{align}
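    The three tricks can be verified exhaustively because binary variables take only a few values. The sketch below is an illustration with made-up variable names: it checks that the pair of linear constraints leaves the product as the only allowed value of the new variable, that the cut excludes exactly the point (1, 0, 1), and that squaring a binary variable changes nothing.

```python
from itertools import product

# For each binary pair (x1, x3), list the binary values of x13 that satisfy
# 2*x13 <= x1 + x3 <= 1 + x13.
forced = {}
for x1, x3 in product((0, 1), repeat=2):
    forced[(x1, x3)] = [x13 for x13 in (0, 1) if 2 * x13 <= x1 + x3 <= 1 + x13]

# The constraints linearize the product if the only allowed value is x1*x3.
linearization_ok = all(
    allowed == [x1 * x3] for (x1, x3), allowed in forced.items()
)

# Binary points removed by the cut (1 - x1) + x2 + (1 - x3) >= 1.
excluded = [
    x for x in product((0, 1), repeat=3)
    if (1 - x[0]) + x[1] + (1 - x[2]) < 1
]

# A binary variable equals its own square.
squares_ok = all(x * x == x for x in (0, 1))

print(linearization_ok, excluded, squares_ok)
```

    The same kind of exhaustive check is a convenient way to test any proposed linearization before putting it in a model.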

    Challenge 2: The formulation (40)-(47) can be further simplified because some constraints are redundant. Without actually solving for the optimal solution, can you identify which constraints can be removed while still guaranteeing that the model gives the correct optimal solution?

    Challenge 3: If we have an $x_1x_2x_3$ term in the objective function, how can you linearize this nonlinear term by introducing new variables and constraints?

    3.6 At least k out of m constraints must hold

    The following problem does not require all constraints to hold, but at least $k$ out of $m$ must:

    \begin{align*}
    \max\quad & c^\top x\\
    \text{s.t.}\quad & \left.\begin{array}{l} a_1x \le b_1\\ a_2x \le b_2\\ \quad\vdots\\ a_mx \le b_m \end{array}\right\} \text{at least } k \text{ out of } m \text{ constraints must hold}\\
    & x \ge 0.
    \end{align*}

    We can reformulate this problem as a mixed integer program by introducing binary variables $y_i$ and a constant $M$ whose value is set to be sufficiently large so that the constraint $a_ix \le b_i + M$ can be ignored because it will never be violated:

    \begin{align*}
    \max\quad & c^\top x\\
    \text{s.t.}\quad & a_1x \le b_1 + M(1 - y_1)\\
    & a_2x \le b_2 + M(1 - y_2)\\
    & \qquad\vdots\\
    & a_mx \le b_m + M(1 - y_m)\\
    & y_1 + y_2 + \cdots + y_m \ge k\\
    & x \ge 0,\ y \text{ binary}.
    \end{align*}

    Challenge 4: Can you reformulate the problem if the "at least $k$ out of $m$ constraints must hold" requirement is changed to "exactly $k$ out of $m$ constraints must hold"?

    4 Optimality Conditions

    Different algorithms may vary in details, but most of them contain the same basic steps:

    Step 1: Find a feasible solution to start from.
    Step 2: Check to see if the current solution is optimal.
    Step 3: If the solution is optimal then stop. Otherwise find a better solution and go to Step 2.

    In order for an algorithm to be thorough, it also needs to be able to identify infeasible or unbounded problems.

    We first introduce KKT (Karush-Kuhn-Tucker) conditions, which are widely used for checking the optimality of a solution to mathematical programs. We consider a general form nonlinear program

    \[ \max\{f(x) : G(x) \ge 0_{m\times 1}\}, \tag{48} \]

    where $f(\cdot) : \mathbb{R}^n \mapsto \mathbb{R}$ and $G(\cdot) : \mathbb{R}^n \mapsto \mathbb{R}^m$. The KKT conditions are: there exists a vector $y \in \mathbb{R}^m$ such that the following constraints hold:

    \begin{align}
    & \nabla f(x) + \sum_{i=1}^m y_i \nabla G_i(x) = 0_{n\times 1} \tag{49}\\
    & G_i(x) \ge 0,\quad \forall i = 1, \ldots, m \tag{50}\\
    & y_i \ge 0,\quad \forall i = 1, \ldots, m \tag{51}\\
    & y_iG_i(x) = 0,\quad \forall i = 1, \ldots, m. \tag{52}
    \end{align}

    To interpret KKT conditions from a more intuitive perspective, let's look at the LP (1)-(4) again and use it as an example to see why the KKT conditions (49)-(52) make sense. If we rewrite LP (1)-(4) in the form of (48), then $f(x) = 2x_1 + 3x_2$,

    \[
    G(x) = \begin{bmatrix} G_1(x)\\ G_2(x)\\ G_3(x)\\ G_4(x) \end{bmatrix}
    = \begin{bmatrix} 8 - x_1 - 4x_2\\ 1 - x_1 + x_2\\ x_1\\ x_2 \end{bmatrix},
    \]

    $m = 4$, and $n = 2$.

  • Figure 4: Figure 1 from a different angle

    We look at Figure 1 from a different angle as if the constraints are walls and the objective function points in the direction of gravity, as is shown in Figure 4. Now, what will happen if we put a small ping-pong ball inside the two-dimensional feasible region and let it go? Since this environment is exactly analogous to LP (1)-(4), we can imagine that the ball will end up nowhere but at the optimal solution to the LP.

    If you still remember physics, in order for the ball to stop moving, all forces that apply to it must cancel out. The possible forces are: gravity in the direction of $\nabla f(x)$, and a force from the constraint wall in the direction of $\nabla G_i(x)$ for all $i = 1, \ldots, m$. The magnitude of gravity depends on the mass of the ball, and let's assume it's 1; for $i = 1, \ldots, m$, let the magnitude of the force from wall $i$ be denoted by $y_i$. Now, from the physics perspective, the conditions for the ball to stop are:

    \begin{align}
    \text{All forces cancel out:}\quad & \nabla f(x) + \sum_{i=1}^m y_i \nabla G_i(x) = 0_{n\times 1} \tag{53}\\
    \text{Ball must stay inside the walls:}\quad & G_i(x) \ge 0,\quad \forall i = 1, \ldots, m \tag{54}\\
    \text{Force magnitudes are nonnegative:}\quad & y_i \ge 0,\quad \forall i = 1, \ldots, m \tag{55}\\
    \text{If the ball does not touch a wall, the force from that wall is zero:}\quad & y_iG_i(x) = 0,\quad \forall i = 1, \ldots, m. \tag{56}
    \end{align}

    Notice that conditions (53)-(56) are exactly the same as the KKT conditions (49)-(52), which gives an interpretation of the KKT conditions from the physics perspective.

    KKT conditions (49)-(52) can be written as

    \begin{align}
    & \nabla f(x) + \sum_{i=1}^m y_i \nabla G_i(x) = 0_{n\times 1} \tag{57}\\
    & 0 \le G(x) \perp y \ge 0. \tag{58}
    \end{align}

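  • Here $\perp$ indicates the complementarity of two vectors. If vectors $a \in \mathbb{R}^n$ and $b \in \mathbb{R}^n$ are of the same dimension, then $0 \le a \perp b \ge 0$ means that: (i) $a \ge 0$, (ii) $b \ge 0$, and (iii) $a^\top b = 0$. Two vectors that satisfy condition (iii) are said to be complementary or perpendicular to each other.

    In general, for nonlinear programs, KKT conditions may be neither necessary nor sufficient optimality conditions. For example, solution $(x_1 = 1, x_2 = 0)$ does not satisfy KKT conditions, but it is optimal to

    \begin{align*}
    \max\quad & x_1\\
    \text{s.t.}\quad & x_2 + (x_1 - 1)^3 \le 0\\
    & x_2 \ge 0.
    \end{align*}

    Solution $(x_1 = 0.5, x_2 = 0.25)$ satisfies KKT conditions, but it is not optimal to

    \begin{align*}
    \min\quad & x_2\\
    \text{s.t.}\quad & x_1^2 - x_1 + x_2 \ge 0\\
    & 0 \le x_1 \le 2.
    \end{align*}

    For linear programs, KKT conditions are both necessary and sufficient conditions of optimality, which means that a solution to an LP is optimal if and only if it satisfies KKT conditions. We now derive the KKT conditions for the standard form linear program

    \[ \max\{c^\top x : Ax \le b,\ x \ge 0\}, \tag{59} \]

    which can be written in the form of (48) as

    \[
    \max\left\{f(x) = c^\top x : G(x) = \begin{bmatrix} -A_{m\times n}\\ I_{n\times n} \end{bmatrix} x + \begin{bmatrix} b_{m\times 1}\\ 0_{n\times 1} \end{bmatrix} \ge 0_{(m+n)\times 1}\right\}.
    \]

    The KKT conditions for a standard form linear program then become: there exist two vectors $y \in \mathbb{R}^m$ and $\mu \in \mathbb{R}^n$, corresponding to $Ax \le b$ and $x \ge 0$, respectively, such that the following constraints are satisfied:

    \begin{align}
    & c + \begin{bmatrix} -A^\top & I \end{bmatrix} \begin{bmatrix} y\\ \mu \end{bmatrix} = 0 \tag{60}\\
    & 0 \le \begin{bmatrix} b - Ax\\ x \end{bmatrix} \perp \begin{bmatrix} y\\ \mu \end{bmatrix} \ge 0. \tag{61}
    \end{align}

    Equation (60) is equivalent to $c - A^\top y + \mu = 0$ or $\mu = A^\top y - c$, which we can substitute into (61). Now, (60)-(61) are simplified to:

    \[ 0 \le \begin{bmatrix} b - Ax\\ x \end{bmatrix} \perp \begin{bmatrix} y\\ A^\top y - c \end{bmatrix} \ge 0. \tag{62} \]

    Condition (62) is necessary and sufficient for the optimality of the standard form LP (59), thus it is also called the optimality condition.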

    As an exercise, let's prove the optimality of the solution $(x_1^* = 2.4, x_2^* = 1.4)$ to LP (1)-(4) using KKT conditions. We plug in the numbers of $x^*$, $f(x)$ and $G(x)$; then KKT conditions (62) become: there exists a vector $y$ such that the following constraints hold:

    \[
    \begin{bmatrix} 0\\ 0\\ 0\\ 0 \end{bmatrix} \le
    \begin{bmatrix} 8 - (2.4 + 4 \times 1.4)\\ 1 - (2.4 - 1.4)\\ 2.4\\ 1.4 \end{bmatrix} \perp
    \begin{bmatrix} y_1\\ y_2\\ y_1 + y_2 - 2\\ 4y_1 - y_2 - 3 \end{bmatrix} \ge
    \begin{bmatrix} 0\\ 0\\ 0\\ 0 \end{bmatrix}.
    \]

    It is not hard to find that $y = [1\ \ 1]^\top$ satisfies the above condition, which confirms the optimality of $(x_1^* = 2.4, x_2^* = 1.4)$ to LP (1)-(4).
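    The same check can be done numerically. The sketch below is an illustration in plain Python (no solver): it builds the two vectors of condition (62) for LP (1)-(4) at $x = (2.4, 1.4)$ and $y = (1, 1)$ and tests nonnegativity and complementarity up to a small floating-point tolerance. The variable names are illustrative.

```python
# Data of LP (1)-(4) in standard form max{c'x : Ax <= b, x >= 0}.
A = [[1.0, 4.0], [1.0, -1.0]]
b = [8.0, 1.0]
c = [2.0, 3.0]

x = [2.4, 1.4]   # candidate primal solution
y = [1.0, 1.0]   # candidate multipliers for Ax <= b

# b - Ax: slack of each constraint.
slack = [b[i] - sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
# A'y - c: multipliers associated with x >= 0.
reduced = [sum(A[i][j] * y[i] for i in range(2)) - c[j] for j in range(2)]

left = slack + x        # the vector [b - Ax; x]
right = y + reduced     # the vector [y; A'y - c]

tol = 1e-9
nonnegative = all(v >= -tol for v in left + right)
complementary = abs(sum(l * r for l, r in zip(left, right))) <= tol
kkt_holds = nonnegative and complementary

# At a KKT point of an LP the two objective values c'x and b'y coincide.
primal_value = sum(ci * xi for ci, xi in zip(c, x))
dual_value = sum(bi * yi for bi, yi in zip(b, y))
print(kkt_holds, primal_value, dual_value)
```

    Both constraints are tight at this point, so both multipliers are allowed to be positive.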

    As another exercise, let's find the optimality condition of a non-standard form LP:

    \[ \min\{b^\top \lambda : A^\top \lambda \ge c,\ \lambda \ge 0\}. \tag{63} \]

    LP (63) can be equivalently rewritten in the standard form as follows:

    \[ \max\{-b^\top \lambda : -A^\top \lambda \le -c,\ \lambda \ge 0\}. \tag{64} \]

    Let's relate $(\lambda, -A^\top, -c, -b)$ in (64) to $(x, A, b, c)$ in (59), respectively. We also introduce a new variable $\nu$ to relate to $y$ in (62). Then the optimality condition for (63) and (64) is:

    \[ 0 \le \begin{bmatrix} -c + A^\top \lambda\\ \lambda \end{bmatrix} \perp \begin{bmatrix} \nu\\ -A\nu + b \end{bmatrix} \ge 0. \tag{65} \]

    It is interesting to notice that optimality conditions (62) and (65) are equivalent to each other in the sense that if $(x, y)$ is a feasible solution to (62) then $(\lambda = y, \nu = x)$ is also a feasible solution to (65), and vice versa. For this reason, we can say that LPs (59) and (63) actually share the same optimality condition if we simply change the notation $\lambda$ to $y$ in (63):

    \[ \min\{b^\top y : A^\top y \ge c,\ y \ge 0\}. \tag{66} \]

    Once we find a pair of feasible solutions $(x^*, y^*)$ to the optimality condition (62), then $x^*$ and $y^*$ are the optimal solutions to LPs (59) and (66), respectively.

    In the optimality condition (62):

    \[ 0 \le \begin{bmatrix} b - Ax\\ x \end{bmatrix} \perp \begin{bmatrix} y\\ A^\top y - c \end{bmatrix} \ge 0, \]

    the left hand side is simply saying that $x$ must be feasible to (59), and the right hand side is saying that $y$ must be feasible to (66). These two problems both have many feasible solutions, but what makes the optimal solutions $(x^*, y^*)$ optimal is the fact that they also satisfy the complementarity conditions: $0 \le b - Ax \perp y \ge 0$ and $0 \le x \perp A^\top y - c \ge 0$.

    5 Duality

    We are ready to introduce perhaps the most important concept in linear programming: duality. Two linear programs, like (59) and (66), that share the same optimality condition in a complementary manner are called dual to each other: the decision variable $y$ in (66) is the magnitude of force of the constraint wall for (59), and vice versa. Either one of the LPs can be called the primal problem and the other is the dual problem. Decision variables of the primal (dual) problem are called primal (dual) variables.

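    Every linear program has a dual problem, which can be found by reformulating the primal problem in the same form as either (59) or (66), and then simplifying its corresponding dual problem. For example, to find the dual problem to

    \[ \min\{c^\top x : Ax \ge b\}, \tag{67} \]

    we first introduce two non-negative variables $x^+$ and $x^-$ to replace the free variable $x$ with $x = x^+ - x^-$, and then rewrite (67) in the same form as (66):

    \[
    \min\left\{\begin{bmatrix} c^\top & -c^\top \end{bmatrix} \begin{bmatrix} x^+\\ x^- \end{bmatrix} : \begin{bmatrix} A & -A \end{bmatrix} \begin{bmatrix} x^+\\ x^- \end{bmatrix} \ge b,\ \begin{bmatrix} x^+\\ x^- \end{bmatrix} \ge 0\right\}, \tag{68}
    \]

    whose dual problem is clearly

    \[
    \max\left\{b^\top y : \begin{bmatrix} A^\top\\ -A^\top \end{bmatrix} y \le \begin{bmatrix} c\\ -c \end{bmatrix},\ y \ge 0\right\}, \tag{69}
    \]

    which can be simplified as

    \[ \max\left\{b^\top y : A^\top y = c,\ y \ge 0\right\}. \tag{70} \]

    Now, we have found that the dual to (67) is (70).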

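    The following table summarizes the primal-dual relations for general form LPs:

    \[
    \begin{array}{l|l|l|l}
    \max & c_1x_1 + c_2x_2 + c_3x_3 & b_1y_1 + b_2y_2 + b_3y_3 & \min\\ \hline
    \text{constraints} & a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3 \le b_1 & y_1 \ge 0 & \text{variables}\\
     & a_{2,1}x_1 + a_{2,2}x_2 + a_{2,3}x_3 = b_2 & y_2 \text{ free} & \\
     & a_{3,1}x_1 + a_{3,2}x_2 + a_{3,3}x_3 \ge b_3 & y_3 \le 0 & \\ \hline
    \text{variables} & x_1 \ge 0 & a_{1,1}y_1 + a_{2,1}y_2 + a_{3,1}y_3 \ge c_1 & \text{constraints}\\
     & x_2 \le 0 & a_{1,2}y_1 + a_{2,2}y_2 + a_{3,2}y_3 \le c_2 & \\
     & x_3 \text{ free} & a_{1,3}y_1 + a_{2,3}y_2 + a_{3,3}y_3 = c_3 &
    \end{array}
    \]

    We can use this table to find the dual of a general form LP directly. For example,

    \begin{align*}
    \text{(P21):}\quad \min\quad & x_1 + 2x_2 + 3x_3\\
    \text{s.t.}\quad & x_1 - 3x_2 = 5\\
    & 2x_1 - x_2 + 3x_3 \ge 6\\
    & x_3 \le 4\\
    & x_1 \ge 0\\
    & x_2 \le 0\\
    & x_3 \text{ free}.
    \end{align*}

    \begin{align*}
    \text{(D21):}\quad \max\quad & 5y_1 + 6y_2 + 4y_3\\
    \text{s.t.}\quad & y_1 \text{ free}\\
    & y_2 \ge 0\\
    & y_3 \le 0\\
    & y_1 + 2y_2 \le 1\\
    & -3y_1 - y_2 \ge 2\\
    & 3y_2 + y_3 = 3.
    \end{align*}

    This table can also be used to check the optimality of a solution to a non-standard form LP. If (i) $x$ satisfies the constraints of the primal problem, (ii) there exists a $y$ that satisfies the constraints of the dual, and (iii) the corresponding constraints and variables are complementary to each other, then $(x, y)$ is optimal. For example, we can check the optimality of $(x_1^* = 5, x_2^* = 0, x_3^* = -4/3)$ and $(y_1^* = -1, y_2^* = 1, y_3^* = 0)$ by observing that (i) $x^*$ is primal feasible, (ii) $y^*$ is dual feasible, and (iii) $(x_1^* - 3x_2^* - 5)y_1^* = (2x_1^* - x_2^* + 3x_3^* - 6)y_2^* = (x_3^* - 4)y_3^* = x_1^*(y_1^* + 2y_2^* - 1) = x_2^*(-3y_1^* - y_2^* - 2) = x_3^*(3y_2^* + y_3^* - 3) = 0$.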
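    The three checks for (P21) and (D21) are easy to carry out mechanically. The sketch below is an illustration using exact fractions; it assumes the signs $x_3^* = -4/3$ and $y_1^* = -1$, which are the values consistent with the constraints of (P21) and (D21).

```python
from fractions import Fraction as F

# Candidate solutions (signs assumed as stated in the lead-in).
x1, x2, x3 = F(5), F(0), F(-4, 3)
y1, y2, y3 = F(-1), F(1), F(0)

# (i) Primal feasibility for (P21).
primal_feasible = (
    x1 - 3 * x2 == 5
    and 2 * x1 - x2 + 3 * x3 >= 6
    and x3 <= 4
    and x1 >= 0
    and x2 <= 0
)

# (ii) Dual feasibility for (D21); y1 is free.
dual_feasible = (
    y2 >= 0
    and y3 <= 0
    and y1 + 2 * y2 <= 1
    and -3 * y1 - y2 >= 2
    and 3 * y2 + y3 == 3
)

# (iii) Complementarity: each constraint slack times its paired variable is zero.
products = [
    (x1 - 3 * x2 - 5) * y1,
    (2 * x1 - x2 + 3 * x3 - 6) * y2,
    (x3 - 4) * y3,
    x1 * (y1 + 2 * y2 - 1),
    x2 * (-3 * y1 - y2 - 2),
    x3 * (3 * y2 + y3 - 3),
]
complementary = all(p == 0 for p in products)

primal_value = x1 + 2 * x2 + 3 * x3
dual_value = 5 * y1 + 6 * y2 + 4 * y3
print(primal_feasible, dual_feasible, complementary, primal_value, dual_value)
```

    Equal primal and dual objective values are a useful extra confirmation, in line with the duality theorems that follow.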

    The following are some important theorems on duality.

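    Complementary Slackness: Solutions $x$ and $y$ are optimal to $\max\{c^\top x : Ax \le b, x \ge 0\}$ and $\min\{b^\top y : A^\top y \ge c, y \ge 0\}$, respectively, if and only if condition (62) is met:

    $$0 \le \begin{bmatrix} b - Ax \\ x \end{bmatrix} \perp \begin{bmatrix} y \\ A^\top y - c \end{bmatrix} \ge 0.$$

    Weak Duality: If $x$ and $y$ are feasible solutions to $\max\{c^\top x : Ax \le b, x \ge 0\}$ and $\min\{b^\top y : A^\top y \ge c, y \ge 0\}$, respectively, then $c^\top x \le b^\top y$.

    Proof. Since $x$ and $y$ are feasible, we have

    $$0 \le (b - Ax)^\top y = b^\top y - (Ax)^\top y = b^\top y - y^\top Ax,$$

    and

    $$0 \le (A^\top y - c)^\top x = y^\top Ax - c^\top x.$$

    Therefore, $c^\top x \le y^\top Ax \le b^\top y$.

    Strong Duality: If $x$ and $y$ are optimal solutions to $\max\{c^\top x : Ax \le b, x \ge 0\}$ and $\min\{b^\top y : A^\top y \ge c, y \ge 0\}$, respectively, then $c^\top x = b^\top y$.

    Proof. Since $x$ and $y$ are optimal, by complementary slackness, we have

    $$0 = (b - Ax)^\top y = b^\top y - (Ax)^\top y = b^\top y - y^\top Ax,$$

    and

    $$0 = (A^\top y - c)^\top x = y^\top Ax - c^\top x.$$

    Therefore, $c^\top x = y^\top Ax = b^\top y$.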
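    The chain $c^\top x \le y^\top Ax \le b^\top y$ can be observed numerically. The sketch below uses LP (1)-(4) from Section 1, assuming it reads max 2x1 + 3x2 subject to x1 + 4x2 <= 8, x1 - x2 <= 1, x >= 0, so that its dual is min 8y1 + y2 subject to y1 + y2 >= 2, 4y1 - y2 >= 3, y >= 0. The helper names and the particular feasible pair are illustrative choices, not part of the notes.

```python
from fractions import Fraction as F

A = [[F(1), F(4)], [F(1), F(-1)]]
b = [F(8), F(1)]
c = [F(2), F(3)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def primal_feasible(x):
    return all(xi >= 0 for xi in x) and all(dot(row, x) <= bi for row, bi in zip(A, b))

def dual_feasible(y):
    columns = list(zip(*A))
    return all(yi >= 0 for yi in y) and all(dot(col, y) >= cj for col, cj in zip(columns, c))

def duality_chain(x, y):
    """Return the three quantities (c'x, y'Ax, b'y)."""
    Ax = [dot(row, x) for row in A]
    return dot(c, x), dot(y, Ax), dot(b, y)

# A feasible but non-optimal pair: the chain holds with strict inequalities.
x_feas, y_feas = [F(1), F(1)], [F(2), F(1)]
chain_feas = duality_chain(x_feas, y_feas)

# The optimal pair: all three quantities coincide (strong duality).
x_opt, y_opt = [F(12, 5), F(7, 5)], [F(1), F(1)]
chain_opt = duality_chain(x_opt, y_opt)

print([str(v) for v in chain_feas], [str(v) for v in chain_opt])
```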

    Primal-Dual Possibility Table: Recall that a linear program has three possibilities: finitely optimal, infeasible, or unbounded. The primal-dual pair has nine combinations, but only four of them are possible.

    Primal \ Dual        Finitely optimal    Unbounded     Infeasible
    Finitely optimal     Possible            Impossible    Impossible
    Unbounded            Impossible          Impossible    Possible
    Infeasible           Impossible          Possible      Possible

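    Two examples in which one problem of the primal-dual pair is unbounded and the other is infeasible (primal unbounded and dual infeasible, or, with the roles exchanged, primal infeasible and dual unbounded):

    $$
    \begin{array}{rl}
    \max & x_1 + x_2 \\
    \text{s. t.} & x_1 \le 1 \\
     & x_1, x_2 \ge 0.
    \end{array}
    \qquad
    \begin{array}{rl}
    \min & y \\
    \text{s. t.} & y \ge 1 \\
     & 0y \ge 1 \\
     & y \ge 0.
    \end{array}
    $$

    $$
    \begin{array}{rl}
    \max & 3x_1 + 4x_2 \\
    \text{s. t.} & x_1 + x_2 \le 1 \\
     & 2x_1 - x_2 \le -3 \\
     & x_1, x_2 \ge 0.
    \end{array}
    \qquad
    \begin{array}{rl}
    \min & y_1 - 3y_2 \\
    \text{s. t.} & y_1 + 2y_2 \ge 3 \\
     & y_1 - y_2 \ge 4 \\
     & y_1, y_2 \ge 0.
    \end{array}
    $$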

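    Two examples in which both the primal and the dual are infeasible:

    $$
    \begin{array}{rl}
    \min & x_1 + 2x_2 \\
    \text{s. t.} & x_1 + x_2 = 1 \\
     & x_1 + x_2 = 2.
    \end{array}
    \qquad
    \begin{array}{rl}
    \max & y_1 + 2y_2 \\
    \text{s. t.} & y_1 + y_2 = 1 \\
     & y_1 + y_2 = 2.
    \end{array}
    $$

    $$
    \begin{array}{rl}
    \max & 2x_1 - x_2 \\
    \text{s. t.} & x_1 - x_2 \le 1 \\
     & -x_1 + x_2 \le -2 \\
     & x_1, x_2 \ge 0.
    \end{array}
    \qquad
    \begin{array}{rl}
    \min & y_1 - 2y_2 \\
    \text{s. t.} & y_1 - y_2 \ge 2 \\
     & -y_1 + y_2 \ge -1 \\
     & y_1, y_2 \ge 0.
    \end{array}
    $$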

    For an LP to be unbounded, there must exist two things: (i) a feasible solution $x^0$, satisfying $Ax^0 \le b$, $x^0 \ge 0$, and (ii) a direction $\Delta x$, which leads the objective value $c^\top(x^0 + \lambda\Delta x)$ towards infinity without violating any constraints as the step size $\lambda$ approaches infinity. This direction is called an extreme ray. If $\Delta x$ satisfies $c^\top\Delta x > 0$, $A\Delta x \le 0$, $\Delta x \ge 0$, then $\Delta x$ is called an extreme ray to the LP $\max_x\{c^\top x : Ax \le b, x \ge 0\}$. In the dual space, if $\Delta y$ satisfies $b^\top\Delta y < 0$, $A^\top\Delta y \ge 0$, $\Delta y \ge 0$, then $\Delta y$ is called an extreme ray to the LP $\min_y\{b^\top y : A^\top y \ge c, y \ge 0\}$.

    Farkas' Lemma: Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^{m \times 1}$ be a matrix and a vector, respectively. Then exactly one of the following two alternatives holds:
    (a) There exists some $x \ge 0$ such that $Ax \le b$.
    (b) There exists some $y \ge 0$ such that $A^\top y \ge 0$ and $b^\top y < 0$.

    Proof. (a) true $\Rightarrow$ (b) false: If (a) is true, then for any $y \ge 0$ such that $A^\top y \ge 0$, we have $b^\top y \ge (Ax)^\top y = x^\top A^\top y \ge 0$, which means that (b) is false.

    (a) false $\Rightarrow$ (b) true: Consider $\max\{0 : Ax \le b, x \ge 0\}$ and its dual $\min\{b^\top y : A^\top y \ge 0, y \ge 0\}$. If (a) is false, then $\max\{0 : Ax \le b, x \ge 0\}$ is infeasible. According to the Primal-Dual Possibility Table, its dual is either unbounded or infeasible. It is easy to see that $y = 0$ is a feasible solution to the dual, so it must be unbounded, which means that there must exist some $y \ge 0$ such that $A^\top y \ge 0$ and $b^\top y < 0$. Therefore, (b) is true.
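    Alternative (b) is easy to verify once a candidate $y$ is in hand. The sketch below checks such a certificate for the system $x_1 - x_2 \le 1$, $-x_1 + x_2 \le -2$, $x \ge 0$ (an infeasible system chosen here for illustration); the function name `is_farkas_certificate` is hypothetical.

```python
def is_farkas_certificate(A, b, y):
    """Check alternative (b): y >= 0, A'y >= 0, and b'y < 0."""
    columns = list(zip(*A))
    y_nonnegative = all(yi >= 0 for yi in y)
    aty_nonnegative = all(sum(a * yi for a, yi in zip(col, y)) >= 0 for col in columns)
    bty_negative = sum(bi * yi for bi, yi in zip(b, y)) < 0
    return y_nonnegative and aty_nonnegative and bty_negative

# Infeasible system: x1 - x2 <= 1 and -x1 + x2 <= -2 with x >= 0.
A_infeasible = [[1, -1], [-1, 1]]
b_infeasible = [1, -2]

# y = (1, 1) adds the two rows: 0 <= -1, which proves infeasibility.
print(is_farkas_certificate(A_infeasible, b_infeasible, [1, 1]))

# A feasible system (x = 0 works), so no certificate can exist.
A_feasible = [[1, 4], [1, -1]]
b_feasible = [8, 1]
print(is_farkas_certificate(A_feasible, b_feasible, [1, 1]))
```

    Scaling a valid certificate by any positive factor keeps it valid and drives $b^\top y$ to $-\infty$, which is exactly the unboundedness of the dual used in the proof.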

    A Variation of Farkas' Lemma: Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^{m \times 1}$ be a matrix and a vector, respectively. Then exactly one of the following two alternatives holds:
    (a) There exists some $x \ge 0$ such that $Ax = b$.
    (b) There exists some $y$ such that $A^\top y \ge 0$ and $b^\top y < 0$.

    Proof. (a) true $\Rightarrow$ (b) false: If (a) is true, then for any $y$ such that $A^\top y \ge 0$, we have $b^\top y = (Ax)^\top y = x^\top A^\top y \ge 0$, which means that (b) is false.

    (a) false $\Rightarrow$ (b) true: Consider $\max\{0 : Ax = b, x \ge 0\}$ and its dual $\min\{b^\top y : A^\top y \ge 0\}$. If (a) is false, then $\max\{0 : Ax = b, x \ge 0\}$ is infeasible. According to the Primal-Dual Possibility Table, its dual is either unbounded or infeasible. It is easy to see that $y = 0$ is a feasible solution to the dual, so it must be unbounded, which means that there must exist some $y$ such that $A^\top y \ge 0$ and $b^\top y < 0$. Therefore, (b) is true.

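    Extended Primal-Dual Possibility Table: Farkas' Lemma enables us to define the primal-dual possibility table in a more revealing manner.

    We use $x^0$ or $y^0$ to denote a feasible solution and $\Delta x$ or $\Delta y$ to denote an extreme ray, writing $\exists$ when it exists and $\nexists$ when it does not. Farkas' Lemma basically says that $\exists x^0 \Leftrightarrow \nexists\Delta y$, $\exists y^0 \Leftrightarrow \nexists\Delta x$, $\nexists x^0 \Leftrightarrow \exists\Delta y$, and $\nexists y^0 \Leftrightarrow \exists\Delta x$. Therefore, we have the following extended primal-dual possibility table:

    $$
    \begin{array}{l|c|c|l}
    \text{primal} & & & \text{dual} \\ \hline
    \text{optimal} & \exists x^0 + \nexists\Delta x & \nexists\Delta y + \exists y^0 & \text{optimal} \\
    \text{unbounded} & \exists x^0 + \exists\Delta x & \nexists\Delta y + \nexists y^0 & \text{infeasible} \\
    \text{infeasible} & \nexists x^0 + \exists\Delta x & \exists\Delta y + \nexists y^0 & \text{infeasible} \\
    \text{infeasible} & \nexists x^0 + \nexists\Delta x & \exists\Delta y + \exists y^0 & \text{unbounded}
    \end{array}
    $$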

    6 Simplex Algorithm

    We now know what condition a solution needs to satisfy to be optimal, but how do we come up with such an optimal solution? In this section, we will learn about an algorithm called simplex that finds an optimal solution if the LP has one, or determines that the LP is infeasible or unbounded if that is the case. Simplex is perhaps the most important linear programming algorithm, and we are going to learn about it in great detail.

    The basic idea of simplex is based on the observation that the optimal solution to an LP, if it exists, occurs at a corner of the feasible region. This can be verified with the LP examples we have seen in the lecture notes or homework examples. Based on this observation, we can find the optimal solution by (i) starting from a feasible corner point, and (ii) moving to a better corner point until the current one is already optimal. If we cannot find a starting point, then the LP is infeasible; if we can improve the objective value towards infinity, then the LP is unbounded.

    While the idea may sound simple and intuitive, we need to rigorously establish its theoretical correctness.

    6.1 What exactly is a corner point?

    Corner point is a nickname of a well-known concept called basic solution. So let's use basic solution instead of corner point from now on. A solution $x^0$ is a basic solution if it is uniquely determined by its active constraints at equality. An inequality constraint $(Ax)_i \le b_i$ is active at $x^0$ if it holds at equality: $(Ax^0)_i = b_i$. An equality constraint $(Ax)_i = b_i$ is also considered active as long as the equality holds.

    To explain the definition of a basic solution, let us suppose the feasible region is defined by $Ax \le b$, which already includes any non-negativity constraints $x \ge 0$. For any solution $x^0$, let $I(x^0)$ be the set of its active constraints: $I(x^0) = \{i : (Ax^0)_i = b_i\}$. By definition, $x^0$ satisfies these constraints at equality: $(Ax)_{I(x^0)} = b_{I(x^0)}$. If $x^0$ is the only solution determined by this equation, then it is a basic solution. However, if there exists another solution $x^1$ that also satisfies $(Ax^1)_{I(x^0)} = b_{I(x^0)}$, then $x^0$ is not a basic solution.

    Recall from linear algebra that a necessary condition for a linear system of equations $Ax = b$ to have a unique solution $x \in \mathbb{R}^n$ is that there exist $n$ linearly independent rows in matrix $A$. Given a finite number of vectors $V_1, V_2, ..., V_K \in \mathbb{R}^n$, we say that they are linearly dependent if there exist real numbers $a_1, a_2, ..., a_K$ such that $\sum_{i=1}^K |a_i| > 0$ and $\sum_{i=1}^K a_iV_i = 0$. Otherwise, they are called linearly independent.

    As an example, we look at LP (1)-(4) and Figure 1 again:

    $$
    \begin{array}{lrl}
    (1): & \max & 2x_1 + 3x_2 \\
    (2): & \text{s. t.} & x_1 + 4x_2 \le 8 \\
    (3): & & x_1 - x_2 \le 1 \\
    (4): & & x_1, x_2 \ge 0.
    \end{array}
    $$

    We know intuitively that $(x_1 = 1, x_2 = 0)$ is a basic solution (because it is a corner point). Now let's check with the definition. There are two active constraints at this point: $x_1 - x_2 \le 1$ and $x_2 \ge 0$, and $(x_1 = 1, x_2 = 0)$ is uniquely determined by these two constraints at equality; therefore, it is a basic solution. On the other hand, $(x_1 = 2, x_2 = 1.5)$ is not a basic solution, because there is only one active constraint at that point: $x_1 + 4x_2 \le 8$, and $(x_1 = 2, x_2 = 1.5)$ is obviously not the only solution determined by that constraint at equality.

    Also notice that a basic solution is not required by definition to be feasible. Point $(x_1 = 8, x_2 = 0)$ is not a feasible solution, but it is a basic solution because it is uniquely determined by two active constraints at equality: $x_1 + 4x_2 \le 8$ and $x_2 \ge 0$.
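    The definition can be turned into a small test: collect the active constraints at a point and check whether they contain as many linearly independent rows as there are variables. The sketch below does this for LP (1)-(4), assuming the constraints x1 + 4x2 <= 8, x1 - x2 <= 1, x1 >= 0, x2 >= 0 are stacked as $Gx \le h$; the names `G`, `h`, `rank`, and `classify` are illustrative.

```python
from fractions import Fraction as F

# LP (1)-(4) written as G x <= h, including -x1 <= 0 and -x2 <= 0.
G = [[1, 4], [1, -1], [-1, 0], [0, -1]]
h = [8, 1, 0, 0]

def rank(rows):
    """Rank of a small matrix by Gaussian elimination over the rationals."""
    m = [[F(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * p for a, p in zip(m[i], m[r])]
        r += 1
    return r

def classify(x):
    """Return (is_basic, is_feasible) for the point x."""
    x = [F(v) for v in x]
    lhs = [sum(g * xi for g, xi in zip(row, x)) for row in G]
    active = [row for row, value, hi in zip(G, lhs, h) if value == hi]
    is_basic = rank(active) == len(x)
    is_feasible = all(value <= hi for value, hi in zip(lhs, h))
    return is_basic, is_feasible

print(classify([1, 0]))          # corner point of the feasible region
print(classify([2, F(3, 2)]))    # on one edge only
print(classify([8, 0]))          # basic but outside the feasible region
```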

    6.2 Is the optimal solution always a basic solution?

    Unfortunately, the answer is no. First, some LPs may not even have a basic solution. An example is $\max\{0 : x_1, x_2 \text{ free}\}$, where we are maximizing the constant 0 over the entire two-dimensional space with no constraints, and there is no basic solution. Secondly, even if an LP has basic solutions, there may also exist an optimal solution that is not a basic solution. For example, $(x_1 = 1, x_2 = 0)$ is an optimal solution to $\min\{x_2 : 0 \le x_1 \le 2, x_2 \ge 0\}$, but it is not a basic solution.

    With that being said, there is some good news that still validates the idea of simplex: (i) If we write an LP in the standard form $\max\{c^\top x : Ax \le b, x \ge 0\}$, there always exists a basic solution. As a matter of fact, the origin ($x = 0$) is uniquely determined by the active constraints $x \ge 0$ at equality, thus it is a basic solution. (ii) Suppose an LP has at least one basic solution and one optimal solution. It can be proved that: if the optimal solution is unique, then it must be a basic solution; if there are infinitely many optimal solutions, then there exists one that is a basic solution.

    6.3 How to find basic solutions?

    In the simplex context, it is oftentimes more convenient to consider a non-standard form LP

    $$\max\{\zeta = c^\top x : Ax + w = b,\ x \ge 0,\ w \ge 0\}, \qquad (71)$$

    which is equivalently reformulated from the standard form LP $\max\{c^\top x : Ax \le b, x \ge 0\}$ by simply introducing a new variable $w \in \mathbb{R}^m$, called slack variable, to turn the inequality constraint $Ax \le b$ into the equality constraint $Ax + w = b$.

    We now write (71) in the following matrix form:

    $$
    \begin{array}{rll}
    \max & \zeta = \bar{c}^\top\bar{x} & (72) \\
    \text{s. t.} & \bar{A}\bar{x} = b & (73) \\
     & \bar{x} \ge 0, & (74)
    \end{array}
    $$

    where $\bar{c} = \begin{bmatrix} c \\ 0_{m \times 1} \end{bmatrix}$, $\bar{A} = \begin{bmatrix} A & I_{m \times m} \end{bmatrix}$, and $\bar{x} = \begin{bmatrix} x \\ w \end{bmatrix} \in \mathbb{R}^{n+m}$.

    Since the dimension of $\bar{x}$ is $(n+m) \times 1$, we need $n+m$ linearly independent active constraints to uniquely determine a basic solution. We already have $m$ rows in $\bar{A}\bar{x} = b$, so we need at least $n$ constraints in $\bar{x} \ge 0$ to hold at equality.

    Define $N$ as the indices of constraints in $\bar{x} \ge 0$ that are set to hold at equality, and $B$ as the indices of the other constraints in $\bar{x} \ge 0$. Such a pair $(B, N)$ is an exclusive and exhaustive partition of the set $\{1, 2, ..., n+m\}$. For example, if we set $\bar{x}_i = 0, \forall i = 1, ..., n$ and $\bar{x}_i \ge 0, \forall i = n+1, ..., n+m$, then the partition is $(B = \{n+1, ..., n+m\}, N = \{1, ..., n\})$.

    The above conditions are only necessary for a basic solution. To find the sufficient condition for a basic solution, we rewrite (72)-(74) using the definition of $N$ and $B$:

    $$
    \begin{array}{rll}
    \max & \zeta = c_B^\top x_B + c_N^\top x_N & (75) \\
    \text{s. t.} & A_Bx_B + A_Nx_N = b & (76) \\
     & x_B \ge 0,\ x_N \ge 0, & (77)
    \end{array}
    $$

    where $A_B$ is the collection of columns in $\bar{A}$ whose indices are in the set $B$, and $x_N$ and $c_N$ are the collections of elements in $\bar{x}$ and $\bar{c}$, respectively, whose indices are in the set $N$. Here, the value for $x_N = 0$ is uniquely determined. To make sure that the value for $x_B$ is also uniquely determined, we should guarantee that the equation $A_Bx_B + A_Nx_N = A_Bx_B = b$ has a unique solution, which requires that the matrix $A_B \in \mathbb{R}^{m \times m}$ be invertible. This can be achieved by choosing $m$ linearly independent columns in $\bar{A}$ as the set $B$.

    Now we have the necessary and sufficient conditions for a basic partition:

    The $m$ elements in the set $B$ should be chosen such that $A_B$ is invertible, and (78)
    The $n$ elements in the set $N$ are then determined by $N = \{1, ..., n+m\} \setminus B$. (79)

    This partition uniquely determines a basic solution $(x_B = A_B^{-1}b, x_N = 0)$.

    We define: a partition $(B, N)$ that satisfies (78) and (79) as a basic partition; $B$ and $N$ in a basic partition are called basis and non-basis, respectively; and the variables $x_B$ and $x_N$ are called basic variables and non-basic variables, respectively.

    The relationship between a basic partition and a basic solution is as follows: a basic partition $(B, N)$ uniquely determines a basic solution; for any basic solution $\bar{x}$, there exists (uniquely or not) a basic partition that determines this basic solution $\bar{x}$. One example of different basic partitions all determining the same basic solution is the following. Suppose

    $$\bar{A} = \begin{bmatrix} 1 & 2 & 1 & 0 \\ 3 & 6 & 0 & 1 \end{bmatrix} \text{ and } b = \begin{bmatrix} 1 \\ 3 \end{bmatrix},$$

    then both $(B = \{1, 3\}, N = \{2, 4\})$ and $(B = \{1, 4\}, N = \{2, 3\})$ uniquely determine the same basic solution $\bar{x} = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix}^\top$.

    Since a basic partition $(B, N)$ uniquely determines one basic solution, the number of basic solutions is bounded by the number of ways $(B, N)$ can be selected, which is no more than $\frac{(n+m)!}{n!\,m!}$.
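    For a problem this small, all partitions can be enumerated directly. The sketch below tries every choice of $m = 2$ columns of the 2-by-4 example matrix (rows 1 2 1 0 and 3 6 0 1, with $b = (1, 3)$), keeps those with an invertible $A_B$, and records the basic solution each one determines. The helper names `solve` and `basic_solutions` are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations

A = [[1, 2, 1, 0], [3, 6, 0, 1]]
b = [1, 3]

def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination; return None if M is singular."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if aug[i][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for i in range(n):
            if i != col:
                factor = aug[i][col]
                aug[i] = [v - factor * p for v, p in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

def basic_solutions(A, b):
    """Map each basis (1-based indices) with invertible A_B to its basic solution."""
    m, total = len(A), len(A[0])
    result = {}
    for basis in combinations(range(1, total + 1), m):
        A_B = [[row[j - 1] for j in basis] for row in A]
        x_B = solve(A_B, b)
        if x_B is None:
            continue  # A_B is singular, so this is not a basic partition
        x = [F(0)] * total
        for j, value in zip(basis, x_B):
            x[j - 1] = value
        result[basis] = tuple(x)
    return result

solutions = basic_solutions(A, b)
for basis, x in solutions.items():
    print(basis, [str(v) for v in x])
```

    Of the six possible choices of $B$, one has a singular $A_B$, and the remaining five basic partitions determine only three distinct basic solutions.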

    The set of partitions of $\{1, 2, ..., n+m\}$ can be divided into the following regions.

    A+B+C+D: All possible partitions
    B+C+D: Basic partitions, which satisfy (78) and (79)
    C+D: Feasible basic partitions with $x_B = A_B^{-1}b \ge 0$
    D: Optimal basic partitions

    It is relatively easy to enter region B. The following points will discuss how to enter region C and then D.

    6.4 How to find an initial feasible basic solution to start from? What if the LP is infeasible?

    The way the simplex algorithm proceeds is to start from a feasible basic solution, move from one feasible basic solution to another better feasible basic solution, and finally stop at a feasible basic solution which is optimal to the LP. (Let's use fbs to abbreviate feasible basic solution.) We are now ready to discuss how to find an initial fbs to start from.

    It is not hard to observe that $(B = \{n+1, ..., n+m\}, N = \{1, ..., n\})$ is a basic partition to (71), which corresponds to

    $$
    \begin{array}{ll}
    B = \{n+1, ..., n+m\}, & N = \{1, ..., n\}, \\
    c_B = 0_{m \times 1}, & c_N = c, \\
    A_B = I_{m \times m}, & A_N = A, \\
    x_B = b, & x_N = 0_{n \times 1}.
    \end{array}
    $$

    The partition $(B, N)$ is indeed a basic partition because $A_B = I_{m \times m}$ is invertible. If $b \ge 0$, then $(x = 0, w = b)$ is also a feasible basic solution.

    However, if $b_i < 0$ for some $i$, then $\bar{x} \ge 0$ is violated (because $w_i = b_i < 0$), and the basic solution is not feasible. To obtain an fbs in such a case, we need to use a little trick called the big-M method. Define $K = \{i \in \{1, ..., m\} : b_i < 0\}$, let $k = |K|$, and let $K(j)$ denote the $j$th element of $K$. Now we consider a new problem

    $$\max\left\{\zeta = c^\top x - M\sum_{i=1}^{k} t_i : Ax + w + Ht = b;\ x, w, t \ge 0\right\}. \qquad (80)$$

    Here $M$ is an extremely large finite constant, vector $t \in \mathbb{R}^{k \times 1}$ is a new variable called artificial variable, and matrix $H \in \mathbb{R}^{m \times k}$ is defined as

    $$H_{i,j} = \begin{cases} -1, & \text{if } i = K(j); \\ 0, & \text{otherwise.} \end{cases}$$

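    For example, consider the following LP as an instance of (71):

    $$
    \begin{array}{rl}
    \max & \zeta = 5x_1 + 4x_2 + 3x_3 \\
    \text{s. t.} & 2x_1 + 3x_2 + x_3 + w_4 = 5 \\
     & 4x_1 + x_2 + 2x_3 + w_5 = -11 \\
     & 3x_1 + 4x_2 + 2x_3 + w_6 = -8 \\
     & x_1, x_2, x_3, w_4, w_5, w_6 \ge 0.
    \end{array}
    $$

    Then (80) corresponds to the following formulation:

    $$
    \begin{array}{rl}
    \max & \zeta = 5x_1 + 4x_2 + 3x_3 - 1000t_7 - 1000t_8 \\
    \text{s. t.} & 2x_1 + 3x_2 + x_3 + w_4 = 5 \\
     & 4x_1 + x_2 + 2x_3 + w_5 - t_7 = -11 \\
     & 3x_1 + 4x_2 + 2x_3 + w_6 - t_8 = -8 \\
     & x_1, x_2, x_3, w_4, w_5, w_6, t_7, t_8 \ge 0.
    \end{array}
    $$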
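    The construction of $K$ and $H$ in (80) can be sketched in a few lines. The code below assumes the right-hand side of the instance above is $b = (5, -11, -8)$ and that $H_{i,j} = -1$ when $i = K(j)$; it then checks that setting $x = 0$, $w_i = b_i$ for the non-negative rows, and $t_j = |b_{K(j)}|$ satisfies $Ax + w + Ht = b$ with all variables non-negative. The variable names are illustrative.

```python
b = [5, -11, -8]  # right-hand sides; the last two are assumed to be negative

# K lists (1-based) the rows with a negative right-hand side.
K = [i for i, bi in enumerate(b, start=1) if bi < 0]
k, m = len(K), len(b)

# H[i][j] = -1 if row i+1 is the j-th element of K, and 0 otherwise.
H = [[-1 if K[j] == i + 1 else 0 for j in range(k)] for i in range(m)]

# Starting point: x = 0, w_i = b_i where b_i >= 0, and t_j = |b_{K(j)}|.
w = [bi if bi >= 0 else 0 for bi in b]
t = [-b[i - 1] for i in K]

# With x = 0 the constraint Ax + w + Ht = b reduces to w + Ht = b.
lhs = [w[i] + sum(H[i][j] * t[j] for j in range(k)) for i in range(m)]

print(K, H, w, t, lhs)
```

    The non-zero entries of this starting point are exactly $|b|$, which is why the artificial variables give a feasible place to start even when some $b_i$ is negative.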

    In LP (80), since there are $k$ artificial variables but no additional constraints, the dimensions of its partition should be $|N| = n + k$ and $|B| = m$. The way we initialize this partition is to add all the artificial variables to $B$ and then move those indices that correspond to $b_i < 0$ from $B$ to $N$. It is not hard to see that the basic partition $(N = \{1, ..., n\} \cup \{n+K(1), ..., n+K(k)\}, B = \{1, ..., n+m+k\} \setminus N)$ uniquely determines an fbs $(x_N = 0, x_B = A_B^{-1}b = |b|)$. We can use this fbs as a starting point to solve (80) by following the rest of the simplex steps. In the format of (72)-(74), (80) has

    $$\bar{c} = \begin{bmatrix} c \\ 0_{m \times 1} \\ -M\mathbf{1}_{k \times 1} \end{bmatrix},\quad \bar{A} = \begin{bmatrix} A & I_{m \times m} & H_{m \times k} \end{bmatrix},\quad \text{and } \bar{x} = \begin{bmatrix} x \\ w \\ t \end{bmatrix} \in \mathbb{R}^{n+m+k}.$$

    The relation between optimal solutions to (71) and (80) is given in the following propositions.

    Proposition 1. Solution $(x^*, w^*)$ is optimal to (71) if and only if there exists a finitely large $M$ such that $(x^*, w^*, t^* = 0)$ is optimal to (80).

    Proof. ($\Rightarrow$): Prove by construction. Let $y^*$ be the dual optimal solution to (71), whose dual is $\min\{b^\top y : A^\top y \ge c, y \ge 0\}$. Set $M = \max_{i=1,...,m} y_i^*$, and we know that $y^*$ is also feasible to the dual of (80), which is $\min\{b^\top y : A^\top y \ge c, H^\top y \ge -M\mathbf{1}_{k \times 1}; y \ge 0\}$. Since $(x^*, w^*, t^* = 0)$ and $y^*$ are respectively feasible to (80) and its dual with the same objective value, they are also respectively optimal.

    ($\Leftarrow$): Prove by contradiction. Suppose $(x^*, w^*, t^* = 0)$ is optimal to (80) but $(x^*, w^*)$ is not optimal to (71); then there must exist a feasible solution $(x', w')$ to (71) with $c^\top x' > c^\top x^*$. However, this implies that $(x', w', t = 0)$ is a better solution than $(x^*, w^*, t^* = 0)$ to (80), because $(x', w', t = 0)$ is feasible to (80) and $c^\top x' > c^\top x^*$. This contradicts the assumption that $(x^*, w^*, t^* = 0)$ is an optimal solution to (80).

    In the simplex algorithm, there is a way to make sure that $M$ is sufficiently large. So if we solve (80) with the simplex algorithm and get an optimal solution $(x^*, w^*, t^*)$ with $t_i^* > 0$ for some $i$, then (71) is infeasible.

    Proposition 2. If for a sufficiently large $M$, (80) possesses an optimal solution $(x^*, w^*, t^*)$ with $t_i^* > 0$ for some $i$, then (71) is infeasible.

    Proof. By Proposition 1, (71) does not have an optimal solution. Therefore, it suffices to show that (71) is not unbounded. Let $y^*$ be the optimal dual solution to (80); then it is feasible to the dual of (71), which means that (71) is not unbounded.

    If (80) is unbounded, then (71) could be either infeasible or unbounded, and we need to solve the following LP to verify:

    $$\min\left\{\sum_{i=1}^{k} t_i : Ax + w + Ht = b;\ x, w, t \ge 0\right\}. \qquad (81)$$

    If $t^* = 0$ is an optimal solution to (81), then (71) is unbounded; otherwise (71) is infeasible.

    The possibilities of (80) and their implications for (71) are summarized as follows:

    $$
    \begin{array}{c|c|c|c}
    \multicolumn{2}{c|}{(x^*, w^*, t^*) \text{ optimal to (80)}} & \multicolumn{2}{c}{\text{(80) unbounded, } t^* \text{ optimal to (81)}} \\ \hline
    t^* = 0 & t^* \ne 0 & t^* = 0 & t^* \ne 0 \\ \hline
    (x^*, w^*) \text{ optimal to (71)} & \text{(71) infeasible} & \text{(71) unbounded} & \text{(71) infeasible}
    \end{array}
    $$

    (80) is optimal with $(x^*, w^*, t^* = 0)$ $\Rightarrow$ (71) is optimal with $(x^*, w^*)$.

    (80) is optimal with $(x^*, w^*, t^*)$ and $t_i^* > 0$ for some $i$ $\Rightarrow$ (71) is infeasible.

    (80) is unbounded and $t^* = 0$ is optimal to (81) $\Rightarrow$ (71) is unbounded.

    (80) is unbounded and $t^* = 0$ is not optimal to (81) $\Rightarrow$ (71) is infeasible.

    LP (80) cannot be infeasible, because $(x_N = 0, x_B = |b|)$ is a feasible solution to (80).

    The following diagram provides an overview of the simplex algorithm. Here we refer to the three LPs (71), (80), and (81) as LP0, LP1, and LP2, respectively.

    [Flowchart: Starting from LP0, test whether $b \ge 0$. If yes, iterate on LP0: if the current fbs is optimal, LP0 is optimal; otherwise, if unboundedness is detected, LP0 is unbounded; otherwise improve and repeat. If no, iterate on LP1: if the current fbs is optimal, then $t = 0$ means LP0 is optimal and $t \ne 0$ means LP0 is infeasible; otherwise, if LP1 is unbounded, move on to LP2; otherwise improve and repeat. On LP2, improve until optimal; then $t = 0$ means LP0 is unbounded and $t \ne 0$ means LP0 is infeasible.]

    LP0 may be optimal, unbounded, or infeasible.
    LP1 may be optimal or unbounded, never infeasible.
    LP2 must be optimal, never unbounded or infeasible.

    6.5 How to tell if the current fbs is optimal or not?

    One way to check the optimality of a solution is to check the optimality condition (62). In the simplex algorithm, there is a more convenient way to check the optimality by reformulating (75)-(77) as:

    $$
    \begin{array}{rll}
    \max & \zeta = c_B^\top A_B^{-1}b + (c_N^\top - c_B^\top A_B^{-1}A_N)x_N & (82) \\
    \text{s. t.} & x_B = A_B^{-1}b - A_B^{-1}A_Nx_N & (83) \\
     & x_B \ge 0,\ x_N \ge 0. & (84)
    \end{array}
    $$

    Equations (82) and (83) are called a dictionary:

    $$
    \begin{array}{rl}
    \zeta & = c_B^\top A_B^{-1}b + (c_N^\top - c_B^\top A_B^{-1}A_N)x_N \\
    x_B & = A_B^{-1}b - A_B^{-1}A_Nx_N.
    \end{array}
    $$

    The term $(c_N^\top - c_B^\top A_B^{-1}A_N)$ is called the reduced cost.

    Proposition 3. In (82), for a given feasible basic partition $(B, N)$, if we have

    $$(c_N^\top - c_B^\top A_B^{-1}A_N) \le 0,$$

    then the fbs $(x_B^* = A_B^{-1}b, x_N^* = 0)$ is optimal to (82)-(84).

    Proof. Prove by contradiction. Suppose $(x_B^* = A_B^{-1}b, x_N^* = 0)$ is not optimal to (82)-(84), and there exists another fbs $x'$ with $\zeta(x') > \zeta(x^*)$. This implies that

    $$
    \begin{array}{rl}
    \zeta(x') - \zeta(x^*) & = (c_N^\top - c_B^\top A_B^{-1}A_N)x_N' - (c_N^\top - c_B^\top A_B^{-1}A_N)x_N^* \\
     & = (c_N^\top - c_B^\top A_B^{-1}A_N)x_N' \\
     & > 0,
    \end{array}
    $$

    which is impossible since $(c_N^\top - c_B^\top A_B^{-1}A_N) \le 0$ and $x_N' \ge 0$.

    Notice that the reduced cost being non-positive is a sufficient but not necessary condition for the optimality of an fbs. For example, consider the following LP:

    $$
    \begin{array}{rl}
    \max & \zeta = x_1 + 2x_2 + 3x_3 \\
    \text{s. t.} & x_1 - 3x_3 + w_4 = 0 \\
     & 7x_1 + 2x_2 + 5x_3 + w_5 = 1 \\
     & x_1, x_2, x_3, w_4, w_5 \ge 0.
    \end{array}
    $$

    The feasible basic partition $(B^1 = \{1, 2\}, N^1 = \{3, 4, 5\})$ determines the following dictionary:

    $$
    \begin{array}{rl}
    \zeta & = 1 - 20x_3 + 6w_4 - w_5 \\
    x_1 & = 3x_3 - w_4 \\
    x_2 & = 0.5 - 13x_3 + 3.5w_4 - 0.5w_5,
    \end{array}
    $$

    which gives the fbs $(x_1 = 0, x_2 = 0.5, x_3 = 0)$. Although the reduced cost does contain a positive term, this fbs is actually optimal. To see this, consider another feasible basic partition $(B^2 = \{2, 4\}, N^2 = \{1, 3, 5\})$, which determines the following dictionary:

    $$
    \begin{array}{rl}
    \zeta & = 1 - 6x_1 - 2x_3 - w_5 \\
    x_2 & = 0.5 - 3.5x_1 - 2.5x_3 - 0.5w_5 \\
    w_4 & = -x_1 + 3x_3.
    \end{array}
    $$

    It gives the same fbs $(x_1 = 0, x_2 = 0.5, x_3 = 0)$. Since it has a non-positive reduced cost, we know this fbs is optimal.
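    Both dictionaries can be recomputed from the formulas (82)-(83). The sketch below assumes the example LP reads max x1 + 2x2 + 3x3 subject to x1 - 3x3 + w4 = 0 and 7x1 + 2x2 + 5x3 + w5 = 1, and evaluates the constant term, the reduced costs, and the basic values for a given basis. The helpers `solve` and `dictionary` are illustrative names, and `solve` assumes that $A_B$ is invertible.

```python
from fractions import Fraction as F

# Columns are ordered x1, x2, x3, w4, w5.
A = [[1, 0, -3, 1, 0], [7, 2, 5, 0, 1]]
b = [0, 1]
c = [1, 2, 3, 0, 0]

def solve(M, R):
    """Return M^{-1} R for a square invertible M (Gauss-Jordan, exact arithmetic)."""
    n = len(M)
    aug = [[F(v) for v in M[i]] + [F(v) for v in R[i]] for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for i in range(n):
            if i != col:
                factor = aug[i][col]
                aug[i] = [v - factor * p for v, p in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]

def dictionary(basis):
    """Constant term, reduced costs, and basic values for a basis (1-based indices)."""
    nonbasis = [j for j in range(1, len(c) + 1) if j not in basis]
    A_B = [[row[j - 1] for j in basis] for row in A]
    # Right-hand sides: b followed by the columns of A_N.
    R = [[b[i]] + [A[i][j - 1] for j in nonbasis] for i in range(len(A))]
    T = solve(A_B, R)  # each row is [ (A_B^{-1} b)_i | (A_B^{-1} A_N)_i ]
    c_B = [c[j - 1] for j in basis]
    zeta0 = sum(cb * row[0] for cb, row in zip(c_B, T))
    reduced = {
        j: c[j - 1] - sum(cb * row[pos + 1] for cb, row in zip(c_B, T))
        for pos, j in enumerate(nonbasis)
    }
    x_B = {j: row[0] for j, row in zip(basis, T)}
    return zeta0, reduced, x_B

zeta1, reduced1, x_B1 = dictionary([1, 2])
zeta2, reduced2, x_B2 = dictionary([2, 4])
print(zeta1, {j: str(v) for j, v in reduced1.items()})
print(zeta2, {j: str(v) for j, v in reduced2.items()})
```

    The first basis has a positive reduced cost for $w_4$ while the second has none, even though both describe the same point; this is the degeneracy that makes the condition in Proposition 3 sufficient but not necessary.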

    6.6 How to find a better fbs if the current one is not optimal?

    Because of the close relation between an fbs and a feasible basic partition $(B, N)$, the search for a better fbs (and eventually the optimal one) is equivalent to the search for a better feasible basic partition. The way simplex updates a feasible basic partition is by switching one pair of indices between the current basis $B$ and non-basis $N$ at a time. If the current basic partition is $(B^0, N^0)$, then we select an $i$ from $B^0$ and a $j$ from $N^0$, and update the basic partition as $(B^1 = B^0 \setminus \{i\} \cup \{j\}, N^1 = N^0 \setminus \{j\} \cup \{i\})$. The variable $x_j$ is called the entering variable, because it will enter the basis. Similarly, $x_i$ is called the leaving variable. Geometrically, such an update means a move from an fbs to an adjacent one. Of course, we need to make sure that the new fbs is no worse than the current one in terms of the objective value.

    Now let's assume that we have obtained a feasible basic partition $(B, N)$ which is not optimal; then we will have to update the basic partition by selecting an entering variable and a leaving variable. The rule for selecting the entering and leaving variables is called a pivoting rule. There are various pivoting rules, one of which, called Bland's rule, is introduced below:

    $$\text{Entering variable } x_j: \text{ Choose } j = \min\{j \in N : (c_N^\top - c_B^\top A_B^{-1}A_N)_j > 0\}. \qquad (85)$$

    $$\text{Leaving variable } x_i: \text{ Choose } i = \min\left\{i \in \arg\max_{i \in B} \frac{(A_B^{-1}A_N)_{i,j}}{(A_B^{-1}b)_i}\right\}. \qquad (86)$$

    After the entering and leaving variables are chosen, we get an updated partition. Bland's rule ensures that the new partition is again a feasible basic partition whose objective value is no worse than the current one. The calculation for determining the leaving variable is called the ratio test, because we are trying to find the largest ratio $\frac{(A_B^{-1}A_N)_{i,j}}{(A_B^{-1}b)_i}$. The $i$ that achieves the largest ratio is said to be the winner of the ratio test, and $x_i$ will be the leaving variable. If there is a tie in the ratio test, the smallest winner $i$ will be selected.

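    The rationale for Bland's rule is explained in the following example:

    $$
    \begin{array}{rll}
    \max & 5x_1 + 4x_2 + 3x_3 & (87) \\
    \text{s. t.} & 2x_1 + 3x_2 + x_3 \le 5 & (88) \\
     & 4x_1 + x_2 + 2x_3 \le 11 & (89) \\
     & 3x_1 + 4x_2 + 2x_3 \le 8 & (90) \\
     & x_1, x_2, x_3 \ge 0. & (91)
    \end{array}
    $$

    We start by introducing new variables $w_4, w_5, w_6$ to reformulate (88)-(90) into equality constraints:

    $$
    \begin{array}{rll}
    \max & 5x_1 + 4x_2 + 3x_3 & (92) \\
    \text{s. t.} & 2x_1 + 3x_2 + x_3 + w_4 = 5 & (93) \\
     & 4x_1 + x_2 + 2x_3 + w_5 = 11 & (94) \\
     & 3x_1 + 4x_2 + 2x_3 + w_6 = 8 & (95) \\
     & x_1, x_2, x_3, w_4, w_5, w_6 \ge 0. & (96)
    \end{array}
    $$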

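    Since the right-hand-side values are all positive, the first fbs is easy to find: $x_1 = x_2 = x_3 = 0, w_4 = 5, w_5 = 11, w_6 = 8$. In the context of (82)-(84),

    $$
    \begin{array}{ll}
    B = \{4, 5, 6\}, & N = \{1, 2, 3\}, \\
    c_B = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, & c_N = \begin{bmatrix} 5 \\ 4 \\ 3 \end{bmatrix}, \\
    A_B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, & A_N = \begin{bmatrix} 2 & 3 & 1 \\ 4 & 1 & 2 \\ 3 & 4 & 2 \end{bmatrix}, \\
    x_B = \begin{bmatrix} 5 \\ 11 \\ 8 \end{bmatrix}, & x_N = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
    \end{array}
    $$

    The dictionary is

    $$
    \begin{array}{rl}
    \zeta & = 5x_1 + 4x_2 + 3x_3 \\
    w_4 & = 5 - 2x_1 - 3x_2 - x_3 \\
    w_5 & = 11 - 4x_1 - x_2 - 2x_3 \\
    w_6 & = 8 - 3x_1 - 4x_2 - 2x_3,
    \end{array}
    $$

    and in matrix form, it is

    $$
    \zeta = 0 + \begin{bmatrix} 5 & 4 & 3 \end{bmatrix}\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}^\top, \qquad
    \begin{bmatrix} w_4 \\ w_5 \\ w_6 \end{bmatrix} = \begin{bmatrix} 5 \\ 11 \\ 8 \end{bmatrix} - \begin{bmatrix} 2 & 3 & 1 \\ 4 & 1 & 2 \\ 3 & 4 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.
    $$

    Since $(c_N^\top - c_B^\top A_B^{-1}A_N) = \begin{bmatrix} 5 & 4 & 3 \end{bmatrix}$, the current fbs is not optimal, because we can improve the objective value by increasing the values of $x_1, x_2, x_3$ from zero to positive. According to Bland's rule (85), we choose $j = 1$, thus $x_1$ will enter the basis. But by how much could $x_1$ increase from zero? This is limited by the equation

    $$\begin{bmatrix} w_4 \\ w_5 \\ w_6 \end{bmatrix} = \begin{bmatrix} 5 \\ 11 \\ 8 \end{bmatrix} - \begin{bmatrix} 2 & 3 & 1 \\ 4 & 1 & 2 \\ 3 & 4 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

    in the dictionary, because changing $x_1$ will scale the column $\begin{bmatrix} 2 & 4 & 3 \end{bmatrix}^\top$ and thus change the values of $w_4, w_5, w_6$, which should be non-negative. Therefore, the new value of $x_1$ will be set to the largest number such that the constraints $w_4, w_5, w_6 \ge 0$ are still satisfied. This is the reason for Bland's rule (86). We choose the smallest $i$ such that

    $$i \in \arg\max_{i \in B} \frac{(A_B^{-1}A_N)_{i,j}}{(A_B^{-1}b)_i} = \arg\max\left\{\frac{2}{5}, \frac{4}{11}, \frac{3}{8}\right\},$$

    so $i = 4$. The basic partition is updated as $B = \{1, 5, 6\}, N = \{4, 2, 3\}$. If we repeat this procedure, which is called an iteration in the simplex algorithm, we get the following dictionaries.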

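    Second iteration:

    $$
    \zeta = 12.5 + \begin{bmatrix} -2.5 & -3.5 & 0.5 \end{bmatrix}\begin{bmatrix} w_4 & x_2 & x_3 \end{bmatrix}^\top, \qquad
    \begin{bmatrix} x_1 \\ w_5 \\ w_6 \end{bmatrix} = \begin{bmatrix} 2.5 \\ 1 \\ 0.5 \end{bmatrix} - \begin{bmatrix} 0.5 & 1.5 & 0.5 \\ -2 & -5 & 0 \\ -1.5 & -0.5 & 0.5 \end{bmatrix}\begin{bmatrix} w_4 \\ x_2 \\ x_3 \end{bmatrix}.
    $$

    Third iteration:

    $$
    \zeta = 13 + \begin{bmatrix} -1 & -3 & -1 \end{bmatrix}\begin{bmatrix} w_4 & x_2 & w_6 \end{bmatrix}^\top, \qquad
    \begin{bmatrix} x_1 \\ w_5 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 2 & 2 & -1 \\ -2 & -5 & 0 \\ -3 & -1 & 2 \end{bmatrix}\begin{bmatrix} w_4 \\ x_2 \\ w_6 \end{bmatrix}.
    $$

    After the third iteration, $(c_N^\top - c_B^\top A_B^{-1}A_N) = \begin{bmatrix} -1 & -3 & -1 \end{bmatrix} < 0$, thus we know that the current basic partition $B = \{1, 5, 3\}, N = \{4, 2, 6\}$ is optimal, and so is the current fbs $x_1^* = 2, x_2^* = 0, x_3^* = 1, w_4^* = 0, w_5^* = 1, w_6^* = 0$.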
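    The iterations above can be reproduced with a short dictionary-based implementation. The following is a minimal sketch, not a production solver: it assumes a standard form problem with $b \ge 0$ (so the slack basis is feasible), uses exact fractions, applies the smallest-index choices of Bland's rule, and performs the ratio test by taking the smallest value of $(A_B^{-1}b)_i / (A_B^{-1}A_N)_{i,j}$ over rows with a positive coefficient, which is the same row as the largest ratio in (86). The function name `simplex_bland` and its return convention are illustrative.

```python
from fractions import Fraction as F

def simplex_bland(A, b, c):
    """Dictionary simplex for max c'x s.t. Ax <= b, x >= 0, assuming b >= 0.

    Returns ("optimal", zeta, x) or ("unbounded", None, None).
    """
    m, n = len(A), len(c)
    nonbasis = list(range(1, n + 1))            # x_1..x_n start at zero
    basis = list(range(n + 1, n + m + 1))       # slack variables
    const = [F(v) for v in b]                   # A_B^{-1} b
    body = [[F(v) for v in row] for row in A]   # A_B^{-1} A_N
    zeta = F(0)
    red = [F(v) for v in c]                     # reduced costs

    while True:
        # (85): smallest index among positive reduced costs.
        candidates = [j for pos, j in enumerate(nonbasis) if red[pos] > 0]
        if not candidates:
            x = [F(0)] * (n + m)
            for i, value in zip(basis, const):
                x[i - 1] = value
            return "optimal", zeta, x[:n]
        col = nonbasis.index(min(candidates))
        # (86): only rows with a positive entry limit the entering variable.
        rows = [r for r in range(m) if body[r][col] > 0]
        if not rows:
            return "unbounded", None, None
        row = min(rows, key=lambda r: (const[r] / body[r][col], basis[r]))
        # Pivot: solve the leaving row for the entering variable, then substitute.
        piv = body[row][col]
        const[row] /= piv
        body[row] = [v / piv for v in body[row]]
        body[row][col] = 1 / piv
        for r in range(m):
            if r != row:
                f = body[r][col]
                const[r] -= f * const[row]
                body[r] = [v - f * p for v, p in zip(body[r], body[row])]
                body[r][col] = -f / piv
        f = red[col]
        zeta += f * const[row]
        red = [v - f * p for v, p in zip(red, body[row])]
        red[col] = -f / piv
        basis[row], nonbasis[col] = nonbasis[col], basis[row]

status, zeta, x = simplex_bland(
    [[2, 3, 1], [4, 1, 2], [3, 4, 2]], [5, 11, 8], [5, 4, 3]
)
print(status, zeta, [str(v) for v in x])
```

    On example (87)-(91) this sketch passes through the same partitions as the text, $\{4,5,6\} \to \{1,5,6\} \to \{1,5,3\}$, and stops with $\zeta = 13$.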

    6.7 How to identify an unbounded LP?

    We can identify an unbounded LP when we use Bland's rule (86) to determine the leaving variable and the maximum ratio $\max_{i \in B} \frac{(A_B^{-1}A_N)_{i,j}}{(A_B^{-1}b)_i}$ is not positive.

    The possibilities of the ratio $\frac{(A_B^{-1}A_N)_{i,j}}{(A_B^{-1}b)_i}$ are summarized in the following table, where $1$ and $-1$ represent finite positive and negative numbers, respectively.

    $$
    \begin{array}{c|c|c|c|c|l}
    \text{case} & (A_B^{-1}b)_i & (A_B^{-1}A_N)_{i,j} & \frac{(A_B^{-1}A_N)_{i,j}}{(A_B^{-1}b)_i} & \text{updated } x_j & \text{note} \\ \hline
    1 & 0 & 0 & \text{undefined} & +\infty & \text{unbounded} \\
    2 & 0 & 1 & +\infty & 0 & \text{degenerate} \\
    3 & 0 & -1 & -\infty & +\infty & \text{unbounded} \\
    4 & 1 & 0 & 0 & +\infty & \text{unbounded} \\
    5 & 1 & 1 & 1 & 1 & \text{new basic partition} \\
    6 & 1 & -1 & -1 & +\infty & \text{unbounded}
    \end{array}
    $$

    There are six cases in this table. The second and third columns come from the dictionary

    $$
    \begin{array}{rl}
    \zeta & = c_B^\top A_B^{-1}b + (c_N^\top - c_B^\top A_B^{-1}A_N)x_N \\
    x_B & = A_B^{-1}b - A_B^{-1}A_Nx_N;
    \end{array}
    $$

    the fourth column is the ratio test result in Bland's rule (86); the fifth column calculates the largest value of $x_j$ such that constraint $x_B \ge 0$ still holds; and the sixth column explains the implication if the corresponding $x_j$ becomes the entering variable. If case 1, 3, 4, or 6 is the winner of the ratio test, then we claim the LP is unbounded.
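    The six cases translate directly into a row-by-row classification. In the sketch below, `ratio_case` returns the case number together with the largest value of the entering variable that keeps row $i$ non-negative (or `None` when that row imposes no limit), and `entering_limit` combines the rows; both function names are illustrative. The first test column is the one for $x_1$ in the first dictionary of example (87)-(91), and the second is a one-row dictionary with constant 1 and coefficient 0, which falls under case 4.

```python
from fractions import Fraction as F

def ratio_case(const_i, coef_ij):
    """Classify one row: const_i = (A_B^{-1} b)_i >= 0, coef_ij = (A_B^{-1} A_N)_{i,j}.

    Returns (case number, limit on the entering variable or None if unlimited).
    """
    if const_i < 0:
        raise ValueError("the current basic solution must be feasible")
    if const_i == 0:
        if coef_ij == 0:
            return 1, None
        if coef_ij > 0:
            return 2, F(0)          # degenerate: the entering variable stays at 0
        return 3, None
    if coef_ij == 0:
        return 4, None
    if coef_ij > 0:
        return 5, F(const_i) / F(coef_ij)
    return 6, None

def entering_limit(const, column):
    """Largest step for the entering variable, or None if the LP is unbounded."""
    limits = [ratio_case(ci, aij)[1] for ci, aij in zip(const, column)]
    finite = [v for v in limits if v is not None]
    return min(finite) if finite else None

print(entering_limit([5, 11, 8], [2, 4, 3]))   # x1 can grow until w4 hits zero
print(entering_limit([1], [0]))                # no row limits the entering variable
```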

    6.8 Will the algorithm ever terminate?

    If appropriate pivoting rules are used, the algorithm will terminate finitely. First of all, as long as the numbers of variables and constraints are finite, there is only a finite number of basic solutions, so the algorithm does not need to visit infinitely many fbs's to reach the optimal one.

    Proposition 4. If the dimensions of matrix $A$ in (71) are finite, then the number of basic solutions to (71) is finite.

    Proof. Since any basic solution can be determined by a basic partition $(B, N)$, the number of basic solutions is bounded by the number of ways $(B, N)$ can be selected, which is no more than $\frac{(n+m)!}{n!\,m!}$. Therefore, the number of basic solutions is finite.

    Secondly, it can be proved (refer to the text for the proof) that if Bland's rule is used, the algorithm will terminate in finitely many iterations. Under some other pivoting rules (for example the largest coefficient rule: among the $j \in N$ with positive reduced cost, choose $j = \min\left\{j \in \arg\max_{j \in N} (c_N^\top - c_B^\top A_B^{-1}A_N)_j\right\}$ for the entering variable, and choose $i = \min\left\{i \in \arg\max_{i \in B} \frac{(A_B^{-1}A_N)_{i,j}}{(A_B^{-1}b)_i}\right\}$ for the leaving variable), however, the algorithm could get stuck in a so-called cycle and never terminate. A cycle occurs when the algorithm moves to a basic partition which has been visited before. For example, $(B^0, N^0) \to (B^1, N^1) \to (B^2, N^2) \to ... \to (B^k, N^k) \to (B^1, N^1) \to (B^2, N^2) \to ...$. A cycling example using the largest coefficient rule can be found in the text on page 31.

    A cycle can occur only because of degeneracy, which occurs when the updated basic partition determines the same fbs as the previous basic partition does. For example, in this problem

    $$
    \begin{array}{rl}
    \max & 10x_1 - 57x_2 - 9x_3 - 24x_4 \\
    \text{s. t.} & 0.5x_1 - 5.5x_2 - 2.5x_3 + 9x_4 + w_5 = 0 \\
     & 0.5x_1 - 1.5x_2 - 0.5x_3 + x_4 + w_6 = 0 \\
     & x_1 + w_7 = 1 \\
     & x_1, x_2, x_3, x_4, w_5, w_6, w_7 \ge 0,
    \end{array}
    \qquad
    \bar{A} = \begin{bmatrix} 0.5 & -5.5 & -2.5 & 9 & 1 & 0 & 0 \\ 0.5 & -1.5 & -0.5 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix},
    $$

    the following basic partitions all determine the same fbs $x_1 = 0, x_2 = 0, x_3 = 0, x_4 = 0, w_5 = 0, w_6 = 0, w_7 = 1$:

    $(B^1 = \{5, 6, 7\}, N^1 = \{1, 2, 3, 4\})$
    $(B^2 = \{1, 6, 7\}, N^2 = \{5, 2, 3, 4\})$
    $(B^3 = \{1, 2, 7\}, N^3 = \{5, 6, 3, 4\})$
    $(B^4 = \{3, 2, 7\}, N^4 = \{5, 6, 1, 4\})$
    $(B^5 = \{3, 4, 7\}, N^5 = \{5, 6, 1, 2\})$
    $(B^6 = \{5, 4, 7\}, N^6 = \{3, 6, 1, 2\})$.

    From another perspective, a solution $\bar{x}$ to (72)-(74) is degenerate if there are more than $n+m$ active constraints at $\bar{x}$. This implies that not only $x_N = 0$, but also there exists some $i \in B$ such that $x_i = 0$. In the dictionary, we can tell that degeneracy occurs if $(A_B^{-1}b)_i = 0$ for some $i \in B$. In the above example, the dictionary is

    $$
    \zeta = 0 + \begin{bmatrix} 10 & -57 & -9 & -24 \end{bmatrix}\begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}^\top, \qquad
    \begin{bmatrix} w_5 \\ w_6 \\ w_7 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} - \begin{bmatrix} 0.5 & -5.5 & -2.5 & 9 \\ 0.5 & -1.5 & -0.5 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}.
    $$

    The entering variable is clearly $x_1$, which should move from non-basis to basis in the updated basic partition. However, there are zeros in the constant vector on the right-hand side of the second equation in the dictionary, which indicates that the current solution $x_1 = 0, x_2 = 0, x_3 = 0, x_4 = 0, w_5 = 0, w_6 = 0, w_7 = 1$ is a degenerate one. As a result, $x_1$ has to stay zero in the basis, otherwise $w_5$ will become negative due to the constraint $w_5 = 0 - 0.5x_1$.

    Using Bland's rule, we may also move from one basic partition to another that determines the same fbs, but we will not come back to a basic partition that has been visited before. Since there is only a finite number of basic partitions, we will eventually move away from the degenerate fbs towards the optimal basic partition.
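    The claim that the six listed partitions share one fbs can be checked by solving $A_Bx_B = b$ for each of them. The sketch below assumes the sign pattern 0.5, -5.5, -2.5, 9 and 0.5, -1.5, -0.5, 1 for the first two constraint rows, with $b = (0, 0, 1)$; the helpers `solve` and `basic_solution` are illustrative names, and `solve` assumes an invertible matrix.

```python
from fractions import Fraction as F

# Columns are ordered x1, x2, x3, x4, w5, w6, w7.
A = [
    [F(1, 2), F(-11, 2), F(-5, 2), 9, 1, 0, 0],
    [F(1, 2), F(-3, 2), F(-1, 2), 1, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 1],
]
b = [0, 0, 1]

def solve(M, rhs):
    """Solve M z = rhs for an invertible M by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for i in range(n):
            if i != col:
                factor = aug[i][col]
                aug[i] = [v - factor * p for v, p in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

def basic_solution(basis):
    """Full solution vector determined by a basis given as 1-based indices."""
    A_B = [[row[j - 1] for j in basis] for row in A]
    x_B = solve(A_B, b)
    x = [F(0)] * len(A[0])
    for j, value in zip(basis, x_B):
        x[j - 1] = value
    return x

bases = [(5, 6, 7), (1, 6, 7), (1, 2, 7), (3, 2, 7), (3, 4, 7), (5, 4, 7)]
points = [basic_solution(basis) for basis in bases]
for basis, point in zip(bases, points):
    print(basis, [str(v) for v in point])
```

    Because the first two right-hand sides are zero, every one of these bases forces its first two basic variables to zero, so the objective value cannot improve while the algorithm moves among them.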

    6.9 How efficient is the simplex algorithm?

    Simplex updates the fbs iteratively to find the optimal solution, thus its efficiency depends on how many iterations are needed. The total number of basic solutions is bounded by $\binom{n+m}{m} = \frac{(n+m)!}{n!\,m!}$. If $m = n$, then this number becomes $\binom{2n}{n}$. It can be proved that $\frac{1}{2n}2^{2n} \le \binom{2n}{n} \le 2^{2n}$, which means that this bound on the number of basic solutions increases exponentially with the size of the problem (number of variables and constraints). Some pivoting rules could cause cycling and never terminate; even for existing non-cycling rules, people have found instances (refer to Section 4.4 of Vanderbei for an example) where the simplex algorithm will have to visit every single basic solution before it finds the optimal one. However, it is an open question whether there exists a pivoting rule that will never require an exponential number of iterations.
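    The growth of the bound $\binom{2n}{n}$ and the two-sided estimate $\frac{1}{2n}2^{2n} \le \binom{2n}{n} \le 2^{2n}$ can be tabulated for small $n$. The sketch below relies on `math.comb` from the Python standard library (available since Python 3.8); the function name `bounds` is illustrative.

```python
from fractions import Fraction
from math import comb

def bounds(n):
    """Return (2^{2n} / (2n), C(2n, n), 2^{2n}) for n >= 1 in exact arithmetic."""
    return Fraction(4 ** n, 2 * n), comb(2 * n, n), 4 ** n

for n in (1, 2, 5, 10, 20):
    lower, middle, upper = bounds(n)
    print(n, float(lower), middle, upper)
```

    Already for $n = m = 20$ the bound exceeds $10^{11}$ partitions, which is why enumerating basic solutions is not a practical algorithm.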

    There does exist an algorithm called the ellipsoid method that can solve linear programs in polynomial time (it never requires exponentially many iterations). Theoretically, the ellipsoid method has a better worst-case efficiency than the simplex algorithm. However, the practical performance of the simplex algorithm is much better than that of the ellipsoid method, and that is one of the reasons the simplex algorithm is more widely used.

    6.10 Summary

    The simplex algorithm is summarized with the following steps, as illustrated in Figure 5.

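    Step 1 Reformulation: Write the LP in standard form $\max\{\zeta = c^\top x : Ax \le b, x \ge 0\}$, reformulate it as $\max\{\zeta = c^\top x : Ax + w = b, x \ge 0, w \ge 0\}$, and then go to Step 2.

    Step 2 Initialization:

    (2a) If $b \ge 0$, then reformulate the problem as

    $$\max\{\zeta = \bar{c}^\top\bar{x} : \bar{A}\bar{x} = b,\ \bar{x} \ge 0\},$$

    where

    $$\bar{c} = \begin{bmatrix} c \\ 0_{m \times 1} \end{bmatrix},\quad \bar{A} = \begin{bmatrix} A & I_{m \times m} \end{bmatrix},\quad \text{and } \bar{x} = \begin{bmatrix} x \\ w \end{bmatrix} \in \mathbb{R}^{n+m}.$$

    The initial basic partition is $(B = \{n+1, ..., n+m\}, N = \{1, ..., n\})$, and the initial fbs is $(x_B = A_B^{-1}b = b, x_N = 0)$. Go to Step 3.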

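    (2b) If $b_i < 0$ for some $i$, then use the big-M method to find the initial basic partition and the initial fbs. Define $K = \{i \in \{1, ..., m\} : b_i < 0\}$, and let $k = |K|$. Reformulate the problem as

    $$\max\{\zeta = \bar{c}^\top\bar{x} : \bar{A}\bar{x} = b,\ \bar{x} \ge 0\},$$

    where

    $$\bar{c} = \begin{bmatrix} c \\ 0_{m \times 1} \\ -M\mathbf{1}_{k \times 1} \end{bmatrix},\quad \bar{A} = \begin{bmatrix} A & I_{m \times m} & H \end{bmatrix},\quad \text{and } \bar{x} = \begin{bmatrix} x \\ w \\ t \end{bmatrix} \in \mathbb{R}^{(n+m+k) \times 1}.$$

    When compared with other finite constants finitely many times, $M$ is always assumed to be bigger. Matrix $H \in \mathbb{R}^{m \times k}$ is defined as

    $$H_{i,j} = \begin{cases} -1, & \text{if } i = K(j); \\ 0, & \text{otherwise.} \end{cases}$$

    The initial basic partition is $(N = \{1, ..., n\} \cup \{n+K(1), ..., n+K(k)\}, B = \{1, ..., n+m+k\} \setminus N)$, and the initial fbs is $(x_B = A_B^{-1}b = |b|, x_N = 0)$. Go to Step 3.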

    Step 3 Optimality check:

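    (3a) If $(\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N)_i > 0$ for some $i \in N$, then go to Step 4.

    (3b) If $(\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N) \le 0$ and the big-M method was not used in Step 2, then stop, and the optimal solution is $(\bar x_B^* = \bar A_B^{-1} b,\ \bar x_N^* = 0)$, $\zeta^* = \bar c_B^\top \bar A_B^{-1} b$.

    (3c) If $(\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N) \le 0$, the big-M method was used in Step 2, and $\bar x_i = 0$ for all $i = n+m+1,\dots,n+m+k$, then stop, and the optimal solution is $(\bar x_B^* = \bar A_B^{-1} b,\ \bar x_N^* = 0)$, $\zeta^* = \bar c_B^\top \bar A_B^{-1} b$.

    (3d) If $(\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N) \le 0$, the big-M method was used in Step 2, and $\bar x_i > 0$ for some $i \in \{n+m+1,\dots,n+m+k\}$, then stop, and the LP is infeasible.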

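    Step 4 Improvement: Choose

    \[ j^* = \min\{ j \in N : (\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N)_j > 0 \}, \]

    and

    \[ i^* = \min\left\{ i \in \arg\max_{i \in B} \frac{(\bar A_B^{-1} \bar A_N)_{i,j^*}}{(\bar A_B^{-1} b)_i} \right\}. \]

    (4a) If $\max_{i \in B} \frac{(\bar A_B^{-1} \bar A_N)_{i,j^*}}{(\bar A_B^{-1} b)_i} = +\infty$ or finite positive, then update the basic partition and the fbs

    \[ (B = B \setminus \{i^*\} \cup \{j^*\},\ N = N \setminus \{j^*\} \cup \{i^*\}), \]

    \[ (\bar x_B = \bar A_B^{-1} b,\ \bar x_N = 0), \]

    and go back to Step 3.

    (4b) If $\max_{i \in B} \frac{(\bar A_B^{-1} \bar A_N)_{i,j^*}}{(\bar A_B^{-1} b)_i} = -\infty$, $0$, finite negative, or undefined $\left(\frac{0}{0}\right)$, and the big-M method was not used in Step 2, then stop, and the LP is unbounded.

    (4c) If $\max_{i \in B} \frac{(\bar A_B^{-1} \bar A_N)_{i,j^*}}{(\bar A_B^{-1} b)_i} = -\infty$, $0$, finite negative, or undefined $\left(\frac{0}{0}\right)$, and the big-M method was used in Step 2, then solve the feasibility problem (81). If $t = 0$ is an optimal solution to (81), then the original LP is unbounded.

    (4d) If $\max_{i \in B} \frac{(\bar A_B^{-1} \bar A_N)_{i,j^*}}{(\bar A_B^{-1} b)_i} = -\infty$, $0$, finite negative, or undefined $\left(\frac{0}{0}\right)$, and the big-M method was used in Step 2, then solve the feasibility problem (81). If $t = 0$ is not an optimal solution to (81), then the original LP is infeasible.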

    Figure 5: The simplex algorithm diagram
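The steps above can be sketched in code for the simplest case, Step (2a), where b >= 0 and no big-M is needed. This is only a minimal illustration under that assumption, not the implementation behind the notes: the function name `simplex`, the dictionary layout (`T`, `rhs`, `r`, `z0`), and the use of exact fractions are choices made for this sketch. The entering variable is the smallest index with a positive reduced cost, and the leaving variable is the smallest index among the rows with the largest ratio, as in Step 4.

```python
from fractions import Fraction

def simplex(A, b, c):
    """Maximize c^T x s.t. A x <= b, x >= 0, assuming b >= 0 (Step 2a, no big-M).

    Returns ("optimal", x, zeta) or ("unbounded", None, None).
    """
    m, n = len(A), len(c)
    N = list(range(1, n + 1))            # non-basic indices (1-based)
    B = list(range(n + 1, n + m + 1))    # basic indices: the slack variables
    # Dictionary: x_B = rhs - T x_N and zeta = z0 + r x_N
    T = [[Fraction(v) for v in row] for row in A]
    rhs = [Fraction(v) for v in b]
    r = [Fraction(v) for v in c]
    z0 = Fraction(0)
    while True:
        # Step 3: optimality check on the reduced costs
        positive = [N[q] for q in range(n) if r[q] > 0]
        if not positive:
            x = [Fraction(0)] * (n + m)
            for p, i in enumerate(B):
                x[i - 1] = rhs[p]
            return "optimal", x[:n], z0
        # Step 4: entering variable = smallest index with positive reduced cost
        q = N.index(min(positive))
        best, p_star = None, None
        for p in range(m):
            if T[p][q] <= 0:
                continue  # ratio is <= 0, -infinity, or 0/0: row does not limit the increase
            # (1, 0) stands for a ratio of +infinity (positive entry over a zero value)
            key = (0, T[p][q] / rhs[p]) if rhs[p] > 0 else (1, 0)
            if best is None or key > best or (key == best and B[p] < B[p_star]):
                best, p_star = key, p
        if p_star is None:
            return "unbounded", None, None   # case (4b)
        # Case (4a): pivot, i.e. swap the entering and leaving variables
        p = p_star
        a = T[p][q]
        new_row = [T[p][j] / a for j in range(n)]
        new_row[q] = 1 / a
        new_rhs = rhs[p] / a
        for i in range(m):
            if i != p:
                f = T[i][q]
                T[i] = [T[i][j] - f * new_row[j] for j in range(n)]
                T[i][q] = -f * new_row[q]
                rhs[i] -= f * new_rhs
        f = r[q]
        r = [r[j] - f * new_row[j] for j in range(n)]
        r[q] = -f * new_row[q]
        z0 += f * new_rhs
        T[p], rhs[p] = new_row, new_rhs
        B[p], N[q] = N[q], B[p]

# LP (1)-(4) from the introduction: max 2x1 + 3x2 s.t. x1 + 4x2 <= 8, x1 - x2 <= 1
status, x, zeta = simplex([[1, 4], [1, -1]], [8, 1], [2, 3])
print(status, [float(v) for v in x], float(zeta))
```

On LP (1)-(4) this sketch reproduces the graphical solution from the introduction, x1 = 2.4, x2 = 1.4, and an objective value of 9.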

    Let's look at an example of using the simplex algorithm to solve an LP.

    \[ \begin{array}{rl} \max & 10x_1 + 12x_2 + 12x_3 \\ \text{s. t.} & x_1 + x_2 + x_3 \ge 1 \\ & x_1 + 2x_2 + 2x_3 \le 20 \\ & 2x_1 + x_2 + 2x_3 \le 20 \\ & 2x_1 + 2x_2 + x_3 \le 20 \\ & x_1, x_2, x_3 \ge 0. \end{array} \tag{97} \]

    As is illustrated in Figure 6, the feasible region is within the corner points {A=(1,0,0), B=(0,1,0), E=(0,0,1), F=(10,0,0), C=(0,10,0), G=(0,0,10), D=(4,4,4)}.

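    Step 1

    \[ \begin{array}{rl} \max & \zeta = \begin{bmatrix} 10 & 12 & 12 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 & x_2 & x_3 & w_4 & w_5 & w_6 & w_7 \end{bmatrix}^\top \\ \text{s. t.} & \begin{bmatrix} -1 & -1 & -1 & 1 & 0 & 0 & 0 \\ 1 & 2 & 2 & 0 & 1 & 0 & 0 \\ 2 & 1 & 2 & 0 & 0 & 1 & 0 \\ 2 & 2 & 1 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ w_4 \\ w_5 \\ w_6 \\ w_7 \end{bmatrix} = \begin{bmatrix} -1 \\ 20 \\ 20 \\ 20 \end{bmatrix} \\ & \begin{bmatrix} x_1 & x_2 & x_3 & w_4 & w_5 & w_6 & w_7 \end{bmatrix}^\top \ge 0. \end{array} \]

    Figure 6: Graphic representation of LP (84)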

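    Step 2(b)

    \[ \max\{\zeta = \bar c^\top \bar x : \bar A \bar x = b,\ \bar x \ge 0\}, \]

    where

    \[ \bar c = \begin{bmatrix} 10 & 12 & 12 & 0 & 0 & 0 & 0 & -M \end{bmatrix}^\top,\quad \bar A = \begin{bmatrix} -1 & -1 & -1 & 1 & 0 & 0 & 0 & -1 \\ 1 & 2 & 2 & 0 & 1 & 0 & 0 & 0 \\ 2 & 1 & 2 & 0 & 0 & 1 & 0 & 0 \\ 2 & 2 & 1 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}, \]

    \[ \bar x = \begin{bmatrix} x_1 & x_2 & x_3 & w_4 & w_5 & w_6 & w_7 & t_8 \end{bmatrix}^\top,\quad \text{and}\quad b = \begin{bmatrix} -1 \\ 20 \\ 20 \\ 20 \end{bmatrix}. \]

    Initialize $(B = \{5, 6, 7, 8\},\ N = \{1, 2, 3, 4\})$. The current basic solution $(x_{5,6,7,8} = \bar A_B^{-1} b = \begin{bmatrix} 20 & 20 & 20 & 1 \end{bmatrix}^\top,\ x_{1,2,3,4} = 0)$ corresponds to the yellow corner point O in Figure 6. Since big-M is being used and $t_8 = x_8 = 1 > 0$, this basic solution is not feasible for the original LP. But the big-M method will lead to a feasible basic solution.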

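    Step 3(a)

    \[ \bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N = \begin{bmatrix} 10+M & 12+M & 12+M & -M \end{bmatrix}. \]

    Go to Step 4.

    Step 4(a)

    The dictionary is

    \[ \zeta = -M + \begin{bmatrix} 10+M & 12+M & 12+M & -M \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \]

    \[ \begin{bmatrix} x_5 \\ x_6 \\ x_7 \\ x_8 \end{bmatrix} = \begin{bmatrix} 20 \\ 20 \\ 20 \\ 1 \end{bmatrix} - \begin{bmatrix} 1 & 2 & 2 & 0 \\ 2 & 1 & 2 & 0 \\ 2 & 2 & 1 & 0 \\ 1 & 1 & 1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}. \]

    The entering variable is $x_1$, and the leaving variable is $x_8$. Set $(B = \{5, 6, 7, 1\},\ N = \{8, 2, 3, 4\})$.

    The current fbs $(x_{5,6,7,1} = \bar A_B^{-1} b = \begin{bmatrix} 19 & 18 & 18 & 1 \end{bmatrix}^\top,\ x_{8,2,3,4} = 0)$ corresponds to the blue corner point A in Figure 6. The big-M method has found an fbs.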

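    Step 3(a)

    \[ \bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N = \begin{bmatrix} -M-10 & 2 & 2 & 10 \end{bmatrix}. \]

    Go to Step 4.

    Step 4(a)

    The dictionary is

    \[ \zeta = 10 + \begin{bmatrix} -M-10 & 2 & 2 & 10 \end{bmatrix} \begin{bmatrix} x_8 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \]

    \[ \begin{bmatrix} x_5 \\ x_6 \\ x_7 \\ x_1 \end{bmatrix} = \begin{bmatrix} 19 \\ 18 \\ 18 \\ 1 \end{bmatrix} - \begin{bmatrix} -1 & 1 & 1 & 1 \\ -2 & -1 & 0 & 2 \\ -2 & 0 & -1 & 2 \\ 1 & 1 & 1 & -1 \end{bmatrix} \begin{bmatrix} x_8 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}. \]

    The entering variable is $x_2$, and the leaving variable is $x_1$. Set $(B = \{5, 6, 7, 2\},\ N = \{8, 1, 3, 4\})$.

    The current fbs $(x_{5,6,7,2} = \bar A_B^{-1} b = \begin{bmatrix} 18 & 19 & 18 & 1 \end{bmatrix}^\top,\ x_{8,1,3,4} = 0)$ corresponds to the blue corner point B in Figure 6.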

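    Step 3(a)

    \[ \bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N = \begin{bmatrix} -M-12 & -2 & 0 & 12 \end{bmatrix}. \]

    Go to Step 4.

    Step 4(a)

    The dictionary is

    \[ \zeta = 12 + \begin{bmatrix} -M-12 & -2 & 0 & 12 \end{bmatrix} \begin{bmatrix} x_8 \\ x_1 \\ x_3 \\ x_4 \end{bmatrix} \]

    \[ \begin{bmatrix} x_5 \\ x_6 \\ x_7 \\ x_2 \end{bmatrix} = \begin{bmatrix} 18 \\ 19 \\ 18 \\ 1 \end{bmatrix} - \begin{bmatrix} -2 & -1 & 0 & 2 \\ -1 & 1 & 1 & 1 \\ -2 & 0 & -1 & 2 \\ 1 & 1 & 1 & -1 \end{bmatrix} \begin{bmatrix} x_8 \\ x_1 \\ x_3 \\ x_4 \end{bmatrix}. \]

    The entering variable is $x_4$, and the leaving variable is $x_5$. Set $(B = \{4, 6, 7, 2\},\ N = \{8, 1, 3, 5\})$.

    The current fbs $(x_{4,6,7,2} = \bar A_B^{-1} b = \begin{bmatrix} 9 & 10 & 0 & 10 \end{bmatrix}^\top,\ x_{8,1,3,5} = 0)$ corresponds to the blue corner point C in Figure 6.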

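    Step 3(a)

    \[ \bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N = \begin{bmatrix} -M & 4 & 0 & -6 \end{bmatrix}. \]

    Go to Step 4.

    Step 4(a)

    The dictionary is

    \[ \zeta = 120 + \begin{bmatrix} -M & 4 & 0 & -6 \end{bmatrix} \begin{bmatrix} x_8 \\ x_1 \\ x_3 \\ x_5 \end{bmatrix} \]

    \[ \begin{bmatrix} x_4 \\ x_6 \\ x_7 \\ x_2 \end{bmatrix} = \begin{bmatrix} 9 \\ 10 \\ 0 \\ 10 \end{bmatrix} - \begin{bmatrix} -1 & -0.5 & 0 & 0.5 \\ 0 & 1.5 & 1 & -0.5 \\ 0 & 1 & -1 & -1 \\ 0 & 0.5 & 1 & 0.5 \end{bmatrix} \begin{bmatrix} x_8 \\ x_1 \\ x_3 \\ x_5 \end{bmatrix}. \]

    The entering variable is $x_1$, and the leaving variable is $x_7$. Set $(B = \{4, 6, 1, 2\},\ N = \{8, 7, 3, 5\})$.

    The current fbs $(x_{4,6,1,2} = \bar A_B^{-1} b = \begin{bmatrix} 9 & 10 & 0 & 10 \end{bmatrix}^\top,\ x_{8,7,3,5} = 0)$ corresponds to the blue corner point C in Figure 6. Because of degeneracy, the simplex algorithm has updated the basic partition but it corresponds to the same fbs.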

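    Step 3(a)

    \[ \bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N = \begin{bmatrix} -M & -4 & 4 & -2 \end{bmatrix}. \]

    Go to Step 4.

    Step 4(a)

    The dictionary is

    \[ \zeta = 120 + \begin{bmatrix} -M & -4 & 4 & -2 \end{bmatrix} \begin{bmatrix} x_8 \\ x_7 \\ x_3 \\ x_5 \end{bmatrix} \]

    \[ \begin{bmatrix} x_4 \\ x_6 \\ x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 9 \\ 10 \\ 0 \\ 10 \end{bmatrix} - \begin{bmatrix} -1 & 0.5 & -0.5 & 0 \\ 0 & -1.5 & 2.5 & 1 \\ 0 & 1 & -1 & -1 \\ 0 & -0.5 & 1.5 & 1 \end{bmatrix} \begin{bmatrix} x_8 \\ x_7 \\ x_3 \\ x_5 \end{bmatrix}. \]

    The entering variable is $x_3$, and the leaving variable is $x_6$. Set $(B = \{4, 3, 1, 2\},\ N = \{8, 7, 6, 5\})$.

    The current fbs $(x_{4,3,1,2} = \bar A_B^{-1} b = \begin{bmatrix} 11 & 4 & 4 & 4 \end{bmatrix}^\top,\ x_{8,7,6,5} = 0)$ corresponds to the red corner point D in Figure 6. It will be verified in the next step that this fbs is optimal.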

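    Step 3(c)

    $\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N = \begin{bmatrix} -M & -1.6 & -1.6 & -3.6 \end{bmatrix} \le 0$, the big-M method was used in Step 2, and $x_8 = 0$, so stop, and the optimal solution is $(x^*_{1,2,3,4} = \bar A_B^{-1} b = \begin{bmatrix} 4 & 4 & 4 & 11 \end{bmatrix}^\top,\ x^*_{5,6,7,8} = 0)$, $\zeta^* = \bar c_B^\top \bar A_B^{-1} b = 136$.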
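As a sanity check on the result of this example, the sketch below verifies an optimality certificate for LP (97): the point D = (4, 4, 4) is primal feasible, a dual vector is dual feasible, and both give the same objective value. The dual values used here are an assumption of this sketch: they are read off as the negated final reduced costs of the slack variables x4, ..., x7, that is (0, 3.6, 1.6, 1.6), and the first constraint is multiplied by -1 to bring the LP into the form A x <= b.

```python
from fractions import Fraction as F

# LP (97) written as max c^T x s.t. A x <= b, x >= 0
A = [[-1, -1, -1], [1, 2, 2], [2, 1, 2], [2, 2, 1]]
b = [-1, 20, 20, 20]
c = [10, 12, 12]

x = [4, 4, 4]                          # corner point D
y = [0, F(18, 5), F(8, 5), F(8, 5)]    # negated reduced costs of x4, x5, x6, x7

# Primal feasibility: A x <= b and x >= 0
primal_feasible = all(xj >= 0 for xj in x) and all(
    sum(a * xj for a, xj in zip(row, x)) <= bi for row, bi in zip(A, b))

# Dual feasibility: A^T y >= c and y >= 0
dual_feasible = all(yi >= 0 for yi in y) and all(
    sum(A[i][j] * y[i] for i in range(len(A))) >= c[j] for j in range(len(c)))

primal_value = sum(cj * xj for cj, xj in zip(c, x))
dual_value = sum(bi * yi for bi, yi in zip(b, y))
print(primal_feasible, dual_feasible, primal_value, dual_value)
```

Equal primal and dual objective values at feasible points confirm optimality by weak duality.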

    7 Dual Simplex

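    Let us look at the simplex algorithm from the duality perspective. First, we transform the primal $\max\{c^\top x : Ax \le b;\ x \ge 0\}$ and dual $\min\{b^\top y : A^\top y \ge c;\ y \ge 0\}$ into $\max\{c^\top x : Ax + w = b;\ x, w \ge 0\}$ and $\min\{b^\top y : A^\top y - z = c;\ y, z \ge 0\}$, respectively. Then, the optimality condition becomes

    \[ 0 \le \begin{bmatrix} x \\ w \end{bmatrix} \perp \begin{bmatrix} z \\ y \end{bmatrix} \ge 0. \]

    We can further transform these problems into $\max\{\zeta = \bar c^\top \bar x : \bar A \bar x = b,\ \bar x \ge 0\}$ and $\max\{-\xi = \hat b^\top \hat z : \hat A \hat z = c,\ \hat z \ge 0\}$, where

    \[ \bar c = \begin{bmatrix} c \\ 0_{m\times 1} \end{bmatrix},\ \bar A = [A,\ I_{m\times m}],\ \bar x = \begin{bmatrix} x \\ w \end{bmatrix},\quad \hat b = \begin{bmatrix} 0_{n\times 1} \\ -b \end{bmatrix},\ \hat A = [-I_{n\times n},\ A^\top],\ \hat z = \begin{bmatrix} z \\ y \end{bmatrix}. \]

    For an optimal basic partition $(N, B)$, the optimality condition becomes

    \[ \begin{bmatrix} \bar x_N = 0 \\ \bar x_B \ge 0 \end{bmatrix} \perp \begin{bmatrix} \hat z_N \ge 0 \\ \hat z_B = 0 \end{bmatrix}. \]

    Notice that a non-basic dual variable $\hat z_N \ge 0$ and a basic dual variable $\hat z_B = 0$ may look counter-intuitive, but that is because the non-basis $N$ and basis $B$ are originally defined for the primal problem, thus we need to pay special attention when using the same basic partition on the dual.

    The dictionary can also be defined for the dual problem using a basic partition $(N, B)$:

    \[ -\xi = \hat b_N^\top \hat A_N^{-1} c + (\hat b_B^\top - \hat b_N^\top \hat A_N^{-1} \hat A_B)\hat z_B \tag{98} \]

    \[ \hat z_N = \hat A_N^{-1} c - \hat A_N^{-1} \hat A_B \hat z_B. \tag{99} \]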

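    Sometimes it is more convenient to apply the simplex algorithm to the dual LP instead of the primal. For example, suppose we have a primal LP as follows

    \[ \begin{array}{rl} \max & \zeta = \begin{bmatrix} -1 & -1 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \end{bmatrix}^\top \\ \text{s. t.} & \begin{bmatrix} -2 & -1 \\ -2 & 4 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \le \begin{bmatrix} 4 \\ -8 \\ -7 \end{bmatrix} \\ & \begin{bmatrix} x_1 & x_2 \end{bmatrix}^\top \ge 0. \end{array} \]

    When we apply the simplex algorithm, we need to use the big-M method, because $b = \begin{bmatrix} 4 & -8 & -7 \end{bmatrix}^\top$ has negative elements. However, if we look at the dual

    \[ \begin{array}{rl} \min & \xi = \begin{bmatrix} 4 & -8 & -7 \end{bmatrix} \begin{bmatrix} y_3 & y_4 & y_5 \end{bmatrix}^\top \\ \text{s. t.} & \begin{bmatrix} -2 & -2 & -1 \\ -1 & 4 & 3 \end{bmatrix} \begin{bmatrix} y_3 \\ y_4 \\ y_5 \end{bmatrix} \ge \begin{bmatrix} -1 \\ -1 \end{bmatrix} \\ & \begin{bmatrix} y_3 & y_4 & y_5 \end{bmatrix}^\top \ge 0, \end{array} \]

    whose standard form LP is

    \[ \begin{array}{rl} \max & -\xi = \begin{bmatrix} -4 & 8 & 7 \end{bmatrix} \begin{bmatrix} y_3 & y_4 & y_5 \end{bmatrix}^\top \\ \text{s. t.} & \begin{bmatrix} 2 & 2 & 1 \\ 1 & -4 & -3 \end{bmatrix} \begin{bmatrix} y_3 \\ y_4 \\ y_5 \end{bmatrix} \le \begin{bmatrix} 1 \\ 1 \end{bmatrix} \\ & \begin{bmatrix} y_3 & y_4 & y_5 \end{bmatrix}^\top \ge 0, \end{array} \]

    we can easily find an initial fbs to start the simplex algorithm. Now let us apply the simplex algorithm to the dual and at the same time keep track of what happens to the primal.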

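    The following compares applying the simplex algorithm to the dual with what happens to the primal.

    Dual problem:

    \[ \begin{array}{rl} \max & -\xi = \begin{bmatrix} 0 & 0 & -4 & 8 & 7 \end{bmatrix} \begin{bmatrix} z_1 & z_2 & y_3 & y_4 & y_5 \end{bmatrix}^\top \\ \text{s. t.} & \begin{bmatrix} -1 & 0 & -2 & -2 & -1 \\ 0 & -1 & -1 & 4 & 3 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \end{bmatrix} \\ & \begin{bmatrix} z_1 & z_2 & y_3 & y_4 & y_5 \end{bmatrix}^\top \ge 0, \end{array} \]

    that is, $\max\{-\xi = \hat b^\top \hat z : \hat A \hat z = c,\ \hat z \ge 0\}$, where

    \[ \hat b = \begin{bmatrix} 0 & 0 & -4 & 8 & 7 \end{bmatrix}^\top,\quad \hat A = \begin{bmatrix} -1 & 0 & -2 & -2 & -1 \\ 0 & -1 & -1 & 4 & 3 \end{bmatrix},\quad \hat z = \begin{bmatrix} z_1 & z_2 & y_3 & y_4 & y_5 \end{bmatrix}^\top. \]

    Primal problem:

    \[ \begin{array}{rl} \max & \zeta = \begin{bmatrix} -1 & -1 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 & x_2 & w_3 & w_4 & w_5 \end{bmatrix}^\top \\ \text{s. t.} & \begin{bmatrix} -2 & -1 & 1 & 0 & 0 \\ -2 & 4 & 0 & 1 & 0 \\ -1 & 3 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ w_3 \\ w_4 \\ w_5 \end{bmatrix} = \begin{bmatrix} 4 \\ -8 \\ -7 \end{bmatrix} \\ & \begin{bmatrix} x_1 & x_2 & w_3 & w_4 & w_5 \end{bmatrix}^\top \ge 0, \end{array} \]

    that is, $\max\{\zeta = \bar c^\top \bar x : \bar A \bar x = b,\ \bar x \ge 0\}$, where

    \[ \bar c = \begin{bmatrix} -1 & -1 & 0 & 0 & 0 \end{bmatrix}^\top,\quad \bar A = \begin{bmatrix} -2 & -1 & 1 & 0 & 0 \\ -2 & 4 & 0 & 1 & 0 \\ -1 & 3 & 0 & 0 & 1 \end{bmatrix},\quad \bar x = \begin{bmatrix} x_1 & x_2 & w_3 & w_4 & w_5 \end{bmatrix}^\top. \]

    Dual dictionary:

    \[ -\xi = \hat b_N^\top \hat A_N^{-1} c + (\hat b_B^\top - \hat b_N^\top \hat A_N^{-1} \hat A_B)\hat z_B, \qquad \hat z_N = \hat A_N^{-1} c - \hat A_N^{-1} \hat A_B \hat z_B. \]

    Primal dictionary:

    \[ \zeta = \bar c_B^\top \bar A_B^{-1} b + (\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N)\bar x_N, \qquad \bar x_B = \bar A_B^{-1} b - \bar A_B^{-1} \bar A_N \bar x_N. \]

    Iteration 1: $B = \{3, 4, 5\},\ N = \{1, 2\}$

    Dual:

    \[ -\xi = 0 + \begin{bmatrix} -4 & 8 & 7 \end{bmatrix} \begin{bmatrix} z_3 \\ z_4 \\ z_5 \end{bmatrix}, \qquad \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 2 & 2 & 1 \\ 1 & -4 & -3 \end{bmatrix} \begin{bmatrix} z_3 \\ z_4 \\ z_5 \end{bmatrix} \]

    Primal:

    \[ \zeta = 0 + \begin{bmatrix} -1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad \begin{bmatrix} x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} 4 \\ -8 \\ -7 \end{bmatrix} - \begin{bmatrix} -2 & -1 \\ -2 & 4 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

    Iteration 2: $B = \{3, 1, 5\},\ N = \{4, 2\}$

    Dual:

    \[ -\xi = 4 + \begin{bmatrix} -12 & -4 & 3 \end{bmatrix} \begin{bmatrix} z_3 \\ z_1 \\ z_5 \end{bmatrix}, \qquad \begin{bmatrix} z_4 \\ z_2 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 3 \end{bmatrix} - \begin{bmatrix} 1 & 0.5 & 0.5 \\ 5 & 2 & -1 \end{bmatrix} \begin{bmatrix} z_3 \\ z_1 \\ z_5 \end{bmatrix} \]

    Primal:

    \[ \zeta = -4 + \begin{bmatrix} -0.5 & -3 \end{bmatrix} \begin{bmatrix} x_4 \\ x_2 \end{bmatrix}, \qquad \begin{bmatrix} x_3 \\ x_1 \\ x_5 \end{bmatrix} = \begin{bmatrix} 12 \\ 4 \\ -3 \end{bmatrix} - \begin{bmatrix} -1 & -5 \\ -0.5 & -2 \\ -0.5 & 1 \end{bmatrix} \begin{bmatrix} x_4 \\ x_2 \end{bmatrix} \]

    Iteration 3: $B = \{3, 1, 4\},\ N = \{5, 2\}$

    Dual:

    \[ -\xi = 7 + \begin{bmatrix} -18 & -7 & -6 \end{bmatrix} \begin{bmatrix} z_3 \\ z_1 \\ z_4 \end{bmatrix}, \qquad \begin{bmatrix} z_5 \\ z_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 4 \end{bmatrix} - \begin{bmatrix} 2 & 1 & 2 \\ 7 & 3 & 2 \end{bmatrix} \begin{bmatrix} z_3 \\ z_1 \\ z_4 \end{bmatrix} \]

    Primal:

    \[ \zeta = -7 + \begin{bmatrix} -1 & -4 \end{bmatrix} \begin{bmatrix} x_5 \\ x_2 \end{bmatrix}, \qquad \begin{bmatrix} x_3 \\ x_1 \\ x_4 \end{bmatrix} = \begin{bmatrix} 18 \\ 7 \\ 6 \end{bmatrix} - \begin{bmatrix} -2 & -7 \\ -1 & -3 \\ -2 & -2 \end{bmatrix} \begin{bmatrix} x_5 \\ x_2 \end{bmatrix} \]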

    From the above table, we have some observations: (1) If $b$ has a negative element but $c \le 0$, we can avoid the big-M method by applying the simplex algorithm to the dual. (2) When we apply the simplex algorithm to the dual, the reduced costs $(\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N)$ in the corresponding primal dictionaries are always non-positive but the basic variables $\bar x_B = \bar A_B^{-1} b$ are always infeasible until the last iteration. (3) The primal and dual dictionaries have a negative transpose relationship:

    \[ \begin{bmatrix} \hat b_N^\top \hat A_N^{-1} c & (\hat b_B^\top - \hat b_N^\top \hat A_N^{-1} \hat A_B) \\ \hat A_N^{-1} c & -\hat A_N^{-1} \hat A_B \end{bmatrix} = -\begin{bmatrix} \bar c_B^\top \bar A_B^{-1} b & (\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N) \\ \bar A_B^{-1} b & -\bar A_B^{-1} \bar A_N \end{bmatrix}^\top. \]

    Based on the above observations, we could design a special rule of selecting the entering and leaving variables so that we could go through the iterations on the primal side (the primal dictionaries above) without writing down the dual iterations. An algorithm that applies this idea is called the dual simplex algorithm.

    Although point (3) has been observed throughout all iterations of the example, its theoretical correctness is not obvious. The following table defines the problem and then the observation is given as a proposition.

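    Primal: $\max\ \zeta = c^\top x$ s. t. $Ax \le b$, $x \ge 0$.
    Dual: $\min\ \xi = b^\top y$ s. t. $A^\top y \ge c$, $y \ge 0$.

    With slack variables:
    Primal: $\max\ \zeta = c^\top x$ s. t. $Ax + w = b$; $x, w \ge 0$.
    Dual: $\max\ -\xi = -b^\top y$ s. t. $A^\top y - z = c$; $y, z \ge 0$.

    Define (primal):

    \[ \bar c = \begin{bmatrix} c \\ 0_{m\times 1} \end{bmatrix} := \begin{bmatrix} \bar c_N \\ \bar c_B \end{bmatrix},\quad \bar A = [A,\ I_{m\times m}] := [\bar A_N,\ \bar A_B],\quad \bar x = \begin{bmatrix} x \\ w \end{bmatrix} := \begin{bmatrix} \bar x_N \\ \bar x_B \end{bmatrix}. \]

    Define (dual):

    \[ \hat b = \begin{bmatrix} 0_{n\times 1} \\ -b \end{bmatrix} := \begin{bmatrix} \hat b_N \\ \hat b_B \end{bmatrix},\quad \hat A = [-I_{n\times n},\ A^\top] := [\hat A_N,\ \hat A_B],\quad \hat z = \begin{bmatrix} z \\ y \end{bmatrix} := \begin{bmatrix} \hat z_N \\ \hat z_B \end{bmatrix}. \]

    Compact form:
    Primal: $\max\{\zeta = \bar c^\top \bar x : \bar A \bar x = b,\ \bar x \ge 0\}$.
    Dual: $\max\{-\xi = \hat b^\top \hat z : \hat A \hat z = c,\ \hat z \ge 0\}$.

    Partitioned form:
    Primal: $\max\ \zeta = \bar c_N^\top \bar x_N + \bar c_B^\top \bar x_B$ s. t. $\bar A_N \bar x_N + \bar A_B \bar x_B = b$, $\bar x \ge 0$.
    Dual: $\max\ -\xi = \hat b_N^\top \hat z_N + \hat b_B^\top \hat z_B$ s. t. $\hat A_N \hat z_N + \hat A_B \hat z_B = c$, $\hat z \ge 0$.

    Primal dictionary:

    \[ \zeta = \bar c_B^\top \bar A_B^{-1} b + (\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N)\bar x_N, \qquad \bar x_B = \bar A_B^{-1} b - \bar A_B^{-1} \bar A_N \bar x_N. \]

    Dual dictionary:

    \[ -\xi = \hat b_N^\top \hat A_N^{-1} c + (\hat b_B^\top - \hat b_N^\top \hat A_N^{-1} \hat A_B)\hat z_B, \qquad \hat z_N = \hat A_N^{-1} c - \hat A_N^{-1} \hat A_B \hat z_B. \]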

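    Proposition 5. The primal and dual dictionaries derived above have a negative-transpose relationship:

    \[ \begin{bmatrix} \bar c_B^\top \bar A_B^{-1} b & (\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N) \\ \bar A_B^{-1} b & -\bar A_B^{-1} \bar A_N \end{bmatrix} = -\begin{bmatrix} \hat b_N^\top \hat A_N^{-1} c & (\hat b_B^\top - \hat b_N^\top \hat A_N^{-1} \hat A_B) \\ \hat A_N^{-1} c & -\hat A_N^{-1} \hat A_B \end{bmatrix}^\top. \]

    Proof. Bottom right: $\bar A_B^{-1} \bar A_N = -(\hat A_N^{-1} \hat A_B)^\top$. First, we have

    \[ \bar A \hat A^\top = [A,\ I][-I,\ A^\top]^\top = 0 \quad\text{and}\quad \bar A \hat A^\top = [\bar A_N,\ \bar A_B][\hat A_N,\ \hat A_B]^\top = \bar A_N \hat A_N^\top + \bar A_B \hat A_B^\top = 0. \]

    Multiplying both sides of the last equation by $\bar A_B^{-1}$ on the left and $(\hat A_N^\top)^{-1}$ on the right, we have

    \[ \bar A_B^{-1} [\bar A_N \hat A_N^\top + \bar A_B \hat A_B^\top] (\hat A_N^\top)^{-1} = \bar A_B^{-1} \bar A_N + \hat A_B^\top (\hat A_N^\top)^{-1} = 0. \]

    Bottom left: $\bar A_B^{-1} b = -(\hat b_B^\top - \hat b_N^\top \hat A_N^{-1} \hat A_B)^\top$.

    Top right: $(\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N) = -(\hat A_N^{-1} c)^\top$.

    Top left: $\bar c_B^\top \bar A_B^{-1} b = -(\hat b_N^\top \hat A_N^{-1} c)^\top$.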

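    From the above point, we can see that in the primal dictionary

    \[ \zeta = \bar c_B^\top \bar A_B^{-1} b + (\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N)\bar x_N \]

    \[ \bar x_B = \bar A_B^{-1} b - \bar A_B^{-1} \bar A_N \bar x_N, \]

    the optimality condition in the simplex algorithm $(\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N) \le 0$ is actually the dual feasibility $\hat z_N = \hat A_N^{-1} c \ge 0$. When the optimal dictionary is obtained, we get not only the primal optimal solution $(\bar x_B^* = \bar A_B^{-1} b,\ \bar x_N^* = 0)$ but also the dual optimal solution $(\hat z_N^* = \hat A_N^{-1} c = -(\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N)^\top,\ \hat z_B^* = 0)$. Complementary slackness is satisfied since

    \[ \begin{bmatrix} \bar x_B^* \ge 0 \\ \bar x_N^* = 0 \end{bmatrix} \perp \begin{bmatrix} \hat z_B^* = 0 \\ \hat z_N^* \ge 0 \end{bmatrix}. \]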

    An alternative to the big-M method is the following. Consider

    \[ \text{(P0):}\quad \max\{c^\top x : Ax \le b,\ x \ge 0\} \]

    and its dual

    \[ \text{(D0):}\quad \min\{b^\top y : A^\top y \ge c,\ y \ge 0\}. \]

    Step 1 If $b \ge 0$, then apply the simplex algorithm on (P0) with the initial feasible basic solution $x = 0$. Otherwise go to Step 2.

    Step 2 If $c \le 0$, then apply the simplex algorithm on (D0) with the initial feasible basic solution $y = 0$. Otherwise go to Step 3.

    Step 3 Construct the following LP:

    \[ \text{(P1):}\quad \max\{0 : Ax \le b,\ x \ge 0\} \]

    and its dual

    \[ \text{(D1):}\quad \min\{b^\top y : A^\top y \ge 0,\ y \ge 0\}. \]

    Apply the simplex algorithm on (D1) with the initial feasible basic solution $y = 0$. If (D1) is optimal, go to Step 4; if (D1) is unbounded, go to Step 5.

    Step 4 Let $y^1$ and $x^1$ be optimal solutions to (D1) and (P1), respectively. Apply the simplex algorithm on (P0) with $x^1$ as the initial feasible basic solution.

    Step 5 If (D1) is unbounded, then (P1) is infeasible, thus (P0) is also infeasible.
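The branching in Steps 1 to 3 depends only on the signs of b and c, which the following sketch makes explicit. The function name `choose_start` and the returned labels are hypothetical and serve only as an illustration.

```python
def choose_start(b, c):
    """Pick the starting strategy of Steps 1-3 for max{c^T x : A x <= b, x >= 0}."""
    if all(bi >= 0 for bi in b):
        return "simplex on (P0) from x = 0"       # Step 1
    if all(cj <= 0 for cj in c):
        return "simplex on (D0) from y = 0"       # Step 2
    return "simplex on (D1) from y = 0"           # Step 3

print(choose_start([8, 1], [2, 3]))
print(choose_start([4, -8, -7], [-1, -1]))
print(choose_start([-1, 20], [1, -1]))
```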

    For example, we have the following (P0) with b1 < 0 and c2 > 0.

    max =[14 6

    ] [x1 x2

    ]>s. t.

    7 167 913 16

    [ x1x2

    ]

    478

    [x1 x2

    ]> 0.We construct (D1) as

    min =[4 7 8

    ] [y3 y4 y5

    ]>s. t.

    [7 7 1316 9 16

    ] y3y4y5

    [ 00

    ][y3 y4 y5

    ]> 0.We solve (D1), and it turns out to be unbounded, which means that (P0) is infeasible.

    8 Sensitivity Analysis

    Suppose we have found an optimal solution $x^*$ to an LP $\max\{c^\top x : Ax \le b,\ x \ge 0\}$ using the simplex algorithm, and then we change some of the parameters $(A, b, c)$ and have a new LP. We have the following questions: Is the optimal basic partition for the original LP still optimal to the new LP? If the answer is yes, then the optimal solution for the new LP can be easily obtained. If the answer is no, do we need to solve the new LP from the very beginning, or can we use the original LP's optimal solution to find the new LP's optimal solution faster? This type of post-optimality analysis is called sensitivity analysis. We focus our discussion on the following four cases.

    8.1 Changing the objective coefficients: $c \to c + \lambda \Delta c$

    Changing the objective coefficients will not affect the primal feasibility, but could possibly affect the dual feasibility. The new dual dictionary is

    \[ -\xi = \hat b_N^\top \hat A_N^{-1} (c + \lambda \Delta c) + (\hat b_B^\top - \hat b_N^\top \hat A_N^{-1} \hat A_B)\hat z_B \]

    \[ \hat z_N = \hat A_N^{-1} (c + \lambda \Delta c) - \hat A_N^{-1} \hat A_B \hat z_B. \]

    The range of $\lambda$ for the original LP's optimal basic partition to remain optimal to the new LP can be obtained by solving the following inequality:

    \[ \hat A_N^{-1} (c + \lambda \Delta c) \ge 0. \]

    If $\hat A_N^{-1} (c + \lambda \Delta c) \ge 0$ holds, then the original LP's optimal basic partition is still optimal to the new LP.

    Otherwise, the new dictionary doesn't have dual feasibility, but primal feasibility remains. We can find the optimal solution for the new LP by simply applying the simplex iterations to the new primal dictionary. Since we don't need to start over from Step 1 of the simplex algorithm, we could find the optimal solution to the new LP faster.

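    For example, consider the following LP:

    \[ \begin{array}{rl} \max & \zeta = \begin{bmatrix} -1 & -1 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \end{bmatrix}^\top \\ \text{s. t.} & \begin{bmatrix} -2 & -1 \\ -2 & 4 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \le \begin{bmatrix} 4 \\ -8 \\ -7 \end{bmatrix} \\ & \begin{bmatrix} x_1 & x_2 \end{bmatrix}^\top \ge 0. \end{array} \]

    The optimal basic partition is $(B = \{3, 1, 4\},\ N = \{5, 2\})$. Suppose we change $c = \begin{bmatrix} -1 & -1 \end{bmatrix}^\top$ to $c + \lambda \Delta c$ with $\Delta c = \begin{bmatrix} 1 & 0 \end{bmatrix}^\top$, and we want to find the range of $\lambda$ that keeps the partition optimal. The dual dictionary is:

    \[ -\xi = 7 - 7\lambda + \begin{bmatrix} -18 & -7 & -6 \end{bmatrix} \begin{bmatrix} z_3 \\ z_1 \\ z_4 \end{bmatrix} \]

    \[ \begin{bmatrix} z_5 \\ z_2 \end{bmatrix} = \begin{bmatrix} 1 - \lambda \\ 4 - 3\lambda \end{bmatrix} - \begin{bmatrix} 2 & 1 & 2 \\ 7 & 3 & 2 \end{bmatrix} \begin{bmatrix} z_3 \\ z_1 \\ z_4 \end{bmatrix}. \]

    For the partition to remain optimal, we need $1 - \lambda \ge 0$ and $4 - 3\lambda \ge 0$, thus $\lambda \le 1$.

    Suppose $\lambda = 1.1$, then the primal dictionary becomes:

    \[ \zeta = 0.7 + \begin{bmatrix} 0.1 & -0.7 \end{bmatrix} \begin{bmatrix} x_5 \\ x_2 \end{bmatrix} \]

    \[ \begin{bmatrix} x_3 \\ x_1 \\ x_4 \end{bmatrix} = \begin{bmatrix} 18 \\ 7 \\ 6 \end{bmatrix} - \begin{bmatrix} -2 & -7 \\ -1 & -3 \\ -2 & -2 \end{bmatrix} \begin{bmatrix} x_5 \\ x_2 \end{bmatrix}. \]

    The reduced cost of $x_5$ is positive, and no basic variable limits its increase because every entry in its column is negative. We can then conclude that changing $c$ from $\begin{bmatrix} -1 & -1 \end{bmatrix}^\top$ to $\begin{bmatrix} 0.1 & -1 \end{bmatrix}^\top$ would make the LP unbounded.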

    8.2 Changing the right-hand-side: $b \to b + \lambda \Delta b$

    Changing the right-hand-side will not affect the dual feasibility, but could possibly affect the primal feasibility. The new primal dictionary is

    \[ \zeta = \bar c_B^\top \bar A_B^{-1} (b + \lambda \Delta b) + (\bar c_N^\top - \bar c_B^\top \bar A_B^{-1} \bar A_N)\bar x_N \]

    \[ \bar x_B = \bar A_B^{-1} (b + \lambda \Delta b) - \bar A_B^{-1} \bar A_N \bar x_N. \]

    The range of $\lambda$ for the original LP's optimal basic partition to remain optimal to the new LP can be obtained by solving the following inequality:

    \[ \bar A_B^{-1} (b + \lambda \Delta b) \ge 0. \]

    If $\bar A_B^{-1} (b + \lambda \Delta b) \ge 0$ holds, then the original LP's optimal basic partition is still optimal to the new LP. The new optimal primal solution becomes $(\bar x_B^* = \bar A_B^{-1} (b + \lambda \Delta b),\ \bar x_N^* = 0)$, but the optimal dual solution stays the same. From strong duality, the new optimal objective value is $\zeta^* = (b + \lambda \Delta b)^\top y^*$, where $y^*$ is the optimal dual solution to the original problem. As long as the basic partition stays optimal, $y^*$ gives the rate at which a change in $b$ changes the objective value. Therefore, $y^*$ is also known as the shadow price.

    Otherwise, if $\bar A_B^{-1} (b + \lambda \Delta b) \ge 0$ doesn't hold, the new dictionary doesn't have primal feasibility, but dual feasibility remains. We can find the optimal solution for the new LP by applying the simplex iterations to the dual dictionary. Since we don't need to start over from Step 1 of the simplex algorithm, we could find the optimal solution to the new LP faster.

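    Suppose we change $b = \begin{bmatrix} 4 & -8 & -7 \end{bmatrix}^\top$ to $b + \lambda \Delta b$ with $\Delta b = \begin{bmatrix} -6 & -5 & -2 \end{bmatrix}^\top$, and we want to find the range of $\lambda$ that keeps the partition optimal. The primal dictionary is:

    \[ \zeta = -7 - 2\lambda + \begin{bmatrix} -1 & -4 \end{bmatrix} \begin{bmatrix} x_5 \\ x_2 \end{bmatrix} \]

    \[ \begin{bmatrix} x_3 \\ x_1 \\ x_4 \end{bmatrix} = \begin{bmatrix} 18 - 2\lambda \\ 7 + 2\lambda \\ 6 - \lambda \end{bmatrix} - \begin{bmatrix} -2 & -7 \\ -1 & -3 \\ -2 & -2 \end{bmatrix} \begin{bmatrix} x_5 \\ x_2 \end{bmatrix}. \]

    For the partition to remain optimal, we need $18 - 2\lambda \ge 0$, $7 + 2\lambda \ge 0$, and $6 - \lambda \ge 0$, thus $-3.5 \le \lambda \le 6$.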
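Both ranging computations, the one for the objective coefficients in Section 8.1 and the one for the right-hand-side above, reduce to finding the values of the parameter for which a vector u + λv stays non-negative. A minimal sketch follows. The function names `ranging` and `objective_value` are hypothetical, the parameter is called λ (`lam`) here as an assumption about the notation, and the vectors u, v are the constant columns read off from the two dictionaries.

```python
import math
from fractions import Fraction as F

def ranging(u, v):
    """Interval (lo, hi) of lam for which u + lam * v >= 0 componentwise, or None if empty."""
    lo, hi = -math.inf, math.inf
    for ui, vi in zip(u, v):
        if vi > 0:
            lo = max(lo, F(-ui, vi))   # lam >= -ui / vi
        elif vi < 0:
            hi = min(hi, F(-ui, vi))   # lam <= -ui / vi
        elif ui < 0:
            return None                # violated for every lam
    return (lo, hi) if lo <= hi else None

# Section 8.1: z_N = (1 - lam, 4 - 3 lam) must stay non-negative
print(ranging([1, 4], [-1, -3]))
# Section 8.2: x_B = (18 - 2 lam, 7 + 2 lam, 6 - lam) must stay non-negative
print(ranging([18, 7, 6], [-2, 2, -1]))

# Shadow prices: within the range, zeta* = (b + lam * db)^T y* with y* = (0, 0, 1)
b, db, y_star = [4, -8, -7], [-6, -5, -2], [0, 0, 1]

def objective_value(lam):
    return sum((bi + lam * dbi) * yi for bi, dbi, yi in zip(b, db, y_star))

print(objective_value(0), objective_value(6))
```

The shadow-price formula agrees with the constant term -7 - 2λ of the primal dictionary above.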

    8.3 Adding k new constraints:

    \[ A \to A' = \begin{bmatrix} A \\ a_{m+1}^\top \\ \vdots \\ a_{m+k}^\top \end{bmatrix} \quad\text{and}\quad b \to b' = \begin{bmatrix} b \\ b_{m+1} \\ \vdots \\ b_{m+k} \end{bmatrix} \]

    Corresponding to the $k$ new constraints, there should be $k$ new slack variables in the primal problem and $k$ new dual variables in the dual. If we add all the $k$ new variables into the basis $B' = B \cup \{n+m+1,\dots,n+m+k\}$ and look at the new dual dictionary

    \[ -\xi = \hat b_N^\top \hat A_N^{-1} c + (\hat b_{B'}^{\prime\top} - \hat b_N^\top \hat A_N^{-1} \hat A'_{B'})\hat z_{B'} \]

    \[ \hat z_N = \hat A_N^{-1} c - \hat A_N^{-1} \hat A'_{B'} \hat z_{B'}, \]

    the dual feasibility will not be affected, but primal feasibility could be affected.

    If the original optimal solution $x^*$ satisfies the new constraints:

    \[ \begin{bmatrix} a_{m+1}^\top \\ \vdots \\ a_{m+k}^\top \end{bmatrix} x^* \le \begin{bmatrix} b_{m+1} \\ \vdots \\ b_{m+k} \end{bmatrix}, \]

    then $x^*$ is still optimal.

    Otherwise, we can find the optimal solution for the new LP by applying the simplex iterations to the new dual dictionary. Since we don't need to start over from Step 1 of the simplex algorithm, we could find the optimal solution to the new LP faster.

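    Suppose we add a new constraint $x_1 + x_2 \le 6$. The optimal solution to the original LP is $x_1^* = 7$, $x_2^* = 0$, which violates the new constraint. Therefore, the original optimal basic partition is no longer optimal. We update the dual dictionary as follows:

    \[ -\xi = 7 + \begin{bmatrix} -18 & -7 & -6 & 1 \end{bmatrix} \begin{bmatrix} z_3 \\ z_1 \\ z_4 \\ z_6 \end{bmatrix} \]

    \[ \begin{bmatrix} z_5 \\ z_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 4 \end{bmatrix} - \begin{bmatrix} 2 & 1 & 2 & -1 \\ 7 & 3 & 2 & -4 \end{bmatrix} \begin{bmatrix} z_3 \\ z_1 \\ z_4 \\ z_6 \end{bmatrix}. \]

    Obviously, the dual LP is unbounded, thus the primal is infeasible with the new constraint.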
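The infeasibility can be confirmed with a certificate taken from the unbounded direction of the dual. Increasing z6 by t in the dual dictionary raises z5 by t as well, so the dual variables move along d = (0, 0, 1, 1) in the order (y3, y4, y5, y6). If d >= 0, A'^T d >= 0 and b'^T d < 0, then no x >= 0 can satisfy A' x <= b', because that would give 0 <= d^T A' x <= d^T b' < 0. The sketch below only checks these three conditions; the variable names are illustrative, and the data follow the example with signs as implied by the dictionaries.

```python
# Original three constraints plus the new one x1 + x2 <= 6
A_new = [[-2, -1], [-2, 4], [-1, 3], [1, 1]]
b_new = [4, -8, -7, 6]
d = [0, 0, 1, 1]   # dual ray: y5 and y6 grow together as z6 increases

# A'^T d, one entry per primal variable
aTd = [sum(A_new[i][j] * d[i] for i in range(len(A_new))) for j in range(2)]
# b'^T d
bTd = sum(bi * di for bi, di in zip(b_new, d))

is_certificate = all(di >= 0 for di in d) and all(v >= 0 for v in aTd) and bTd < 0
print(aTd, bTd, is_certificate)
```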

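    8.4 Adding k new variables:

    \[ x \to x' = \begin{bmatrix} x \\ x_{n+1} \\ \vdots \\ x_{n+k} \end{bmatrix},\quad A \to A' = \begin{bmatrix} A & a_{n+1} & \cdots & a_{n+k} \end{bmatrix},\quad\text{and}\quad c \to c' = \begin{bmatrix} c \\ c_{n+1} \\ \vdots \\ c_{n+k} \end{bmatrix} \]

    If we add all the $k$ new variables into the non-basis $N' = N \cup \{n+m+1,\dots,n+m+k\}$, the primal feasibility will not be affected, but dual feasibility could be affected. The new primal dictionary is

    \[ \zeta = \bar c_B^\top \bar A_B^{-1} b + (\bar c_{N'}^{\prime\top} - \bar c_B^\top \bar A_B^{-1} \bar A'_{N'})\bar x_{N'} \]

    \[ \bar x_B = \bar A_B^{-1} b - \bar A_B^{-1} \bar A'_{N'} \bar x_{N'}. \]

    If the original dual optimal solution $y^*$ satisfies the new constraints:

    \[ \begin{bmatrix} a_{n+1}^\top \\ \vdots \\ a_{n+k}^\top \end{bmatrix} y^* \ge \begin{bmatrix} c_{n+1} \\ \vdots \\ c_{n+k} \end{bmatrix}, \]

    then $(x^*, y^*)$ is still optimal.

    Otherwise, we can find the optimal solution for the new LP by simply applying the simplex iterations to the new primal dictionary. Since we don't need to start over from Step 1 of the simplex algorithm, we could find the optimal solution to the new LP faster.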

    Suppose we add a new variable x3, expand A to A =

    2 1 32 4 21 3 1

    , and change cto c =

    [1 1 1

    ]>. Since the dual optimal solution y = [ 0 0 1 ]> satisfies theconstraint

    [3 2 1

    ]y 1, we conclude that the optimal solution to the original LP

    remains optimal with x3 = 0.

    9 Decomposition

    Special techniques can be applied to solve some large-scale LP problems with special structures, which would be much slower or even impossible to solve otherwise. Three examples of these techniques, namely Benders decomposition, Dantzig-Wolfe decomposition, and column generation, are briefly introduced here.

    9.1 Benders decomposition

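    Consider the following LP problem:

    \[ \begin{array}{rl} \max & \zeta = \begin{bmatrix} c_0^\top & c_1^\top & c_2^\top & \cdots & c_k^\top \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix} \\ \text{s. t.} & \begin{bmatrix} A_0 & 0 & 0 & \cdots & 0 \\ A_1 & D_1 & 0 & \cdots & 0 \\ A_2 & 0 & D_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ A_k & 0 & 0 & \cdots & D_k \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix} \le \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ \vdots \\ b_k \end{bmatrix} \\ & \begin{bmatrix} x_0 & x_1 & x_2 & \cdots & x_k \end{bmatrix}^\top \ge 0, \end{array} \tag{100} \]

    where $c_i \in \mathbb{R}^{n_i \times 1}$, $A_i \in \mathbb{R}^{m_i \times n_0}$, $b_i \in \mathbb{R}^{m_i \times 1}$, $x_i \in \mathbb{R}^{n_i \times 1}$ for $i = 0, 1, \dots, k$, and $D_i \in \mathbb{R}^{m_i \times n_i}$ for $i = 1, 2, \dots, k$. If $k$, the $m_i$'s and the $n_i$'s are large, this problem could be extremely hard to solve due to its dimension. However, we can take advantage of its special structure and solve this large problem by solving much smaller problems iteratively as follows.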

    Step 1 Solve the following master problem:

    \[ \begin{array}{rl} \max & \zeta = c_0^\top x_0 + z_1 + z_2 + \cdots + z_k \\ \text{s. t.} & A_0 x_0 \le b_0 \\ & \text{Constraints (106) and (108) from Step 2.} \\ & x_0 \ge 0;\ z_1, z_2, \dots, z_k \text{ free.} \end{array} \tag{101} \]

    If the master problem is infeasible, then we stop and conclude that the original LP (100) is infeasible. Otherwise, let $(x_0^*, z_1^*, z_2^*, \dots, z_k^*)$ be the optimal solution to (101), which could possibly include $+\infty$ values. For example, in the first few steps, there may be no constraints to keep some $z_i$'s from approaching $+\infty$. In that case, we simply let those $z_i$'s be $+\infty$ and only maximize over the constrained variables. Here $z_i^*$ represents an optimistic estimate of $c_i^\top x_i^*$ in (100).

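    Step 2 For $i = 1, 2, \dots, k$, solve the following subproblems:

    \[ \begin{array}{rl} \max & \zeta_i = c_i^\top x_i \\ \text{s. t.} & D_i x_i \le b_i - A_i x_0^* \\ & x_i \ge 0. \end{array} \tag{102} \]

    These subproblems can be solved in parallel to save computation time. The results of these subproblems fall into one of the following three cases.

    (Case I:) If (102) is infeasible, then there must exist an extreme ray for the dual to (102)

    \[ \begin{array}{rl} \min & \xi_i = (b_i - A_i x_0^*)^\top y_i \\ \text{s. t.} & D_i^\top y_i \ge c_i \\ & y_i \ge 0. \end{array} \tag{103} \]

    We can obtain an extreme ray $y_i^*$ of (103) by solving the following problem:

    \[ \begin{array}{rl} \min & (b_i - A_i x_0^*)^\top y_i \\ \text{s. t.} & D_i^\top y_i \ge 0 \\ & 0_{m_i \times 1} \le y_i \le 1_{m_i \times 1}. \end{array} \tag{104} \]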

    We should send a message back to the master problem that $x_0$ should not make the subproblems infeasible. The ideal message would be

    \[ (b_i - A_i x_0)^\top y_i^*(x_0) \ge 0, \tag{105} \]

    where yi(x0) is the optimal solutio