Optimal Design of Experiments - Andreas Weigend, Social Data · PDF file 2008. 2....


  • © Crary Group, 2004 1/31

    Optimal Design of Experiments presentation for Stanford University Statistics 252 class

    Data Mining and Electronic Business

    July 7, 2004

    Selden B. Crary, Ph.D.

    Crary Group, Palo Alto

    [email protected]

    WebDOE™WebDOE™

  • © Crary Group, 2004 2/31

    Design of Experiments Makes a Difference

    Weighing two apples

    Simple method:

    W1= 0.63 kg ± 2σ

    W2= 0.57 kg ± 2σ

    σ = std. dev. of measurement

    Is there a better method?


  • © Crary Group, 2004 3/31

    R1 = W1 + W2,   R2 = W1 – W2

    Solve for W1 and W2:

    W1 = ½ (R1 + R2)

    W2 = ½ (R1 – R2)

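    $\mathrm{Var}(W_1) = \mathrm{Var}(W_2) = \left(\tfrac{1}{2}\right)^2 \left[\mathrm{Var}(R_1) + \mathrm{Var}(R_2)\right] = \dfrac{\sigma^2}{2}$

    $\text{Std. dev.} = \dfrac{\sigma}{\sqrt{2}}$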

    Errors in weights reduced by 29%
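The 29% figure can be checked with a short calculation. The sketch below is an illustration and not part of the original slides. It assumes independent measurement errors with a common standard deviation σ, and the names `estimate_std`, `X_simple`, and `X_combined` are made up for this example.

```python
import numpy as np

def estimate_std(X, sigma=1.0):
    """Std. dev. of each least-squares estimate: sqrt(diag(sigma^2 (X^T X)^-1))."""
    cov = sigma**2 * np.linalg.inv(X.T @ X)
    return np.sqrt(np.diag(cov))

# Simple method: each reading weighs one apple on its own.
X_simple = np.array([[1.0, 0.0],
                     [0.0, 1.0]])

# Combined method: R1 = W1 + W2 and R2 = W1 - W2.
X_combined = np.array([[1.0, 1.0],
                       [1.0, -1.0]])

std_simple = estimate_std(X_simple)      # sigma for each weight
std_combined = estimate_std(X_combined)  # sigma / sqrt(2) for each weight

reduction = 1.0 - std_combined[0] / std_simple[0]
print(f"error reduction: {reduction:.1%}")
```

Both methods use two readings; only the choice of what to measure changes, which is the point of the slide.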


  • © Crary Group, 2004 4/31

    Many industrial products and processes benefit from designed experiments

    Spaghetti Sauce

    .

    .

    .

    Semiconductors


  • © Crary Group, 2004 5/31

    Statistical Design of Experiments: Response-Surface Methodology (RSM)

    Application Areas:

    Improvement of products and processes

    Three Examples:

    Quadratic response: one factor

    Sensor calibration: two factors

    Direct mail marketing campaign: multiple factors


  • © Crary Group, 2004 6/31

    Classical Design of Experiments

    Examples:

    Agriculture (static designs)

    Pharma (sequential designs)

    Quality control in manufacturing

    Features:

    Easy to find

    Sample space ~ uniformly

    Not optimal

    [Figure: point layouts for a full-factorial design and a fractional-factorial design]


  • © Crary Group, 2004 7/31

    Problem: The size of factorial and fractional-factorial designs grows exponentially with the number of variables.

    [Figure: number of trials (0 to 300) versus number of factors (0 to 12), with one curve for factorial and fractional-factorial designs and one for optimal designs]
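To make the growth concrete, the sketch below compares run counts under two assumptions that the slide does not state: the factorial design is a two-level full factorial (2^k runs), and the model to be fitted is a full quadratic in k factors. The number of model terms is a lower bound on the number of trials any design needs to estimate that model, and optimal designs can be built at run counts near that bound. The function names are illustrative.

```python
def full_factorial_runs(k, levels=2):
    """Runs in a full factorial with `levels` settings per factor."""
    return levels ** k

def quadratic_model_terms(k):
    """Terms in a full quadratic model: 1 intercept + k linear
    + k squared + k(k-1)/2 two-factor interactions."""
    return (k + 1) * (k + 2) // 2

for k in range(2, 13, 2):
    print(f"k={k:2d}  2^k runs={full_factorial_runs(k):5d}  "
          f"quadratic terms={quadratic_model_terms(k):3d}")
```

The first column grows exponentially while the second grows only quadratically, which matches the qualitative shape of the two curves described on the slide.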


  • © Crary Group, 2004 8/31

    Methods used to design experiments depended upon the level of computation available:

    Classical designs (Fisher)

    Orthogonal arrays (Taguchi)

    D-optimal designs

    I-optimal, G-optimal, and application-specific optimal designs


  • © Crary Group, 2004 9/31

    Optimal Statistical Design of Experiments

    Application Areas

    • Further improvement of products and processes (50% smaller confidence bounds on prediction are typical)

    • Increased accuracy of standard-cell libraries

    • Optimization of hospital staffing

    • Optimization of marketing campaigns


  • © Crary Group, 2004 10/31

    Theory


  • © Crary Group, 2004 11/31

    Model

    Polynomial response model:

    e.g., Y = β0 + β1 x1 + β2 x2 + β3 x1² + β4 x2² + β5 x1 x2, or Y = fᵀβ (linear in the βi’s)

    Measurements:

    {Yi}, i = 1, …, N

    e.g., Y = β0 + β1 x1 + β2 x2 + β3 x1² + β4 x2² + β5 x1 x2 + ε

    or Y = X β + ε,

    ε ~ i.i.d. with mean zero and variance σ²


  • © Crary Group, 2004 12/31

    Example of X, the design matrix

    $$
    X = \begin{bmatrix}
    1 & x_{(1)1} & x_{(1)2} & x_{(1)1}^2 & x_{(1)2}^2 & x_{(1)1} x_{(1)2} \\
    1 & x_{(2)1} & x_{(2)2} & x_{(2)1}^2 & x_{(2)2}^2 & x_{(2)1} x_{(2)2} \\
    \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
    1 & x_{(N)1} & x_{(N)2} & x_{(N)1}^2 & x_{(N)2}^2 & x_{(N)1} x_{(N)2}
    \end{bmatrix}
    $$

    Y = β0 + β1 x1 + β2 x2 + β3 x1² + β4 x2² + β5 x1 x2 + ε

    or, Y = X β + ε


  • © Crary Group, 2004 13/31

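    Optimal Designs for Parameter Estimation

    $\hat{\beta} = (X^T X)^{-1} X^T Y$

    $\mathrm{Var}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$

    D-optimality: $\min_{\omega_N} \lVert (X^T X)^{-1} \rVert$, where $\omega_N$ is the set of N points in the design and $\lVert \cdot \rVert$ denotes the determinant

    D-optimality: minimize area of confidence interval

    [Figure: 95% confidence interval for parameter estimation, drawn around $(\hat{\beta}_i, \hat{\beta}_j)$ in the plane of parameters βi and βj]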
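The estimator, its covariance, and the D-criterion can be computed directly for a small case. The sketch below uses a one-factor parabola model; the design points, coefficients, and noise level are hypothetical values picked for illustration, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

x = np.array([-1.0, -1.0, 0.0, 1.0, 1.0])        # hypothetical design points
X = np.column_stack([np.ones_like(x), x, x**2])  # model matrix for a parabola
beta_true = np.array([1.0, 2.0, -0.5])           # hypothetical coefficients
sigma = 0.1                                      # hypothetical noise std. dev.
Y = X @ beta_true + rng.normal(0.0, sigma, size=len(x))

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y          # least-squares estimate of beta
cov_beta = sigma**2 * XtX_inv         # Var(beta_hat)
d_criterion = np.linalg.det(XtX_inv)  # quantity minimized by a D-optimal design

print("beta_hat    =", beta_hat)
print("d_criterion =", d_criterion)
```

Note that `d_criterion` depends only on where the measurements are taken, not on the measured values, so a design can be optimized before any data are collected.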


  • © Crary Group, 2004 14/31

    Example 1:

    D-optimal design for fitting a parabola with N=5 points

    Y = β0 + β1x1 + β2x1²

    Var of prediction = $\mathrm{Var}(\hat{Y}) = E\left[(\hat{Y} - Y)^2\right]$

    Design   IV*
    D        0.667

    [Figure: variance of prediction (0.2 to 1.2) versus x (−1 to 1) for the D-optimal design]

    *IV = Integrated Variance: defined on next slide
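A brute-force search shows what a D-optimal design for this example can look like. The sketch below is an assumption-laden illustration: it restricts candidate points to a five-point grid on [−1, 1], takes σ² = 1, and maximizes det(XᵀX), which is equivalent to minimizing the determinant of the inverse. The slides do not list the design points or describe a search method; all names are made up for this example.

```python
import itertools
import numpy as np

def model_matrix(x):
    """Rows (1, x, x^2) for the parabola model."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2])

def moment_matrix():
    """(1/2) * integral over [-1, 1] of f f^T with f = (1, x, x^2)."""
    m = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            if (i + j) % 2 == 0:          # odd powers integrate to zero
                m[i, j] = 1.0 / (i + j + 1)
    return m

def integrated_variance(x):
    """Average of f^T (X^T X)^-1 f over [-1, 1], in units of sigma^2."""
    X = model_matrix(x)
    return np.trace(np.linalg.inv(X.T @ X) @ moment_matrix())

candidates = np.linspace(-1.0, 1.0, 5)    # assumed candidate grid
designs = itertools.combinations_with_replacement(candidates, 5)
best_design = max(designs, key=lambda d: np.linalg.det(model_matrix(d).T @ model_matrix(d)))
best_det = np.linalg.det(model_matrix(best_design).T @ model_matrix(best_design))

symmetric = (-1.0, -1.0, 0.0, 1.0, 1.0)
print("best design found:", best_design, "det =", round(best_det, 6))
print("IV of symmetric design:", round(integrated_variance(symmetric), 3))
```

On this grid several replication patterns over {−1, 0, 1} tie on the determinant, so which one `max` returns is arbitrary. The symmetric pattern is one of the maximizers and has an integrated variance of 2/3, consistent with the 0.667 quoted on the slide.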


  • © Crary Group, 2004 15/31

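    Optimal Designs for Prediction

    G-optimality: minimize expected worst-case prediction error over a region χ

    $\min_{\omega_N} \max_{x \in \chi} E\left[(\hat{Y}(x) - Y(x))^2\right] = \min_{\omega_N} \max_{x \in \chi} f^T (X^T X)^{-1} f$

    I-optimality: minimize average prediction error over χ

    $\min_{\omega_N} \int \cdots \int_{\chi} E\left[(\hat{Y}(x) - Y(x))^2\right] dx = \min_{\omega_N} \int \cdots \int_{\chi} f^T (X^T X)^{-1} f \, dx$

    Integrated Variance:

    $IV = \dfrac{\int \cdots \int_{\chi} E\left[(\hat{Y}(x) - Y(x))^2\right] dx}{\int \cdots \int_{\chi} dx}$

    (Variances are expressed in units of σ².)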
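Both prediction criteria can be evaluated numerically for a given design. The sketch below does this for the parabola model on χ = [−1, 1] with σ² = 1. The two five-point designs are assumptions: they were chosen because they reproduce the IV values 0.667 and 0.444 that the neighboring slides quote for the D- and I-optimal designs, but the slides themselves do not list design points. Function names are illustrative.

```python
import numpy as np

def model_matrix(x):
    """Rows (1, x, x^2) for the parabola model."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2])

def prediction_variance(design, x_eval):
    """f^T (X^T X)^-1 f at each evaluation point, in units of sigma^2."""
    X = model_matrix(design)
    F = model_matrix(x_eval)
    M_inv = np.linalg.inv(X.T @ X)
    return np.einsum("ij,jk,ik->i", F, M_inv, F)

grid = np.linspace(-1.0, 1.0, 2001)

def g_criterion(design):
    """Worst-case prediction variance over the grid."""
    return prediction_variance(design, grid).max()

def i_criterion(design):
    """Trapezoid-rule average of the prediction variance over [-1, 1]."""
    var = prediction_variance(design, grid)
    weights = np.ones_like(grid)
    weights[0] = weights[-1] = 0.5
    return (weights * var).sum() / weights.sum()

design_d = [-1.0, -1.0, 0.0, 1.0, 1.0]  # assumed D-optimal pattern
design_i = [-1.0, 0.0, 0.0, 0.0, 1.0]   # assumed I-optimal pattern

for name, design in (("D", design_d), ("I", design_i)):
    print(f"{name}: max variance = {g_criterion(design):.3f}, "
          f"IV = {i_criterion(design):.3f}")
```

Moving replicates from the endpoints to the center lowers the average variance at the cost of higher variance near x = ±1, which is the trade-off the I-criterion accepts.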


  • © Crary Group, 2004 16/31

    D-, G-, and I-optimal designs for fitting a parabola with N=5 points

    Y = β0 + β1x1 + β2x1²

    Design   IV
    D        0.667
    G        0.551
    I        0.444

    [Figure: variance of prediction (0.2 to 1.2) versus x (−1 to 1), with curves labeled D, I, and G]


  • © Crary Group, 2004 17/31

    Interesting Observations Re I-optimal Designs

    Asymmetric optimal designs can result from symmetric problem statements.

    Phase transitions exist between designs of various symmetries, as weights assigned to different regions are changed continuously.


  • © Crary Group, 2004 18/31

    Extensions of I-optimal Designs (1)

    Region need not be square, cube, or hyper-cube, i.e., inequality constraints can be handled.

    Penalty for variance can be weighted.

    Equality constraints can be handled, e.g., mixtures.

    Variance of responses can be taken into account, i.e., heteroscedasticity can be treated.


  • © Crary Group, 2004 19/31

    Extensions of I-optimal Designs (2)

    Points can be forced to be in the design.

    Optimal augmented designs are possible.

    Prior information on the distribution of β’s, e.g., from previous measurements,

    can be taken into account (Bayesian approach).

    I-optimality can be extended to deterministic error models (computer experiments).


  • © Crary Group, 2004 20/31

    Example 2: Optimal Sensor Calibration

    MEMS pressure sensor in an automotive setting

    Sensor capacitance C=C(P,T)

    Find an efficient design for determining a compensation formula:

    C = C(P,T) = β0 + β1T + β2P + β3P² + β4P³

    Design must have at least N_min = 5 points

    Changing P is less expensive than changing T.


  • © Crary Group, 2004 21/31

    I-optimal Designs for

    C(P,T) = β0 + β1T + β2P + β3P² + β4P³

    [Figure: I-optimal design points in the (P, T) plane, in two panels: N=5 and N=40]


  • © Crary Group, 2004 22/31

    Variance Contours of I-optimal Designs for

    C(P,T) = β0 + β1T + β2P + β3P² + β4P³

    [Figure: variance contour plots in two panels, N=5 and N=40; contour labels 0.6, 0.8, 0.6 and 0.06, 0.07, 0.08]


  • © Crary Group, 2004 23/31

    Integrated Variance vs. Number of Points

    C(P,T) = β0 + β1T + β2P + β3P² + β4P³

    [Figure: log-log plot of IV (0.1 to 1.0) versus N (10 to 100); slope ~ N⁻¹]

    Compensated Pressure = P(C,T) + ∆
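One way to see why the slope is close to N⁻¹: replicating a fixed design k times multiplies XᵀX by k, so the integrated variance falls by exactly 1/k. The sketch below checks this for the sensor model. It assumes P and T are scaled to [−1, 1], takes σ² = 1, and uses a hypothetical five-point base design that is not the I-optimal design from the slides. Replication only gives an upper bound on what an optimal N-point design achieves.

```python
import numpy as np

def sensor_model_matrix(P, T):
    """Rows (1, T, P, P^2, P^3) for C(P,T) = b0 + b1*T + b2*P + b3*P^2 + b4*P^3."""
    P = np.asarray(P, dtype=float)
    T = np.asarray(T, dtype=float)
    return np.column_stack([np.ones_like(P), T, P, P**2, P**3])

def integrated_variance(P, T, n_grid=101):
    """Grid average of f^T (X^T X)^-1 f over the square [-1, 1] x [-1, 1]."""
    X = sensor_model_matrix(P, T)
    M_inv = np.linalg.inv(X.T @ X)
    g = np.linspace(-1.0, 1.0, n_grid)
    PP, TT = np.meshgrid(g, g)
    F = sensor_model_matrix(PP.ravel(), TT.ravel())
    return np.einsum("ij,jk,ik->i", F, M_inv, F).mean()

# Hypothetical base design in coded units (four distinct P values, two T values).
base_P = np.array([-1.0, -1.0, -0.4, 0.4, 1.0])
base_T = np.array([-1.0, 1.0, 1.0, -1.0, 1.0])

for reps in (1, 2, 4, 8):
    iv = integrated_variance(np.tile(base_P, reps), np.tile(base_T, reps))
    print(f"N={5 * reps:3d}  IV={iv:.4f}")
```

Each doubling of N halves the printed IV, which is a straight line of slope −1 on a log-log plot.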


  • © Crary Group, 2004 24/31

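    If some regions are more important for prediction than others, use weighted I-optimal designs.

    Example: C(P,T) = β0 + β1T + β2P + β3P² + β4P³, N=7

    $\min_{\omega_N} \int_{C} \int_{T} f^T (X^T X)^{-1} f \; e^{-(C^2 + T^2)/2\tilde{\sigma}^2} \, dC \, dT$

    [Figure: two weighted I-optimal designs in the (P, T) plane for N=7, one panel labeled σ = 0.1 and one labeled σ = ∞]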


  • © Crary Group, 2004 25/31

    Example 3: Similar to On-Line Marketing:

    Direct Mail Marketing of Credit-Cards* (1)

    Challenges:

    Response rates down: 2