
Resolving the Mysteries of Six Sigma: Statistical Constructs and Engineering Rationale

April 21, 2003

Mikel J. Harry, Ph.D.

Founder and Chairman of the Board

Six Sigma Management Institute
Scottsdale, Arizona

Copyright 2003, Mikel Harry, Ph.D.


Resolving the Mysteries of Six Sigma:

Statistical Constructs and Engineering Rationale
by Mikel Harry, Ph.D.
Copyright © 2003 by Mikel Harry, Ph.D.
Six Sigma is a registered trademark of Motorola, Inc.
All Rights Reserved. No part of this book may be used or reproduced in any manner whatsoever without written permission from the publisher except in the case of brief quotations embodied in critical articles and reviews.
Publisher: Palladyne Publishing
Distributor: Tri Star Visual Communications, 3110 North 35th Avenue, Suite 4, Phoenix, Arizona 85017, (602) 269-2900, [email protected]
Design, layout and printing: Tri Star Visual Communications, Phoenix, Arizona, www.tristarvisual.com
ISBN 0-9715235-1-7


Table of contents

Foreword i

1.0 Introducing the Context 1

1.1 Unfolding the history
1.2 Adopting the principles

2.0 Extending the Context 7

2.1 Defining the root
2.2 Expanding the function
2.3 Describing the interaction
2.4 Rationalizing the sample
2.5 Detecting the error
2.6 Classifying the error
2.7 Declaring the opportunity
2.8 Qualifying the interface

3.0 Interrogating the Context 23

3.1 Articulating the goal
3.2 Polishing the definition
3.3 Inflating the error
3.4 Calibrating the shift
3.5 Rationalizing the shift
3.6 Applying the shift
3.7 Framing the correction
3.8 Establishing the center

4.0 Understanding the shift 42

4.1 Identifying the expectations
4.2 Conducting the analysis
4.3 Considering the implications
4.4 Constructing the worst-case
4.5 Exploring the consequences
4.6 Visualizing the distributions

5.0 Examining the shift 58

5.1 Establishing the equality
5.2 Developing the correction
5.3 Advancing the concepts
5.4 Analyzing the system

6.0 Validating the shift 63

6.1 Conducting the simulation
6.2 Generalizing the results
6.3 Pondering the issues


7.0 Contracting the error 68

7.1 Conducting the analysis
7.2 Drawing the conclusions
7.3 Verifying the conclusions
7.4 Establishing the shortcut

8.0 Partitioning the error 76

8.1 Separating the noise
8.2 Aggregating the error
8.3 Rationalizing the sample

9.0 Analyzing the partitions 85

9.1 Defining the components
9.2 Analyzing the variances
9.3 Examining the entitlement

10.0 Computing the Correction 94

10.1 Computing the shift
10.2 Resolving the shift
10.3 Calculating the minimum
10.4 Connecting the capability

11.0 Harnessing the Chaos 111

11.1 Setting the course
11.2 Framing the approach
11.3 Limiting the history
11.4 Understanding the chaos
11.5 Evolving the heuristics
11.6 Timing the geometry
11.7 Exemplifying the fractal
11.8 Synthesizing the journey

12.0 Concluding the discussion 126

Appendix A: Guidelines for the Mean Shift 128
References and Bibliography 130


Foreword

Two pillars of seemingly mystical origin and uncertain composition have long

supported the practice of six sigma. The first pillar is characterized by the quantity "six"

in the phrase "six sigma." The second pillar is related to the 1.5 sigma shift. This book

sets forth the theoretical constructs and statistical equations that underpin and validate

both of these pillars, as well as several other intersecting issues related to the subject.

The reader should be aware that this book has been prepared from a design

engineering perspective. Owing to this, it can fully support many of the aims associated

with design-for-six-sigma (DFSS). Although skewed toward design engineers, this book

provides a methodology for risk analysis that would be of keen interest to producibility

engineers. In addition, the book is also intended for quality professionals and process

engineers that are responsible for the "qualification" of a process prior to its adoption.

With these aims in mind, the ensuing discussion will mathematically demonstrate

that the "1.5 sigma shift" can be attributable solely to the influence of random error. In

this context, the 1.5 sigma shift is a statistically based correction for scientifically

compensating or otherwise adjusting a postulated model of instantaneous reproducibility

for the inevitable consequences associated with random sampling variation. Naturally,

such an adjustment (1.5 sigma shift) is only considered and instituted at the opportunity

level of a product configuration. Thus, the model performance distribution of a given

critical performance characteristic can be effectively attenuated for many of the

operational uncertainties associated with a design-process qualification (DPQ).

Based on this quasi-definition, it should be fairly evident that the 1.5 sigma shift factor

can often be treated as a "statistical correction," but only under certain engineering

conditions that would generally be considered “typical.” By all means, the shift factor

(1.5 sigma) does not constitute a "literal" shift in the mean of a performance distribution

– as many quality practitioners and process engineers falsely believe or try to postulate

through uninformed speculation and conjecture. However, its judicious application during

the course of designing a system, product, service, event, or activity can greatly facilitate

the analysis and optimization of "configuration repeatability."


By the conscientious application of the 1.5 sigma shift factor (during the course of

product configuration), an engineer can meaningfully "design in" the statistical and

pragmatic confidence necessary to ensure or otherwise assure that related performance

safety margins are not violated by unknown (but anticipated) process variations. Also of

interest, its existence and conscientious application has many pragmatic implications (and

benefits) for reliability engineering. Furthermore, it can be used to "normalize" certain

types and forms of benchmarking data in the interests of assuring a "level playing field"

when considering heterogeneous products, services, and processes.

In summary, the 1.5 sigma shift factor should only be viewed as a mathematical

construct of a theoretical nature. When treated as a "statistical correction," its origin can

be mathematically derived as an equivalent quantity representing or otherwise reflecting

the "worst-case error" inherent to an estimate of short-term process capability. As will be

demonstrated, the shift factor is merely an "algebraic byproduct" of the chi-square

distribution that will vary depending on the accepted level of risk and prevailing degrees-

of-freedom. However, when typical application circumstances are postulated and

rationally evaluated, the resulting shift will prove to be approximately equivalent to 1.5

sigma.
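To give the reader an early numerical feel for this claim, the brief sketch below (in Python) computes an equivalent shift from a lower-tail chi-square quantile. It is only an illustrative sketch: the particular form 3·(√(df/χ²) − 1), the subgroup size of n = 30, and the risk level α = 0.005 are assumptions adopted here for illustration, standing in for the detailed construction developed in the later chapters.

```python
# A minimal numerical sketch, not the book's derivation itself: it shows how a
# lower-tail chi-square quantile can be re-expressed as a sigma-shift-like
# correction. The form 3*(sqrt(df/chi2) - 1), the subgroup size n = 30, and the
# risk level alpha = 0.005 are assumptions used here purely for illustration.
from scipy.stats import chi2


def equivalent_shift(n: int = 30, alpha: float = 0.005) -> float:
    """Worst-case expansion of a short-term sigma estimate, expressed in sigma units."""
    df = n - 1                              # degrees of freedom behind the sigma estimate
    chi2_lower = chi2.ppf(alpha, df)        # lower-tail chi-square quantile at risk level alpha
    c = (df / chi2_lower) ** 0.5            # worst-case inflation factor on the sigma estimate
    return 3.0 * (c - 1.0)                  # restated as an equivalent mean shift, in sigmas


if __name__ == "__main__":
    print(f"Equivalent shift under the assumed conditions: {equivalent_shift():.2f} sigma")
    # Under these assumed "typical" conditions the result is roughly 1.5 sigma.
```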

Dr. Mikel Harry, Scottsdale, Arizona April 21, 2003


1.0 Introducing the Context

1.1 Unfolding the history

Today, a great many professionals already know the story of six sigma.

As one of the original architects of six sigma, this author has carefully

observed its phenomenal growth over the years. During this course of time,

numerous business executives have watched six sigma expand from a simple

quality target to a viable system of business management. In fact, Jack Welch

(former CEO of General Electric) has said that six sigma is the first initiative

to come along that reaches the “control function” of a corporation.

During the evolution of six sigma, numerous questions have been

formulated, fielded and addressed by this author. These questions span the

chasm between the issues of business executives and the concerns of technical

practitioners. In this author’s recollection, two recurring questions have

dominated the broad field of technical inquiry, at least where the statistical

theory of six sigma is concerned.

The first type of recurring inquiry can be described by the global

question: “Why 6σ and not some other level of capability?” The second type

of inquiry is more molecular in nature. It can be summarized by the

compound question: “Where does the 1.5σ shift factor come from – and why

1.5 versus some other magnitude?”

Although some quality professionals still debate the merits of a six

sigma level of capability (per opportunity) and the mysterious, so-called “shift

factor,” many have judiciously ignored such rhetoric. These pioneers have

persevered by the continued demonstration of successful application – reaping

considerable treasure along the way. Such individuals are concerned only

with results, not with statistical theory and academic debate that is best left to

the mathematicians and statisticians of the world.

Still, regardless of success, pervasive questions regarding six sigma and

the 1.5σ shift remain in the minds of many theorists and quality

professionals. Although well-intentioned answers are often set forth in the

form of cursory explanations, there is a dearth of technical exposition. In


short, such attempts to clearly prescribe the genetic code of six sigma and its

associated 1.5σ shift factor have fallen short of their mark. To satisfy this

apparent need, our discussion will uncloak the theoretical origins of six sigma,

and concurrently demystify the 1.5σ correction factor, by directly examining

their underlying determinants in the context of producibility analysis.1

Further, the examination will be conducted and presented from a theoretical as

well as pragmatic frame of reference.2 As well, a series of sidebar discussions

and several progressive case examples will be provided during the course of

presentation so as to reinforce certain aspects and features of the instructive

content.

At the onset of six sigma in 1985, this writer was working as an engineer

at the Government Electronics Group of Motorola. By chance connection,

this practitioner linked up with another engineer by the name of Bill Smith

(originator of the six sigma concept in 1984). At that time, Bill’s proposition

was eloquently simple. He suggested that Motorola should require 50 percent

design margins for all of its key product performance specifications. When

considering the performance tolerance of any given critical design feature, he

believed that a conventional 25 percent “cushion” was not sufficient for

1 Based on application experiences, Mr. William “Bill” Smith proposed the 1.5σ shift factor more than 18

years ago (as a compensatory measure for use in certain reliability and engineering analyses). At that time, the author of this book conducted several theoretical studies into its validity and judiciously examined its applicability to design and process work. The generalized application components were subsequently published in several works by this author (see bibliography). While serving at Motorola, this author was kindly asked by Mr. Robert “Bob” Galvin not to publish the underlying theoretical constructs associated with the shift factor, as such “mystery” helped to keep the idea of six sigma alive. He explained that such a mystery would help “keep people talking about six sigma in the many hallways of our company.” To this end, he fully recognized that no matter how valid an initiative may be, if people stop talking about it, interest will be greatly diminished, or even lost. In this vein, he rightfully believed that the 1.5σ mystery would motivate further inquiry, discussion and lively debate – keeping the idea alive as six sigma seated itself within the corporation. For such wisdom and leadership, this author expresses his deepest gratitude. However, after 18 years, the time has come to reveal the theoretical basis of six sigma and that of the proverbial 1.5σ shift.

2 At all times, the reader must remain cognizant of the fact that the field of producibility assessment is relatively new territory in terms of engineering. Because of this, and its enormous scope, the full and complete articulation of certain details is not possible within the confines of this book. As a consequence, emphasis is placed on the development of a conceptual understanding at many points in the discussion. Also recognize that the focus of this book is on the statistical theory and supporting engineering rationale surrounding the 1.5σ shift factor advocated by the practice of six sigma. Nonetheless, the discussion is framed in the context of design engineering and producibility analysis.


absorbing a sudden shift in process centering on the order of 1.5σ (relative to

the target value).

Regardless of the exact magnitude of such a disturbance (shock) to the

centering of a critical performance characteristic, those of us working this

issue fully recognized that the initial estimate of process capability will often

erode over time in a “very natural way” – thereby increasing the expected rate

of product defects (when considering a protracted period of production).

Extending beyond this, we concluded that the product defect rate was highly

correlated to the long-term process capability, not the short-term capability.

Of course, such conclusions were predicated on the statistical analysis of

empirical data gathered on a wide array of electronic devices. At that time,

those of us involved in the initial research came to understand that an estimate

of long-term capability is mostly influenced by two primary contributors – the

extent of instantaneous reproducibility and the extent of process centering

error. In essence, we began to see the pragmatic connection between design

margin, process capability, defects and field reliability.

It must be remembered that Bill’s initial assertions (prior to such

research) were seemingly extremist in nature, at least during that period of

time. Although the ideas had strong intuitive appeal, many design engineers

experienced a very high degree of skepticism, to say the least. The notion of

using a 50 percent margin on both sides of a bilateral performance

requirement (for key design features only) was certainly outside the bounds of

conventional wisdom. Again, at that point in time, conventional engineering

practice advocated a 25 percent design margin for most applications.

Moreover, the unconventional practice of imposing a 1.5σ shift on all of the

critical performance features of a design (so as to test certain system-level

producibility and performance assumptions) did seem somewhat bizarre, even

to very liberal engineers of the time. However, Bill’s many years of

successful manufacturing and engineering experience gave just cause for our

attention – thus meriting further inquisition by this researcher and practitioner.


After several more highly interactive discussions with Bill, this

researcher established that his grounded assertions were well founded and

quite rational – from an engineering and statistical point of view. Essentially,

he was saying that the design margins affixed to certain key design features

(often called CTQs) should be bilaterally increased from 25 percent to 50

percent so as to compensate for the aggregate effect of “normal” process

perturbations that would otherwise not be accounted for over a relatively short

period of sampling. As this research would later validate, such errors are

inherently and progressively manifested in the form of an enlarged standard

deviation, therein expanding or otherwise inflating the performance

distribution. Although not a part of the initial discussions with Bill, it became

all too apparent that he was indirectly attempting to account for long-term

sources of random sampling error (on the order of 1.5σ).

Such temporal error can sometimes manifest itself in a variety of ways,

among which is dynamic momentary shifts in process centering. Naturally,

such shifting is inevitably encountered during the course of production (as can

be readily verified by statistical sampling over protracted periods of time).3

Along these lines, we reasoned that such process behaviors could be

anticipated and subsequently compensated for early on in the design process.

By doing so, the observed in-process dpu could be lowered (by design) to

such an extent that the overall product reliability (MTBF) would be

significantly enhanced. From a design and reliability point of view, his

assertion was well taken.

From this perspective, Bill’s arguments were persuasive. He argued

that, as a result of increasing design margins, the “instantaneous failure rate”

of the product would naturally improve. Also, the need for “in-process

testing” could be significantly reduced. In addition, there would be huge

3 Although a particular set of subgroup-to-subgroup centering errors may be classified as “random,” their

individual existence is nonetheless unique and real. Just because the subgroup-to-subgroup variation is random does not preclude the resulting units of product from being different. In short, the subgroup differences (in terms of centering) may be statistically insignificant (owing to random sampling error), but the difference is real – in an absolute sense, no matter how small or large. Owing to this, it is rational to assert that such variation would induce unit-to-unit differences in reliability.


benefits associated with the reduction of “burn in” time. Furthermore, the

resulting decrease in dpu (by way of increased design margins) would

virtually eliminate the need for in-line test and inspection, thereby reducing

production costs, not to mention the implications on reducing warranty costs,

work in process and process cycle-time. From all of this, he proposed a huge

economic benefit to Motorola and more satisfied customers.

As a member of the engineering community, this researcher found Bill’s

ideas about quality, applied engineering and production management most

intriguing, even though at the time his ideas were mostly undefined and fully

undefended from a statistical point-of-view. In fact, much of the related

literature available in the mid-eighties often posed contrary arguments to

much of Bill’s thinking. To justify any professional use of these concepts,

this practitioner needed to see the statistical architecture and analytical

building blocks that would support his core of reasoning – subsequently

validated with “real world” data. Following one discussion along these lines,

Bill asked this writer if he could investigate the matter from a statistical point

of view. Little did this author know (at that time), a simple “look into the

matter” would trigger an 18-year quest for “enlightenment.”

Over the years to come, this researcher and practitioner enjoyed many

discoveries, among which was the mathematical basis and necessary empirical

evidence to support several of Bill’s original precepts. In fact, this quest

ultimately impacted this writer’s views about statistics, quality, engineering

and how a business should be organized and run. The resulting pursuit led

this investigator to invent and subsequently disseminate such contributions as

the Breakthrough Strategy® (DMAIC); the black belt concept, terminology

and infrastructure; the plan-train-apply-review (PTAR) cycle of learning; the

idea of Cp* and Cpk* (now known as Pp and Ppk). Suffice it to say, this


author has had a few epiphanies along the way, and his career has never been

the same since.4

1.2 Adopting the Principles

For purposes of simplified knowledge transfer and meaningful

communication, this researcher (and several other founding agents of six

sigma) decided the idea of a shifted distribution would have far more

cognitive appeal within the general workforce (at Motorola) than the idea of

an “inflated” short-term sampling standard deviation (used principally for

purposes of design analysis). Underlying this assertion was the general belief

that most process workers can readily understand that a process will naturally

“shift and drift” over time. However, to assert that the process standard

deviation might dynamically expand over time required additional time-

consuming explanation.

Of course, to yield a meaningful discussion, such explanation had to be

received by someone with the necessary prerequisite statistical knowledge.

Invariably, when given this prerequisite training, participants did not perceive

the sometimes-voluminous statistical details as “value-added.” As a

consequence, many application opportunities were sidestepped and their

potential beneficial effects were forever lost, simply because they perceived

the base of training as too “complicated” and “technical.”

In short, the idea of an expanding and contracting standard deviation

was found to be outside the realm of “common sense reasoning” without the

provision of statistical instruction. However, the idea of a “shift correction”

carried high appeal and inevitably promoted lively and meaningful discussion

4 With this as a backdrop, this writer feels compelled to acknowledge the very fine technical contributions

and enhancements provided over the years by such accomplished engineers as Dr. Thomas Cheek, Dr. Jack Prins, Dr. Douglas Mader, Dr. Ron Lawson and Mr. Reigle Stewart, just to name a few. During this author’s years at Motorola, their personal insights and “late at night over a beer, pencil and calculator” discussions significantly aided in adding to the body of six sigma research. Perhaps most of all, this writer would like to recognize Mr. Robert “Bob” Galvin. His many words of wisdom, piercing leadership acumen and personal encouragement provided this scientist the “intellectually-rich and politically-risk-free” environment from which to reach out and question conventional thinking. Only with his support were the beginnings of this investigator’s journey made possible (and relatively painless). He is truly an icon of leadership and a solid testament to what can happen when a senior executive embodies and empowers the “idea of ideas.”


– without the prerequisite education. Therefore, those of us at Motorola

involved in the initial formulation of six sigma (1984-1985) decided to adopt

and support the idea of a “1.5σ equivalent mean shift” as a simplistic (but

effective) way to account for the underlying influence of long-term, random

sampling error. Of course, the shift factor was viewed as a means to facilitate

the creation of sound design specifications, the study of certain types of

reliability problems and the forecasting of sustainable producibility. In classic

engineering style, we further decided that our optimization efforts should be

wrapped around the idea of “worst case” analysis, but only in a statistical

sense.

2.0 Extending the Context

2.1 Defining the root

At the very heart of six sigma is the idea of determinism. As most

would likely agree, the foundation of this idea is scientific in nature and

advocates that every existing phenomenon is caused by another existing

phenomenon or phenomena. For example, we know that “answers” stem from

“questions.” In turn, we can say that a “question” is the result of “thinking.”

Thus, many believe that our daily reality can be defined by a series of

intersecting cause-and-effect relationships. Of course, we can influence some

of these causative chains, while others are outside our span of control.

Nonetheless, we strive to discover, understand and harness the power of

determinism.

So as to simplify this idea, we set forth the notion that Y = f (X), where

Y is a certain dependent variable, X is an explanatory variable of an

independent causative nature, and f is the function that relates or otherwise

associates Y to X.5 For all intents and purposes, this relation tells us that the

performance of Y can only be made fully known when the performance of X

5 Given the model Y = f (X), it should be recognized that the function can be of a linear or nonlinear form.

For a linear transfer function f, we would rightfully expect that any given incremental change in X would necessarily induce a corresponding and incremental change in Y. Given the same increment of change in X, a nonlinear function would induce a disproportional change in Y.


is fully known – given that the function f is valid and reliable. Consequently,

it can be said that (at any moment in time) the output variable Y is

conditioned by the input variable X.

If for some reason X is not fully explanatory or causative, then we must

assert that Y = f (X) + ε, where ε constitutes the extent or degree of

uncertainty in the forecast of Y. It is from this perspective that the scientific

notion of error is made synonymous with the idea of uncertainty. In this

context, an error (per se) is not related to the phrase “to blunder.” Rather, it is

related to the scientific understanding, which is “to deviate or be different

from.”

In this context, we understand that uncertainty (risk) is constituted or

otherwise manifested by any type or form of variation in Y or X. Owing to

this, the concept of variation is also made germane to the idea of

reproducibility and repeatability, both of which are related to the concept of

replication error.6 For example, let us contrast an arbitrary performance

observation of Y to its model condition ζ. If the observation Yi is not equal to

ζ, it can be said that the observation varies (deviates) from the model

(expected) condition such that |δi| > 0.7

In general, ζ can assume the form of a nominal specification (such as T),

a measure of central tendency (such as µ), or even some other case of Y (such

as Yk). To illustrate, consider the difference δi = Yi – µ. In this instance, we

recognize the error δi to be a particular “mean deviation.” Given this

understanding, we might then seek to characterize the aggregate set of

deviations in terms of its specific magnitude, vector, dwell, and timing. Of

course, the outcome of such a characterization study would allow us to better

6 Generally speaking, such variation can be of the random or nonrandom variety. Random variation is also referred to as “white noise,” whereas nonrandom variation is referenced as “black noise.”

7 As may be apparent, such a deviation from expectation could be the result of a random or nonrandom effect (as the case may be). Of course, the discovery, classification, and subsequent study of such effects are of central concern to the field of mathematical statistics.


define or otherwise describe the underlying system of causation.8 Only in this

manner can Y be scientifically linked to X. It should go without saying that

the progressive classification of error lies at the heart of modern problem

solving and the practice of six sigma.
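To make the error bookkeeping described above concrete, the short sketch below (in Python) contrasts a handful of observations of Y against a postulated model condition ζ = µ and characterizes each deviation by magnitude and vector (sign). The observations and the value of µ are invented solely for illustration.

```python
# An illustrative sketch of the deviation bookkeeping described above: each
# observation of Y is contrasted against a postulated model condition (zeta = mu),
# and each error delta_i is characterized by magnitude and vector (sign).
# The observations and the value of mu are invented purely for illustration.
observations = [10.2, 9.7, 10.0, 10.6, 9.4]   # hypothetical readings of Y
mu = 10.0                                      # postulated model condition (zeta)

for i, y in enumerate(observations, start=1):
    delta = y - mu                             # mean deviation, delta_i = Y_i - mu
    vector = "+" if delta > 0 else "-" if delta < 0 else "0"
    print(f"Y_{i} = {y:5.2f}   delta = {delta:+.2f}   |delta| = {abs(delta):.2f}   vector: {vector}")

# Aggregating the deviations into a standard deviation summarizes only the
# perturbing influences present during this particular interval of observation.
sigma = (sum((y - mu) ** 2 for y in observations) / len(observations)) ** 0.5
print(f"aggregate error about mu: {sigma:.3f}")
```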

2.2 Expanding the function

Unfortunately, most types of phenomena in nature are not so simple

that they can be adequately or sufficiently described by the influence of a

single independent variable. In fact, virtually all such mono-variable cases

would reveal that ε > 0, at least to some extent. To fully eliminate such

uncertainty (error), it would be necessary to isolate all of the other causative

agents (independent variables). Following this, it would be crucial to examine

and subsequently characterize their independent and interactive effects –

instantaneously and longitudinally. Only when such effects are made known

or rationally postulated can it be said that Y = f ( X1 , … , XN ). As before, we

recognize Y as the dependent variable, f as the transfer function (mechanism

of causation), X as an independent variable, and N as the last possible X.

When all of the Xs have been accounted for or otherwise defined in a

valid and reliable manner, the resulting set of independent variables would be

fully comprehensive, or “exhaustive” as some would say. This means that all

possible independent variables (of a causative nature) are present or otherwise

accounted for. Expressed more succinctly, we would logically assert that as

the quantity N approaches its natural upper limit, the inherent error in Y

would necessarily approach zero.9

8 Of interest, most errors can be classified into one of four broad categories: 1) random transient; 2)

nonrandom transient; 3) random temporal; and 4) nonrandom temporal. While transient errors are relatively instantaneous in nature, temporal errors require time to be fully created or otherwise manifested. Without saying, random errors cannot be predicted or otherwise forecast (in a statistical sense) whereas nonrandom errors can be. In this context, random errors do not have an “assignable cause,” but the occurrence of nonrandom errors can be assigned. This is to say that nonrandom errors can be directly attributed to the influence of one or more independent variables or some interactive combination thereof.

9 This theoretical understanding naturally assumes that the partial derivatives associated with the contributing Xs have been rank ordered in terms of influence and then subjected to the transformative process f. Under this assumption, the residual error will decrease as the accumulation of influence increases. Of course, the inverse of this is also true.


However, in practice, it is often not possible to fully establish the

function that operationally connects Y to its corresponding set of Xs. In such

cases, it would be very rare to find that N is fully exhaustive and that the

function f is absolutely valid and reliable.10 Owing to this, we innately

acknowledge the presence of error in our statement of Y, at least to some

statistical or pragmatic extent.11 Thus, we must modify the aforementioned

relation and subsequently proclaim that Y = f ( X1 , … , XN ) + ε. Here again,

ε constitutes the extent or degree of uncertainty (error) that is present in our

forecast of Y.12 Only when given a valid and reliable transfer function f and

an exhaustive set of preconditioned independent causative variables is it

rationally possible to declare that ε = 0. Consequently, we are most often

forced to grapple with the case ε > 0. Hence, the ever present need for

mathematical statistics.

With respect to any dependent variable Y, each X within the

corresponding system of causation exerts a unique and contributory influence

(W). Of course, the weight of any given X is provided in the range 0.0 < Wi < 1.0, where Wi is the contributory weight of the ith independent variable.13

10 To this end, a statistical experiment is often designed and executed. Such experiments are intended to efficiently isolate the underlying variable effects that have an undue effect on the mean and variance of Y. As a part of such an exercise, a polynomial equation is frequently developed so as to interrelate or otherwise associate Y to the “X effects” that prove to be of statistical and practical concern. In such cases, it is not feasible to isolate the exhaustive set of Xs and all of their independent and interactive effects. In other words, it would not make pragmatic or economic sense to attempt a full explanation or accounting of the observed behavior in Y. Consequently, we observe that ε > 0 and conclude that the given set of causative variables is not exhaustive.

11 For the moment, let us postulate that N is exhaustive. As any given X is made to vary, we would naturally observe some corresponding variation in Y, subject only to the mechanistic nature of the function f. Of course, such variation (in Y and X) is also referred to as “error.” Thus, the function f is able to “transmit” the error from X to Y. If the errors assignable to X are independent and random, the corresponding errors in Y will likewise be independent and random. Naturally, the inverse of this would be true – nonrandom error in X would transmit to Y in the form of nonrandom error. From a more technical perspective, it can be said that any form of autocorrelated error in X would necessarily transmit to Y in a consistent and predictable fashion – to some extent, depending on the function f. In any such event, it is quite possible that a particular “blend” of nonrandom input variation could be transmitted through the given function in such a way that the output variation would not exhibit any outward signs of autocorrelation (for any given lag condition). Since Y would exhibit all the statistical signs of random behavior, it would be easy to falsely conclude that the underlying system of causation is non-deterministic.

12 Uncertainty is often manifested when: a) one or more causative variables are not effectively contained within the composite set of such variables; b) the transfer function f is not fully valid or reliable; c) one or more of the causative (independent) variables has undergone a momentary or temporal change of state; d) two or more of the causative (independent) variables are somehow made interactive, instantaneously or longitudinally; or e) some combination thereof.

Given this knowledge, the adequacy and sufficiency of f, as well as the

declaration that N is exhaustive, it would then be reasonable to assert that Y

and its corresponding set of Xs can be fully characterized without error. In

other words, there would be no “error” inherent to our characterization of Y –

owing to the inclusion of all possible variables operating in the light of a valid

and fully reliable transfer function f. This is to say that, for any unique set of

momentary or longitudinal conditions, it would be possible to forecast or

otherwise characterize the nature of Y with 100 percent certainty.
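The following sketch (in Python) illustrates the point just made: as more of the weighted Xs are folded into the transfer function, the unexplained error in Y shrinks toward zero. The linear form chosen for f, the Pareto-like weights, and the sample size are assumptions made purely for illustration.

```python
# A simulation sketch of the idea above: as more of the causative Xs are folded
# into the transfer function, the residual error in Y shrinks toward zero.
# The linear form of f, the Pareto-like weights, and the sample size are
# assumptions made purely for illustration; they are not taken from the text.
import numpy as np

rng = np.random.default_rng(1)
weights = np.array([0.45, 0.25, 0.15, 0.08, 0.05, 0.02])   # the "vital few" carry most influence
N = len(weights)                                            # size of the exhaustive set of Xs
X = rng.normal(size=(10_000, N))                            # independent causative variables
Y = X @ weights                                             # exhaustive, deterministic system (epsilon = 0)

for k in range(1, N + 1):
    Y_hat = X[:, :k] @ weights[:k]          # partial transfer function f(X1, ..., Xk)
    eps = Y - Y_hat                         # unexplained error owed to the excluded Xs
    print(f"Xs included: {k}   residual std of epsilon: {eps.std():.3f}")
# As k approaches N (the exhaustive set), the residual error necessarily approaches zero.
```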

2.3 Describing the interaction

With the same form of reasoning, we must also concede that the

resulting influence of any given interactive combination of Xs will likely

change over time – owing to the instantaneous and longitudinal states that are

naturally manifest to each X. For example, let us suppose that Xi and Xj have

the potential to be interactive when both variables dwell on the high side of

their respective performance scales, but the interactive effect is not nearly as

strong when both variables dwell on their low side.

We will further postulate that when both variables are operating near

their central condition, they are no longer interactive, per se. Based on this set

of circumstances, it is easy to understand how it is possible that their joint

influence ( βij XiXj ) might radically change over time – owing to a change of

state in Xi and Xj respectively. Thus, we conclude that replication error is

often quite circumstantial by nature. For example, many types and forms of

variable interactions are dependent upon the sustained coincidence of certain

respective frequencies and amplitudes among the independent variables.

Given such reasoning, we theoretically recognize that the progressive

behavior of any given X can be described in terms of frequency and

13 For virtually any relatively complex system of causation, it is widely accepted that a small number of the

Xs will generally account for a majority of the total weight. This is often referred to as the “Pareto” effect – the case where most of the influence emanates from the “vital few” variables versus the “trivial many.”


amplitude. By the laws of nature, there would exist a hierarchical progression

of causation that would continue through the Zth level, where Z could

represent infinity.

In this context, every X is a contributor to, or the resultant of, some other

X – in some way, shape, or form. Hence, the declaration of a Y variable

provides us with an indirect reference to one of the infinite steps on the

“staircase of causation.” Here again, we are reminded that everything is

relative. Only at the Zth level of a given causative system would each X

exhibit a distinct and perfectly predictable pattern behavior. This is to say

that, at the lowest possible level of causation, the pattern of each X would be

stable in terms of its operating frequency and amplitude.

From this perspective, it should be relatively easy to understand how the

instantaneous or longitudinal effect of interactive variables could induce the

illusion of random behavior in Y, even though the unique operating frequency

and amplitude of each X is deterministic. From another angle, it is possible

that each X (associated with a complex system of causation) could exhibit a

high level of autocorrelation (for a lag 1 condition), but the dependent variable

Y might not exhibit such autocorrelation (for any given lag condition).

Naturally, such a phenomenon results from the blending of the many

instantaneous and longitudinal effects stemming from the underlying system

of causation.
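The sketch below (in Python) is a contrived illustration of the two ideas discussed above: a joint influence βijXiXj whose strength depends on where Xi and Xj dwell on their scales, and a nonlinear transfer under which each X remains highly autocorrelated at lag one while Y exhibits almost none. The AR(1)-style generation of the Xs, the region rule, and the cosine transfer are illustrative assumptions, not constructs taken from the text.

```python
# A contrived sketch of two ideas from above: (1) a joint influence beta_ij*Xi*Xj
# whose strength depends on where Xi and Xj dwell on their scales, and (2) a
# nonlinear transfer under which X is highly autocorrelated at lag one while Y
# shows almost none. The AR(1)-style drift, the region rule, and the cosine
# transfer are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(3)


def lag1(series: np.ndarray) -> float:
    """Sample lag-1 autocorrelation of a time series."""
    s = series - series.mean()
    return float(np.sum(s[1:] * s[:-1]) / np.sum(s * s))


T, phi = 20_000, 0.95
shocks = rng.normal(size=(T, 2))
X = np.zeros((T, 2))
for t in range(1, T):
    X[t] = phi * X[t - 1] + shocks[t]        # Xi and Xj drift slowly (high lag-1 autocorrelation)

xi, xj = X[:, 0], X[:, 1]
both_high = (xi > 1) & (xj > 1)
both_low = (xi < -1) & (xj < -1)
beta = np.where(both_high, 0.8, np.where(both_low, 0.2, 0.0))   # region-dependent interaction strength
joint = beta * xi * xj
print("mean |joint influence|, both dwelling high:", round(float(np.mean(np.abs(joint[both_high]))), 2))
print("mean |joint influence|, both dwelling low :", round(float(np.mean(np.abs(joint[both_low]))), 2))

# A rapidly wrapping nonlinear transfer hides the slow pattern of X inside Y,
# even though the system remains fully deterministic.
Y = np.cos(8.0 * xi)
print("lag-1 autocorrelation of X:", round(lag1(xi), 2))
print("lag-1 autocorrelation of Y:", round(lag1(Y), 2))
```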

2.4 Rationalizing the sample

As many practitioners of process improvement already know, it is often

the case that the influence of certain background effects must be significantly

reduced or eliminated so as to render a “statistically valid” estimate of process

capability. Of course, this goal is often achieved or greatly facilitated by the

deliberate and conscientious design of a sampling plan.

Through such a plan, the influence of certain variables and related

effects can be effectively and efficiently “blocked” or otherwise neutralized.

For example, it is possible to block the first-order effect of an independent


variable by controlling its operative condition to a specific level. When this

principle is linked to certain analytical tools, it is fully possible to ensure that

one or more causative variables do not “contaminate” or otherwise unduly

bias the extent of natural error inherent to the response characteristic under

consideration.

As a given sampling strategy is able to concurrently block the influence

of more and more independent variables, the response replication error is

progressively attenuated. In other words, as the influence of each independent

variable is progressively blocked, it is theoretically possible to eventually

reach a point where it is not possible to observe any type, form, or magnitude

of replication error. At this point, only one measurement value could be

realized during the course of sampling. In short, the system of classification

would be so stringent (as prescribed by the sampling plan) that no more than

one observation would be possible at any given moment in time.

Should such a sampling plan be invoked, the same response

measurement would be observed upon each cycle of the process – over and

over again it would be the same measurement – the replication error would be

zero (assuming a fully valid and reliable system of measurement). However,

for any given sampling circumstance, there does exist a theoretical

combination of blocking variables and corresponding control settings that will

allow only random errors to be made observable. However, the pragmatic

pursuit of such an idealized combination would be considered highly

infeasible or impractical, to say the least. For this reason, we simply elect to

block on the variable called “time.” In this manner, we are able to indirectly

and artificially “scale” the system of blocking to such an extent that only

random errors are made known or measurable.

If the window of time is made too small, the terminal estimate of pure

error (extent of random variation) is underestimated, owing to the forced

exclusion of too many variable effects of a random nature. On the other hand,

if the window of time is too large, the terminal estimate of pure error (extent

of random variation) is overestimated, owing to the natural inclusion of


nonrandom variable effects. However, by the age-old method of trial and

error, it is possible to define a window size (sampling time frame) that

captures the “true” magnitude of background variations (white noise) but yet

necessarily precludes nonrandom sources of error from joining the mix.

In short, it is pragmatically feasible to discover a sampling interval (in

terms of time) that will capture the full extent of white noise while preserving

the primary “signal effect.” Only when this has been rationally and reasonably

accomplished can the instantaneous and longitudinal reproducibility of a

performance characteristic be established in a valid manner. Such a sampling

plan is also called a rational sampling strategy, as execution of the plan

rationally (sensibly and judiciously) partitions the array of “signal effects”

from the mix of indiscernible background noises. In this sense, a rational

sampling strategy can effectively and efficiently preserve the array of signal

effects, while concurrently capturing the full extent of random error.

From this perspective, it is easy to understand why the idea of rational

subgrouping is so important when attempting to estimate the short-term

standard deviation of a response characteristic (CTQ). Only when the

“signal” effects are removed from the total mix of variations can the

instantaneous (short-term) reproducibility be made known in a statistically

valid way.
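As a concrete (though simulated) illustration of rational subgrouping, the sketch below (in Python) pools the within-subgroup variation of time-blocked samples to approximate the short-term standard deviation and contrasts it with the overall (long-term) standard deviation that also absorbs the subgroup-to-subgroup centering error. The drift pattern, subgroup size, and subgroup count are assumptions chosen only for illustration.

```python
# A simulated sketch of rational subgrouping as described above. Pure white noise
# is generated inside each time-based subgroup, and a slow wandering of the center
# is added between subgroups. Pooling the within-subgroup variances approximates
# the short-term (instantaneous) reproducibility, while the overall standard
# deviation also absorbs the subgroup-to-subgroup centering error. The drift,
# subgroup size, and subgroup count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(11)
subgroups, n = 50, 5                                        # 50 time-ordered subgroups of n = 5
white_noise_sigma = 1.0                                     # random (transient) error within a subgroup
centers = np.cumsum(rng.normal(0.0, 0.4, size=subgroups))   # slow temporal wandering of the center

data = centers[:, None] + rng.normal(0.0, white_noise_sigma, size=(subgroups, n))

within_var = data.var(axis=1, ddof=1)                  # signal effects are blocked by time
sigma_st = float(np.sqrt(within_var.mean()))           # short-term: pooled within-subgroup variation
sigma_lt = float(data.std(ddof=1))                     # long-term: all observations taken together

print(f"short-term sigma (pooled within subgroups): {sigma_st:.2f}")
print(f"long-term sigma (all data taken together) : {sigma_lt:.2f}")
# The gap between the two reflects the nonrandom, subgroup-to-subgroup signal that
# rational subgrouping deliberately keeps out of the short-term estimate.
```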

2.5 Detecting the error

The idea of error is certainly not a new concept – by any stretch of the

imagination. We naturally recognize the existence of error whenever there is

a departure from some type of model expectation, regardless of the

magnitude, direction, dwell, or timing of that departure. To illustrate, let us

suppose that a certain performance variable can be fully described as Y ~ NID(µ, σST), where µ is the distribution mean and σST is the short-term standard deviation.

If a single member of such a population is arbitrarily (randomly) selected, but

its instantaneous performance cannot be immediately assessed or otherwise

made known, the momentary expectation (best guess) would be µ. Under


these circumstances, the odds of correctly guessing (estimating) the true value

of Y are maximized since 50 percent of the values lie above and below µ.

Should we discover a difference between a value and the corresponding

model expectation, then such a differential would be referred to as a “mean

deviation.” In the context of our discussion, we would observe | Yi – µ | > 0.

Given this condition, we would necessarily declare an error in our estimate of

Y. When such errors are amalgamated and then subsequently summarized in

the form of a standard deviation σ, the resulting index only reflects the

perturbing influences that would have been present during the interval of

observation. If the period of observation (duration of sampling) is relatively

short, it is rational to assert that not all sources of potential error would be

accounted for or otherwise represented by the given standard deviation. In

other words, it can be said that as the period of observation approaches its

natural upper limit, the likelihood of detecting or otherwise “trapping” all

possible sources of error approaches 100 percent.

Conversely, as the period of observation (time) approaches the natural

limit of zero, there exists a point at which it would no longer be possible to

make more than one uniquely independent observation. Obviously, under

such a condition, the underlying system of causation would be virtually

invariant. Consequently, it would not be possible to identify and subsequently

authenticate (validate) any given source of variation related to the dependent

variable Y.
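The simulation sketch below (in Python) illustrates the preceding point: a standard deviation computed over a short window of observation reflects only the perturbations present during that interval, while progressively longer windows trap more of the temporal sources of error. The drift and noise levels are invented for illustration only.

```python
# A simulated sketch of the point above: a standard deviation computed over a
# short period of observation reflects only the perturbations present during that
# interval, while longer periods progressively trap more of the temporal error.
# The drift and noise levels below are invented for illustration only.
import numpy as np

rng = np.random.default_rng(5)
T = 2_000
drift = np.cumsum(rng.normal(0.0, 0.05, size=T))       # slowly accumulating temporal error
y = 10.0 + drift + rng.normal(0.0, 1.0, size=T)        # observed Y: white noise plus drift

for window in (25, 100, 500, 2_000):                   # ever-longer periods of observation
    sigma = y[:window].std(ddof=1)
    print(f"first {window:5d} observations: estimated sigma = {sigma:.2f}")
# Short windows tend to understate the total error because many temporal sources of
# variation have not yet had time to manifest themselves within the interval observed.
```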

2.6 Classifying the error

Holistically speaking, the exact nature and magnitude of any given

replication error is fully determined by the net effect of many ever-changing

conditions and circumstances within the underlying system of causation.

Globally speaking, such errors can be classified into two distinct categories.

The first category is called “random error,” while the second is referenced as

“nonrandom error.” We fully recognize that random error (white noise) is


unpredictable in a mathematical sense.14 However, nonrandom error (black

noise) is often found to be predictable – at least to some extent greater than

zero.15 We also must concede that white noise is due to unassignable

(untraceable) causes, while black noise can be attributed to assignable causes

(those sources of causation that can be made accountable or traceable).

Pertaining to both classifications of error (random and nonrandom), we must

fully consider two discrete types of effects. The first type is called a

“transient effect” while the second is referred to as a “temporal effect.”

To illustrate the nature of a transient effect, consider a dependent

variable Y that has just experienced a momentary oscillation, brief

disturbance, or instantaneous shock – much like a sudden pulse or surge of

energy. Of course, such an effect can be due to the sudden influence of one or

more random or nonrandom forces within the underlying system of

causation.16 It is also understood that the dwell of such an effect is relatively

14 In many cases, the nature of such error is often so complex, compounded, and confounded that existing

analytical technologies do not have the “diagnostic power” to discern or otherwise “source trace” its independent origins through the many chains of causation. When it is not pragmatically feasible or economically sensible to “track down” the primary sources of variation, we simply declare (assume) that each individual error constitutes an “anomaly.” For any given anomaly, the circumstantial state of the underlying cause system is momentarily declared to be “indeterminate” and, as a consequence, the perturbation is treated as if it emanated from a system of random causes.

15 The momentary or longitudinal blending (mix) of many independent variables (each with a unique weighting) can effectively “mask” the presence of a nonrandom signal condition inherent to the dependent variable Y. As may be apparent, this would create the illusion of random variation (with respect to Y). However, as the sources of variation (Xs) are progressively blocked or otherwise neutralized (by virtue of a rational sampling scheme coupled with the appropriate analytical tools), the dominant signal conditions would then be discernable from the white noise. When such a signal condition is detected, the composite (total) variation would no longer be considered fully random. In other words, as the background variations are minimized, the likelihood of detecting some type or form of underlying signal increases. From a purely classical point-of-view, some would assert that nothing in nature happens by chance (everything is theoretically deterministic). In other words, everything moves in some form of trend, shift, or cycle. Holding this as an axiom, it would then be reasonable to assert that Y is always perfectly predictable (theoretically speaking), regardless of how complex or sophisticated the underlying system of causation may be. Accepting that Y = f (X1, … , XN) and given that the influence of all variables is effectively eliminated except that of XK, then Y would necessarily exhibit the same behavior as XK (momentarily and longitudinally). Thus, it can be theoretically argued that any collective set of independent variables, each having a unique signal effect of a nonrandom nature, can be momentarily or longitudinally blended or otherwise mixed in such a manner so as to form a seemingly nondeterministic system. When this type of condition is at hand, it is often far more convenient (for purposes of analysis) to assume a random model than it is to progress under the constraints of a nonrandom model.

16 As independent agent, any given source of variation has the capacity and capability to induce a transient or temporal effect (error). However, when two or more such forces work in unison (at certain operational settings), it is often possible to form an effect that is larger than the simple sum of their independent contributions. In general, as the number of independent contributory forces increases, it becomes less likely that the resulting effect (error) can be dissected or otherwise decomposed for


instantaneous and, as a consequence, is not sustained over time. Naturally, a

transient effect can be periodic or sporadic. However, when the timing of

such an effect is sporadic (random), it is often referred to as an “anomaly.”

Although the magnitude and direction of a transient effect can be random or

nonrandom in nature, its timing and dwell are generally found to be

unpredictable. From a statistical perspective, the magnitude of such effects

can often be made known by progressively tracking the general quantity δ = Y – µ. Of course, the direction of effect can often be established by noting the

vector of δ. In other words, the sign of a deviation (positive or negative)

reports on the direction of effect.

The basic nature of a temporal effect is time-dependent. In other words,

a temporal effect requires the passage of time before its influence can be fully

manifested and subsequently detected. Of course, a temporal effect can be of

the random or nonrandom variety. For example, suppose a random normal

independent variable experiences a temporary interaction with another such

variable. It is quite possible that the outcome of such a phenomenon would be

manifested in the form of a “performance dwell,” where the period of rise or

fall in the dependent variable is sustained for a moderate period of time within

the system of causation.

From a process control perspective, such a condition could constitute a

“temporal shift” in the signal condition of the performance characteristic. By

nature, this type of shift would exhibit a particular magnitude and vector –

both of which would be fully attributable to the two-variable interaction

within the system of causation. However, the timing and dwell may be fully

nondeterministic (random). Moreover, the overall time-series pattern may or

may not exhibit a “statistically significant” autocorrelation.17 Consequently, it

independent consideration and analysis. Consequently, higher order interactions are often treated as a random effect when, in reality, that effect is comprised of several deterministic causes.

17 To better understand the idea of autocorrelation, let us consider a set of time-series data. Given this, we say that the data are “sequentially realized over time.” First, let us consider a lag one condition. For this condition, it can be said that any given error cannot be used to statistically forecast the next observed error. For a lag two condition, the error from the two previous time periods cannot be used (individually or collectively) to forecast the next observed error. Of course, this line of reasoning would apply for all possible lag conditions. If no statistical correlation is observed over each of the possible


would be extremely difficult (if not impossible), or at least generally

impractical, to undertake a comprehensive characterization (classification) of

the composite variations (errors).18

In light of such considerations, we seek to employ various types and

forms of rational sampling strategies so as to ensure the random transient

effects are reflected or otherwise trapped within sampling groups and the

temporal effects are duly reflected or otherwise accounted for between

sampling groups. Given that the sampling strategy is adequately and

sufficiently “blocked,” the primary signal effects will then be forced into their

respective blocks. As a continuous and progressive strategy, the goal of

rational sampling is fairly straightforward – separate the random and

nonrandom errors so they may be categorically analyzed, independently

compared, and statistically contrasted. Of course, there exist various types of

statistical tools to facilitate this aim.
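By way of illustration, the following minimal sketch (written in Python, with purely illustrative parameter values and variable names that do not come from this book) simulates how a rational subgrouping scheme tends to trap transient error within subgroups while temporal disturbances surface as variation between subgroup averages.

    import numpy as np

    rng = np.random.default_rng(1)

    g, n = 50, 5            # g subgroups of n sequential observations (illustrative)
    sigma_st = 1.0          # transient (within-subgroup) random error
    temporal = rng.normal(0, 1.5, size=g)   # hypothetical temporal disturbance per subgroup

    # Each subgroup reflects transient error; its average also carries the temporal effect.
    data = np.array([100 + temporal[i] + rng.normal(0, sigma_st, n) for i in range(g)])

    within = np.sqrt(np.mean(data.var(axis=1, ddof=1)))   # pooled within-group (short-term) error
    between = data.mean(axis=1).std(ddof=1)               # variation among subgroup averages
    overall = data.std(ddof=1)                            # total (long-term) variation

    print(within, between, overall)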

2.7 Declaring the opportunity

At the inception of six sigma, the issue of “opportunity counting” was a

source of heated debate and analytical confusion – often centered on the

criteria that constitute an opportunity. Simply stated, an opportunity is merely

a set of conditions favorable to some end. In view of the two possible fates of

a CTQ – success or failure – we have the idea of a “yield opportunity” and

that of a “defect opportunity.” Since one is merely the flip side of the other

lags, it would then be reasonable to assert that the data is not patterned (the data would be free of any discernable trends, shifts, or cycles).

18 From the dictionary, it should be noted the word “temporal” is taken to mean “of or related to time.” Of course, this definition could be applied to a short-term or long-term effect. However, for purposes of this book and six sigma work, we naturally apply its meaning in a long-term sense. For example, when characterizing a performance variable, we often seek to accomplish two things. First, we attempt to isolate the short-term influence of random, “transient” effects (instantaneous errors). In general, transient errors usually prove to be of the random variety. Second, we isolate those factors that require the passage of time before their unique character can be fully identified or otherwise assessed. Such errors are time-dependent and, as a consequence, are often referred to as “temporal errors.” From this perspective, it is easy to understand how the collective influence of transient effects can govern the short-term capability (instantaneous reproducibility) of a process. Given this, it is now easy to reason how the total set of temporal effects (coupled with the aggregate transient effects) determine the long-term (sustainable reproducibility) of a process. Again, we must take notice of the fact that any given transient or temporal effect can be of the random or nonrandom variety. However, as previously stated, transient effects most generally induce a random influence whereas temporal effects are generally manifested as both.


(as they are mutually exclusive), we choose most frequently to use the form

“defect opportunity” in recognition of certain quality conventions.19

From an industrial or commercial perspective, the “set of conditions”

just mentioned can be associated with a set of performance standards. For

example, we can offer a set of performance standards in the form often given

as LSL < T < USL. In this form, a “potential opportunity” can be fully

described by the relation Op = f (LSL, T, USL), where Op is the potential

opportunity, LSL is the lower specification limit, T is the target value

(nominal specification) and USL is the upper specification limit. In addition,

the operational condition of the corresponding process distribution (defined by

the parameters µ and σ) must be made known or rationally estimated, and

then “mated” or otherwise contrasted to the performance specifications so as

to place the opportunity in a kinetic state.

Thus, a “kinetic opportunity” can be fully prescribed by the simple

relation Ok = f (LSL, T, USL, µ, σ), where µ is the corresponding process

mean, and σ is the standard deviation of the corresponding process.20

Essentially, this relation implies that a kinetic opportunity can only be created

when these five key factors are mechanistically interacted or otherwise

interrelated in real time and space. From a different angle, we can say that a

kinetic opportunity can be brought forth into real time and space only when

the performance specification of a given design feature is married or

otherwise operationally mated to its corresponding process capability (process

distribution). Only then can a probability of success or failure be rationally

estimated, declared or consequentially established.

19 The reader should recognize that the idea of “error” and that of a “defect” are closely related, but not

necessarily synonymous. For example, let us postulate the marriage of a certain process to a symmetrical bilateral specification, such that µ = T. In addition, we will also postulate the existence of a particular error described as δi = Yi - µy. In this case, the deviation δi is fully recognized as an error, but its vectored magnitude may not be large enough to constitute a defect (nonconformance to specification).

20 The reader is again reminded that our discussion is based on the assumption of a random normal

variable.
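As a computational aside, the following minimal sketch (Python; the function name and numeric values are illustrative assumptions, chosen so that a centered normal CTQ has its ±6σ limits coincide with the specification limits) shows how, once the specification limits and the process parameters µ and σ have been mated, a probability of nonconformance can be estimated.

    from scipy.stats import norm

    def nonconformance_probability(lsl, usl, mu, sigma):
        # Probability of a defect for a kinetic opportunity, assuming a normal CTQ
        # whose parameters mu and sigma have been mated to the limits LSL and USL.
        return norm.cdf(lsl, mu, sigma) + norm.sf(usl, mu, sigma)

    # Illustrative values: a centered CTQ whose +/- 6 sigma limits meet the spec limits.
    p_centered = nonconformance_probability(lsl=70.0, usl=130.0, mu=100.0, sigma=5.0)
    p_shifted = nonconformance_probability(lsl=70.0, usl=130.0, mu=107.5, sigma=5.0)

    print(p_centered * 1e9)   # about 2 defects per billion opportunities
    print(p_shifted * 1e6)    # about 3.4 defects per million opportunities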


It should go without saying that if any of the underlying conditions are

fully absent, unknown or not established, a kinetic opportunity cannot be

declared. However, a potential opportunity can be acknowledged. For

example, if the performance specifications USL, T and LSL do in fact exist,

but the corresponding parameters µ and σ are unknown or have not been

empirically estimated and made relational to the specifications, then the

opportunity would only exist in a potential state. As a result, it would not be

possible to estimate the probability of a nonconformance to standard. In other

words, if an opportunity does not kinetically exist in real time and space, it

should not be counted among those that do exist (for reporting purposes).

On the other hand, if the opportunity is kinetic, but the performance is

not regularly assessed or otherwise measured and reported, the opportunity

would be declared as “passive” in nature. Consequently, it should not be

included among those opportunities that are active by nature (regularly

measured and reported). Thus, we have the operational guideline that says: a

defect opportunity should only be “counted” if it is regularly assessed

(measured in terms of conformance to standards) and subsequently reported

for purposes of quality management. This is to say the opportunity must not

only be kinetic, it must be active as well.

Application of this general rule and its underlying precepts will

significantly reduce the spurious practice of denominator management where

such performance metrics as defects-per-million-opportunities (dpmo) are

concerned.21 Given the nature of such quality metrics (like dpmo), the

21 The colorful term denominator management is used to describe the practice of inflating or otherwise

distorting the denominator term of the classic quality metric called defects-per-opportunity. As should be apparent to the informed practitioner, such a practice is most often applied to effectively mask or confound the true quality of a product or service. For example, consider a simple printed circuit board (PCB) that employs through-hole technology. In this case, we will exemplify the soldered connection between the two leads of a standard carbon resistor and the PCB. Given this, it is understood that each component lead must be adequately soldered to the PCB at two different but related points (i.e., on the top-side and bottom-side of the board). For the sake of discussion, let us say that the performance category called “solder joint pull strength” is the CTQ of concern. Given the nature of this CTQ and application technology at hand, it should be quite evident that each PCB connection constitutes an independent opportunity to realize a pull-test failure. In other words, each lead of the resistor represents a defect opportunity. If one lead of the resistor passes the pull test and the other lead fails the test, then the defects-per-opportunity metric would be properly presented as dpo = d / o = 1 / 2 = .50. A more liberal perspective would hold there are four defect opportunities since there would exist


management focus should be on minimizing the numerator, not maximizing

the denominator. Naturally, the practice of denominator management should

be highly discouraged as it does nothing more than thwart judicious attempts

to create “true” quality improvements. Such false reporting not only harms

the producer but the customer as well. Although the practice of denominator

management can create the illusion of quality improvement, such fictitious

gains are inevitably brought to light over time as the lack of correlation

between field performance, reliability and process performance becomes

known.

2.8 Qualifying the interface

As many practitioners of six sigma know, designing a product or service

is often a tenuous and iterative process, fraught with many uncertainties.22 As

an integral part of such a process, various interventions are made to either

eliminate or reduce various forms of risk. For example, producibility analyses

are commonly undertaken to examine and ultimately enhance the viability of

manufacture. Unfortunately, such efforts frequently miss or fall short of their

aims and intents, often due to the absence of a science-based methodology

that will sufficiently interface a performance specification to its corresponding

process distribution.

To conceptually illustrate such a shortfall, consider the absence or

misconduct of a design-process qualification procedure (DPQ). Without a

four separate solder joints. In this event, the defects-per-opportunity would be wrongfully reported as dpo = d / o = 1 / 4 = .25. Even more liberal would be the case that advocates six defect opportunities – four solder joints and two leads. Taken to an extreme, some conniving managers might even try to say there exist eight defect opportunities – four solder joints, two leads, and two through-holes. In this case, the product quality would be given as dpo = d / o = 1 / 8 = .125. In this way, management could inappropriately create a 4X quality improvement by simply changing the “rules of defect accounting.” Thus, we have improvement by denominator management. To avoid such an error of leadership, we must recognize that any given unit of product or service will inherently possess “Y” number of critical failure modes, where each mode has “X” number of active chances. Thus, the total number of defect opportunities can be described by the general relation O = Σ( Y * X ).

22 For purposes of simplified communication, the author shall define the term “product” to mean any form of deliverable resulting from a commercial or industrial endeavor or process. In some cases the “product” may be a process, such as those often encountered in the service sector. In addition, any performance characteristic that is vital to customer or provider satisfaction will be herein referred to as a “critical-to-quality characteristic,” or CTQs for short.


statistically valid way to qualify a design (relative to its corresponding

processes), we are unable to confidently establish “interface quality.” In other

words, we cannot speak to the “quality of marriage” that exists between the

design and its corresponding production process. In such instances, it is

possible that the allowable bandwidth of a performance specification does not

adequately “fit” the operational bandwidth of the process. Analogously

speaking, the process owner’s automobile is inconveniently wider than the

designer’s garage door. When the sufficiency of unionization between a

design and process is procrastinated until initial production is already

underway, we certainly have a formula for disappointment, failure or both, as

reproducibility errors will likely be bountiful.

As most practitioners of six sigma are all too aware, many product

design organizations simply put an idea on paper and then “throw it over the

wall” to see if the configuration is producible or viable in terms of

reproducibility. In some cases, to make such an assessment, the design is

exercised or otherwise tested during a limited production run. Of course, this

type of approach for studying producibility is undertaken to work out or

otherwise resolve any “unanticipated design flaws and process bugs” prior to

full-scale production.

When problems arise, stopgaps are plugged into the process, or the

product design is somehow changed to accommodate the unwanted

intervening circumstances. Needless to say, such a highly reactionary

approach is not a very productive or efficient way of assuring performance

and producibility. But without a scientific process to follow, perhaps the trial-and-error approach serves as well as any.

If the results of a DPQ prove unfavorable, the design (and process) is

subsequently “tweaked” until an acceptable result is obtained. Sometimes, a

substantial redesign is undertaken. Other times, the many “marriages”

embedded within a design prove to be so inconvenient and costly that the

entire product is wiped off the business radar screen.


Nevertheless, an alternative to the test-discover-tweak approach is the

six sigma method. Essentially, the six sigma way is a more scientific

approach that is grounded in mathematical statistics. Of course, the

overriding purpose of a six sigma DPQ is to statistically prescribe, analyze

and validate the marriage between the design and its conjugal processes.

From this perspective, it is easy to see how the quality of such a marriage can

be used to forecast the relative extent to which the value entitlements will be

realized.23

As many already know, there are a variety of existing statistical tools

and methods that are fully capable of characterizing and optimizing the

producibility of a design before the fact, not after. In other words, the intent is

to assure the realization of value entitlements during the process of design, not

during the course of production. Even more importantly, the six sigma method

of analysis (as prescribed in this book) will provide such assurances with a

known degree of statistical risk and confidence.

3.0 Interrogating the Context

3.1 Articulating the goal

Before proceeding with a pervasive discussion that will answer the

driving questions underpinning this book, we should first review several of the

key tenets associated with the statistical idea of six sigma. To enrich this

perspective, let us briefly comment on what six sigma “is” and what it “is

not.” This is important because many newcomers to the world of six sigma do

not fully appreciate the fact that the idea of six sigma originates at the

opportunity level of a deliverable. In this context, the word “deliverable”

23 There is usually a performance expectation for each and every critical feature in a system. Of course,

such specifications and requirements are derived from higher-order negotiations between the customer and provider about what constitutes “value” in the business relationship. When such value is achieved or exceeded, even for a single CTQ, we can say that entitlement has been realized. In this sense, value entitlements are rightful expectations related to the various aspects of product utility, access and worth. For example, there are three primary physical aspects (expectations) of utility – form, fit, and function. In terms of access, we have three basic needs – volume, timing and location. With respect to worth, there exist three fundamental value states – economic, intellectual and emotional.


should be interpreted as a product, service, transaction, event or activity. It is

generally described as “that which the customer seeks to purchase.”

For example, when we refer to a certain deliverable as being “six

sigma,” we do not mean that each unit will contain only 3.4 defects.

Furthermore, we do not mean that only 3.4 units-per-million production units

will contain a defect, as this would imply that (on average) only 1 out of about

every 294,118 units will exhibit a quality infringement of some form or type.

What we do mean is quite simple. For any type of deliverable, each defining

critical-to-quality characteristic (CTQ) will exhibit a 6σ level of

instantaneous reproducibility, or “capability” as some would say.24 However,

this model level of capability is degraded to 3.4 defects-per-million

opportunities (dpmo), owing to certain process variations.25

3.2 Polishing the definition

Perhaps, through an example, we should set forth and interrogate a more

technical definition of six sigma. By doing so, we will be able to gain deeper

insight into its original intent and meaning. To this end, let us consider a

random performance variable (Y) in the context of a symmetrical-bilateral

specification (two sided tolerance with a centered nominal specification).26

24 Holistically speaking, any design feature (or requirement) constitutes a quality characteristic.

Interestingly, such characteristics are also known as “potential defect opportunities.” If a defect opportunity is vital or otherwise critical to the realization of quality, it is most typically called a critical-to-quality characteristic and designated as a “CTQ.”

25 Based on this, it is only natural that the defects-per-unit (dpu) will increase as the number of CTQs are increased, given a constant and uniform level of process capability. As a result of this, the DPU metric is not a good comparative index for purposes of benchmarking. In other words, DPU should not be used to compare the inherent quality capability of one deliverable to some other type of deliverable, owing to differences in complexity. However, by normalizing the DPU to the opportunity level, and then converting the defect rate to a sigma value (equivalent Z), it is possible to compare apples-to-oranges, if you will. Only then do we have a level playing field for purposes of benchmarking and for subsequently comparing dissimilar phenomena.

26 We naturally recognize that a symmetrical-bilateral specification is arguably the most common type of performance requirement. As a consequence, this particular type of design expectation was selected to conventionally idealize a statistically-based definition of six sigma capability. Nevertheless, we must also acknowledge the existence of asymmetrical-bilateral specifications, as well as unilateral specifications (one-sided). While the unilateral case can be defined by either side of a symmetrical-bilateral specification (with or without a nominal specification), the short-term error rate is consequently reduced to one defect-per-billion-opportunities, or DPBO = 1.0. However, the asymmetrical bilateral case presents some interesting challenges when attempting to define a six sigma level of capability. For example, consider an asymmetrical-bilateral performance specification while recognizing that a normal distribution is symmetrical – indeed, an interesting set of circumstances. Given this framework, a six


From a design engineering perspective, a six sigma level of capability can be

theoretically prescribed by the a priori assignment of 50 percent design

margins.

Of course, such “guard banding” of the specification limits is imposed to

account for or otherwise counterbalance the influence of uncertainties that

induce process repeatability errors. Naturally, such uncertainties are reflected

in the form of variation during the course of production. In light of such

variation, we establish a bilateral design margin of M = .50 so as to provide a

measure of resilience. Given this, the magnitude of necessary guard banding

can be theoretically and equivalently realized by hypothesizing a six sigma

model of reproducibility during the course of design.

Naturally, such a postulated performance distribution would be normal

in its form and would carry infinite degrees of freedom (df). In addition,

the three-sigma limits of such a distribution are conveniently used as the

pragmatic boundaries that prescribe unity. Given these factors, we are able to

establish an operating margin with respect to the performance specification

that is theoretically equivalent to 50 percent.

By the conventions of quality engineering, we naturally understand that

the instantaneous reproducibility of a design feature can be described by

several different but related indices of short-term capability. For example, it

is widely known that the short-term (instantaneous) capability of a process can

be generally described by the relation ZST = (T – SL)/ σ ST , where ZST is the

short-term standard normal deviate, T is the specified target value, SL is a

specification limit (upper or lower), and σST is the short-term standard

deviation.27 Of course, this particular performance metric assumes µ = T

sigma level of capability must be conditionally associated with the most restrictive side of the specification. In other words, the capability must be made relational to the smallest semi-tolerance zone. But if for some pragmatic reason it is more beneficial to locate the process center off target (in the form of a static mean offset), the short-term definition of six sigma becomes highly relative. For such instances, sound statistical reasoning must prevail so as to retain a definition that is rational, yet theoretically sound.

27 It must be recognized that the short-term standard deviation (root-mean-square) is a statistical measure of random error that, when properly estimated, provides an index of instantaneous reproducibility. In this regard, it only reports on the relative extent to which random background variation (extraneous noise) influences the “typical mean deviation” that can be expected at any given moment in time. In this sense,


(centered process). Given that µ = T, and the fact σ ST constitutes a measure of

instantaneous reproducibility, it should be evident that ZST represents the

inherent performance capability of the corresponding process. Thus, ZST must

be viewed as a “best case” index of reproducibility.

Another common and closely related index of capability is given by the

relation Cp = |T – SL| / 3σ ST. For purposes of comparison, it can be

algebraically demonstrated that Cp = ZST / 3, where 3 is a statistical constant

that defines the corresponding limit of unity.28 In this context, we naturally

understand that the process capability ratio Cp is merely one-third of the

quantity ZST. Thus, a six sigma level of short-term capability (instantaneous

repeatability) is given as ZST = 6.0, or Cp = 2.0 if preferred. Consequently, it

can be said that a six sigma level of instantaneous reproducibility is

distinguished by a ±6σST random normal distribution that is centered between

the limits of a symmetrical, bilateral performance specification, thus realizing

the design expectation M = .50. In this context, the ±6σST limits exactly

coincide with their corresponding design limits – the upper specification limit

(USL) and lower specification limit (LSL), respectively. To better visualize

the six sigma model of instantaneous reproducibility, the reader’s attention is

directed to figure 3.2.1.
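As a small computational aside, the sketch below (Python; the numeric values are illustrative only) reproduces the relations just described, using the absolute semi-tolerance |T – SL| so that the index is positive for either specification limit.

    def z_short_term(target, spec_limit, sigma_st):
        # ZST = |T - SL| / sigma_ST, assuming the process is centered such that mu = T
        return abs(target - spec_limit) / sigma_st

    def cp(target, spec_limit, sigma_st):
        # Cp = |T - SL| / (3 * sigma_ST), which is algebraically ZST / 3
        return z_short_term(target, spec_limit, sigma_st) / 3.0

    # Illustrative values only: T = 100, USL = 130, sigma_ST = 5
    print(z_short_term(100.0, 130.0, 5.0))   # 6.0, a six sigma level of short-term capability
    print(cp(100.0, 130.0, 5.0))             # 2.0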

The reader must recognize that the given figure only provides the right-

hand side of a symmetrical-bilateral performance specification. Since the left-

hand side is a mirror image of the right-hand side, there is little need to

discuss both. Consequently, the ensuing discussion is simplified without loss

of specificity.

it constitutes the magnitude of instantaneous error that emanates from the system of causation and is, therefore, a measure of inherent capability, also called entitlement capability.

28 The uninformed reader should understand that unity (per se) is statistically constituted by 100 percent of the area under the normal distribution. Given this, we naturally recognize that the “tails” of a normal distribution bilaterally extend to infinity. However, conventional quality practice often “trims the tails” of such a distribution and declares that unity exists between the three sigma limits. This is done in the interests of enjoying certain analytical conveniences. Of course, this convention logically assumes the area extending beyond the three-sigma limits is trivial and, therefore, inconsequential. Perhaps such an assumption is reasonable when balancing statistical precision against the demands of quality reporting.


Figure 3.2.1

Depiction of a Centered Short-Term Six Sigma Critical-to-Quality Characteristic

that Reflects Only Transient Sources of Random Error

Although previously stated, it should again be recognized that Y is an

independent random normal performance variable. Thus, we naturally

understand that Y ~ NID(µ,σST) such that µ = T, where T is the specified

target value (nominal specification). Based on these model circumstances, the

short-term quality goal of a six sigma characteristic is statistically translated to

reflect one random error per 500 million chances for such an error, or simply

two defects-per-billion opportunities, but only for the centered, symmetrical,

bilateral case. The unilateral case (no target specified) would reflect only one

defect per billion opportunities.


3.3 Inflating the error

Over a great many cycles of production, we are inevitably confronted

with the natural occurrence of transient and temporal effects of a

circumstantial and random nature. It should go without saying that the cumulative effect of these errors can be quite significant, as they ultimately

induce a consequential impact on the long-term reproducibility of Y. Of

course, the pragmatic nature of this impact is manifested in the form of an

“inflated” short-term standard deviation over many cycles of the process.

Uniquely stated, the short-term (instantaneous) error model is

compensated or otherwise corrected by enlarging the “typical” root-mean-square deviation, also called the standard deviation. However, in practice, the exact

magnitude of such a compensatory measure is theoretically established by

way of the chi-square distribution (for a given df and α). Again, this is the

primary means of compensating or otherwise mitigating the short-term

performance model for a wide array of transient and temporal uncertainties (of

a random nature). As should be intuitively evident, such variations will

inevitably arise during the course of protracted process operation.
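The text notes that the exact magnitude of such a correction is established by way of the chi-square distribution for a given df and α. Purely as a hedged illustration of that general idea, the sketch below (Python) computes one plausible expansion factor of the form c = sqrt(χ²(1 – α, df) / df); this particular functional form, and the df and α values shown, are assumptions for illustration rather than the derivation developed in this book.

    from scipy.stats import chi2

    def inflation_factor(df, alpha):
        # One plausible chi-square based expansion of the short-term standard deviation:
        # c = sqrt( chi-square upper quantile at (1 - alpha, df) / df ).
        # Assumed form, shown for illustration only.
        return (chi2.ppf(1.0 - alpha, df) / df) ** 0.5

    for df in (10, 15, 30):
        print(df, round(inflation_factor(df, 0.05), 3))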

It should now be noted that such a compensatory inflation of the short-

term (instantaneous) standard deviation is employed purely for the purposes of

conducting a producibility analysis or a design optimization study.29 In the

context of a six sigma reproducibility model, the magnitude of inflation is

29 Many practitioners that are fairly new to six sigma work are often erroneously informed that the proverbial “1.5σ shift factor” is a comprehensive empirical correction that should somehow be overlaid on active processes for purposes of “real time” capability reporting. In other words, some unjustifiably believe that all processes will exhibit a 1.5σ shift. Owing to this false conclusion, they consequentially assert that the measurement of long-term performance is fully unwarranted (as it could be algebraically established). Although the “typical” shift factor will frequently tend toward 1.5σ (over the many heterogeneous CTQs within a relatively complex product or service), each CTQ will retain its own unique magnitude of dynamic variance expansion (expressed in the form of an equivalent mean offset). Of course, we also recognize that the centering condition of a CTQ can be deliberately offset – independently, or concurrently. Naturally, such a deliberate offset in the process center is frequently employed to enjoy some type of performance or business-related benefit. In no way can or should a “generalized” shift factor be defined to characterize or otherwise standardize such an offset in the mean, nor should it be confused with the idea of a compensatory static mean offset (such as discussed in this book). Although both types of mean offset constitute “shifting” the process center, their basic nature and purpose is radically different and should not be confused.


expressed as an expansion factor and quantified in the form c = 1.33. In this

context, c is often referred to as the six sigma correction. Thus, the general

reproducibility of Y is degraded or otherwise diminished via a compensatory

inflation of the short-term standard deviation such that σLT = σST × c, where c is

the inflationary correction, σ ST is the short-term standard deviation (index of

instantaneous random error), and σ LT is the expected long-term standard

deviation (index of sustained random error).
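Numerically, and with illustrative values only, the compensatory inflation works as sketched below (Python).

    sigma_st = 5.0            # short-term (instantaneous) standard deviation, illustrative
    c = 1.33                  # the six sigma correction (expansion factor) cited above

    sigma_lt = sigma_st * c   # sigma_LT = sigma_ST * c, the expected long-term standard deviation

    usl, mu = 130.0, 100.0    # illustrative specification limit and centered process mean
    z_st = (usl - mu) / sigma_st   # 6.0
    z_lt = (usl - mu) / sigma_lt   # about 4.5

    print(sigma_lt, z_st, round(z_lt, 2))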

The long-term inflationary effect of transient and temporal sources of

random error on a short-term six sigma performance distribution is presented

in figure 3.3.1. The reader should notice that the net effect of transient and

temporal error (of the random variety) results in a 1.5σ ST loss of design

margin. Here is yet another perspective of the proverbial “shift factor”

commonly employed in six sigma work.30

Figure 3.3.1

Depiction of a Six Sigma Critical-to-Quality Characteristic that Reflects Transient and Temporal Sources of Random Error

30 These assertions will be theoretically demonstrated later in this discussion. For the moment, the reader

is kindly asked to faithfully accept this premise without proof.


[Figure 3.3.1 annotations: Case A (short-term) = 6.0σ with ppm = .001; Case B (long-term) = 4.5σ with ppm = 3.4; on the depicted scale, µ = T = 100, the 50% margin boundary falls at 115.0, the 25% margin boundary falls at 122.5 after the 1.5σ shift, and USL = 130.0.]


3.4 Calibrating the shift

So as to conceptually simplify the inflationary effect of transient and

temporal errors (of a random nature), and to enjoy a more convenient form of

application, an equivalent mean offset is often applied to the model

distribution of Y. In the spirit of six sigma, such a quantity is expressed in the

form of δ = 1.5σST. Of course, the relative direction of such a linear

correction to µ can be positive or negative, but not both concurrently.

However, it is most often applied in the “worst-case” direction – when testing

or otherwise analyzing the performance of a design.

Applying this compensatory correction to the short-term distribution

(illustrated in figure 3.4.1) reveals a long-term performance expectation of

6σST - 1.5σST = 4.5σLT. Expressed differently, the resulting long-term

capability is given as an “equivalent” figure of merit and expressed in the

form ZLT = 4.5. Under this condition, the design margin is consequentially

reduced to M = .25. Of course, the remaining safety margin of 25 percent is

still large enough to absorb a fairly substantial shock to process centering,

owing to some type or form of transient or temporal perturbation of a

nonrandom nature. Of course, such a shock may or may not be manifested as

a momentary disturbance to the process center.31 Statistically translating the

4.5σ LT level of capability into defects-per-million-opportunities reveals that

dpmo = 3.4. For the reader’s convenience, the long-term “shifted” model of

31 However, such a shock effect is often manifested as a transient (short-term) disturbance to the process

center. When this happens, the probability of a defect temporarily increases. Of course, the exact duration of this effect is generally indeterminate, owing to the random nature of the underlying system of causation. Because of this, it should now be apparent that if a design engineer seeks to establish a long-term safety margin of M = .25, the short-term marginal expectation must be generously greater than 25 percent. By enlargement of M, the engineer is able to provide a more realistic level of “guard banding” that cushions a performance distribution against certain types of disturbances resulting from transient and temporal effects that tend to upset process centering. Again, more will be said about this later in this book.


six sigma capability is depicted in figure 3.4.1. The reader must recognize the

probabilistic equivalency between figures 3.3.1 and 3.4.1.

Figure 3.4.1

Depiction of a Long-term Six Sigma Critical-to-Quality Characteristic Presented as an Equivalent Short-term Shifted Distribution

Thus, whenever we refer to a system, product, process, service, event or

activity as being “six sigma,” what we are really saying is that any given CTQ

related to that deliverable will maintain a short-term capability (instantaneous

reproducibility) of ±6σST and will exhibit no more than 3.4 dpmo over the

long-term (after many cycles or iterations of the corresponding process).
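For readers who wish to verify the figures quoted here, the short sketch below (Python) converts an equivalent capability into its corresponding defect rate, counting only the tail on the shifted, worst-case side of the specification.

    from scipy.stats import norm

    def dpmo_from_z(z):
        # Defects-per-million-opportunities for an equivalent capability Z, counting only
        # the tail on the shifted (worst-case) side of the specification.
        return norm.sf(z) * 1e6

    print(round(dpmo_from_z(4.5), 2))    # about 3.4 dpmo, the long-term six sigma expectation
    print(round(dpmo_from_z(6.0), 4))    # about .001 ppm, the single short-term tail of figure 3.3.1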


3.5 Rationalizing the Shift

In order to better grasp the original constructs underpinning the six

sigma shift factor, we must jointly consider the engineering rationale and

statistical context by considering a hypothetical performance characteristic.

Doing so will provide us with a conceptual platform from which to view the

justification for employing an equivalent 1.5σST static off-set to the mean of a

critical-to-quality performance characteristic. For purposes of our discussion,

we will simply refer to a critical performance variable as a CTQ . As related

to the process capability of our CTQ, it will be accepted that the population

standard deviation σ is short-term in nature, known, rational, and statistically

stable over time.32 We will also assert that the center of this normal

distribution is positioned such that µ = T, where µ is the process mean and T

is the target value of the design (nominal specification).

Let us now say the referenced CTQ will be independently replicated K

number of times during the course of executing a standard cycle of

production, where K is a relatively small quantity – often called a “production

batch.” It will also be known that, for any given batch, only N = 4 of the K

replicates would be arbitrarily selected for performance verification.33

Following this, the sample average (Xbar) would be dutifully computed for

the N = 4 performance measurements. Given many independent occurrences

of Xbar, it would then be possible to form a distribution of sampling averages.

With such a distribution, certain decisions about process centering could be

32 As this discussion point would naturally infer, the population standard deviation is fully known a priori

and genuinely reflects all known and unknown sources of random error (white noise). 33 For purposes of this discussion, it will be known to the reader (by definition) that the given sample

consisting of N = 4 members prescribes a “rational” sub-grouping of the measurements. In recognition of conventional quality practice, this assertion stipulates that the observed within-group errors are fully independent, random and normally distributed. Furthermore, sampling plans that involve the formation of rational subgroups often rely on a subgroup size within the general range 4 < N < 6, where the typical subgroup size is often defined as N = 5. Since subgroup size is positively correlated to statistical precision, it is proposed that the case of N = 4 can be pragmatically and operationally viewed as a “worst-case” sampling construct, especially when declaring the expected theoretical error associated with process centering. In other words, a design engineer is often not privy to the sampling plan that manufacturing intends to implement (for purposes of statistical process control). As a consequence, the design engineer should be somewhat pessimistic when attempting to analyze the influence of natural process centering errors on design performance. Hence, the reliance on “worst-case” sampling assumptions when analyzing the producibility of a design (prior to its release for production).


rationally made and scientifically interrogated during the course of design, as

well as production.34,35

To better illustrate the import of our latter discussion, we should closely

examine the simple case of N = 4 and α = .0027, where Xbar = T. With these

conditions in mind, it is more than reasonable to ask the central question: “For

any given cycle of production, what is the expected bandwidth around T

within which Xbar should fall – given only the expectation of random

sampling error?” In different form, the same question could be presented as:

“For any given cycle of production, how much could a sampled process center

be expected to momentarily shift from the nominal specification before such a

deviation can no longer be attributable to random variation?” In short, this

question seeks to uncover the maximum statistical extent to which a sample

average can be off-set from the target specification in the instance µ = T, but

only for the special case of N = 4 and 1 – α = .9973.

Obviously, an answer to this question would reveal the theoretical extent

to which a common process could momentarily “shift and drift” from its ideal

centering condition (owing to the sole influence of random sampling error),

given that the population mean µ is, in reality, centered on the nominal

specification such that µ = T. With such a rule-of-thumb, a product design

engineer could realistically emulate and better simulate the extent to which

normal process centering bias could influence the performance of a given

product design. It is from this perspective that we will explore the six sigma

shift factor and consider its implications for product design analysis and

optimization.

34 For example, the distribution of sampling averages is one of several theoretical constructs that is

essential to the proper construction and operation of an Xbar and R chart. Such statistical devices are often employed during the course of production to ensure the proper and sufficient management of process centering.

35 Knowledge of the distribution of sample averages would make it possible (and highly advantageous) to account for natural process centering error during the course of design. In this manner, the natural and expected errors in process centering (as would be normally experienced during the course of production) could be effectively neutralized or otherwise managed at the time of design configuration. Of course, the principles of robust design and mathematical optimization could be invoked to realize this aim.


To fully answer this historically stubborn question, we must first

consider the confidence interval surrounding the process average µ. With this

aim in mind, the experienced practitioner will recall that such a boundary

condition about µ can be given as Xbar – Zα/2σST / √N < µ < Xbar + Zα/2σST / √N, where µ is the population mean, N is the sample size, Xbar is the sample

average, Zα/2 is the required type I decision risk (expressed as a standard

normal deviate), and σST is the short-term population standard deviation. If

we standardize to the case NID (0,1) and let Xbar = µ, it would be most

apparent that the given confidence interval can be reduced to the form of –Zα/2 / √N < 0 < +Zα/2 / √N.

For the special case of N = 4 and α / 2 = .00135, we can easily obtain

the solution - 3 / 2 < 0 < +3 / 2, therein providing the standardized interval of

0 ± ZShift = 0 ± 1.5. Under such theoretical but conventional conditions, it is

reasonable to assert that only 2,700 subgroups out of every 1,000,000 would

produce a sampling average outside the interval T ± 1.5σST. In other words, it

is not likely that µ would be momentarily shifted more than 1.5σST from T,

owing to the presence of random sampling error. Hence, it can be statistically

concluded that the 1.5σ shift factor is a rational means for realistically and

meaningfully injecting the bias of process centering error into the ways and

means of a producibility analysis (repeatability study). Of course, the

statistical confidence underpinning such an assertion would be given as 100( 1

– α ) = 100( 1 – .0027 ) = 99.73 percent confidence, but only when

considering the special case of N = 4 randomly selected measurements drawn

from a normal distribution and where the population parameters µ and σST are

known a priori.36
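The arithmetic of this special case can be checked directly, as in the brief sketch below (Python; the only inputs are N = 4 and α = .0027, exactly as given above).

    from math import sqrt
    from scipy.stats import norm

    alpha, n = 0.0027, 4
    z_half_alpha = norm.ppf(1.0 - alpha / 2.0)   # about 3.0 for alpha/2 = .00135
    z_shift = z_half_alpha / sqrt(n)             # Z(alpha/2) / sqrt(N) = 3 / 2 = 1.5

    print(round(z_half_alpha, 3), round(z_shift, 3))
    # About 99.73 percent of sampling averages (N = 4) fall within T +/- 1.5 sigma_ST
    # when the process is truly centered (mu = T) and only random sampling error is present.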

In many organizations, these sampling conditions and process control

criteria are often at the forefront of daily operation. Given this, it would be

36 The reader must recognize that such a level of confidence ( 1- α = 1 - .0027 = .9973, or 99.73 percent)

is frequently employed in the quality sciences, especially in the application of statistical process control (SPC) charts. Statistically speaking, this particular level of confidence is defined or otherwise circumscribed by the ± 3.0σXbar limits commonly associated with the distribution of sampling averages.


most reasonable to artificially induce a 1.5σST shift in the target value of each

CTQ during the course of design analysis. By doing so, the engineer can

better study the performance repeatability of a product configuration prior to

its release for full-scale production. To this end, the methods and tools

associated with the practice of design for six sigma (DFSS) can be readily

employed so as to avoid, tolerate, or otherwise neutralize the influence of a

momentary shift in the process center.

3.6 Applying the shift

To better understand the implications of the six sigma definition and the

1.5σST shift, let us consider a simple example. Suppose that a particular

system is characterized by K = 2,500 opportunities. For the system design

discussed in this example, it will be known there is only one opportunity per

CTQ. 37 It will also be known that each opportunity is fully independent and

4.5σLT capable (in the long-term). Thus, for such a level of sustained

capability, we would expect (on average) only one nonconformance out of

every 1 / .0000034 ≈ 294,118 opportunities.

Based on these facts, the total defects-per-unit would be given as dpu =

p(d) * K = .0000034 * 2,500 = .0085. Given this, we recognize p(d) as the

statistical probability of nonconformance (per opportunity, over many cycles

of process operation). We also understand that K is the total number of

independent opportunities contained within the system (unit). With these

facts in mind, we note that about one out of every 118 systems (units) could

be expected to contain at least one CTQ that is defective (verified to be in a

state of nonconformance).

Owing to this level of long-term quality, it is often desirable to

approximate the probability of zero defects, or “throughput yield” as it is

frequently referred to. It is possible to provide such an estimate by way of the

37 The reader must recognize that a critical-to-quality characteristic (CTQ) is, by definition, a defect

opportunity (assuming it is actively assessed and reported). To illustrate, let us consider a product called “Z.” As expected, Z would most likely consist of Y number of CTQs, where any given CTQ could have X number of occurrences. Therefore, the total number of defect opportunities per unit of product would be computed as O = Σ(Y * X).


Poisson function. Considering this function, we would recognize that Y =

(np)^r · e^(–np) / r!, where n is the number of trials, p is the event probability, and r is

the number of such events. By rational substitution, we would further observe

that Y = (dpu)^r · e^(–dpu) / r!, where dpu is the defects-per-unit. Thus, for the

special case of r = 0 (zero defects), we are able to ascertain the throughput

yield (probability of zero defects) by the simple relation YTP = e^(–dpu).

For the case example at hand, we compute the throughput yield to be YTP

= e^(–dpu) = e^(–.0085) = .9915, or about 99 percent. This is to generally say there is

99 percent confidence that all K = 2,500 CTQ opportunities will “yield”

during the course of production – assuming that each characteristic

(opportunity) is fully normal, independent and exhibits a long-term capability

of 4.5σLT. If so, each system would then maintain a 99 percent probability of

zero defects.38

Reasoning from the flip side of our example, it can be said that the long-

term, first-time yield expectation is known to be YFT = .9999966. Since there

exists K = 2,500 independent yield opportunities per unit, the throughput yield

should be given as YTP = YFT^2,500 = .9915, or about 99 percent. In turn, the

defects-per-unit expectation could be computed as dpu = -ln(YTP) = -ln(.9915)

= .0085. Here again, the Poisson distribution is used to facilitate this

approximation.
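The computations of this example can be retraced with the short sketch below (Python), which simply follows the dpu and throughput-yield relations just described.

    from math import exp, log

    k = 2500              # independent CTQ opportunities per unit
    p_defect = 3.4e-6     # long-term probability of nonconformance per opportunity (4.5 sigma)

    dpu = p_defect * k            # defects-per-unit = .0085
    y_tp = exp(-dpu)              # throughput yield via Poisson with r = 0: Y_TP = e^(-dpu)
    dpu_back = -log(0.9915)       # recovering dpu from an observed throughput yield

    print(round(dpu, 4), round(y_tp, 4), round(dpu_back, 4))   # 0.0085, 0.9915, 0.0085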

So as to provide a first-order approximation of the short-term capability,

we merely add the “standard shift correction” of 1.5σ to the long-term

capability.39 In this case, we would compute ZST = ZLT + 1.5 = 4.5 + 1.5 = 6.0.

38 Of sidebar interest, the advanced reader will understand that the Poisson distribution can be employed

to establish the throughput yield of a process (likelihood of zero defects) when the dpu is known or has been rationally estimated. This is done by considering the special case of r = 0 (zero defects), where the quantity Y = [(dpu)^r · e^(–dpu)] / r! is reduced to Y = e^(–dpu). In this reduced form, the quantity Y represents the statistical probability of first-time yield. In other words, e^(–dpu) is a quantity that reports the statistical probability of a unit of product (or service) being realized with zero defects (based on the historical dpu or projection thereof).

39 Generally speaking, the shift factor is added to an estimate of long-term capability in order to remove

long-term influences, therein providing an approximation of the short-term capability. Conversely, the shift factor is subtracted from an estimate of the short-term capability in order to inject long-term influences, thereby providing an approximation of the long-term capability. For example, if the long-term capability of a process was known to be 4.5σ, and we seek to approximate the short-term


At this point, the reader is admonished to recognize that ZST is merely a

general figure of merit (performance index) that constitutes nothing more than

a high-level approximation of the prevailing instantaneous reproducibility (per

CTQ opportunity).40

In this example, we would naturally recognize that the approximated

value of ZST = 6.0 would statistically translate to an equivalent bilateral short-

term throughput yield expectation of 99.9999998 percent (per opportunity).

Based on this level of short-term yield, we could expect about one out of

every 1 / ( 1 - .999999998 ) = 1 / .000000002 = 500,000,000 defect opportunities to

be validated in a state of nonconformance.41 Based on these facts, the total

defects-per-unit expectation would be given as dpu = p(d) * K = .000000002 * 2,500 = .000005. Thus, we could expect about one out of every 200,000 units to contain at least one defective CTQ opportunity.

3.7 Framing the correction

At this point in our discussion, it should be generally understood that the

short-term standard deviation is at the core of many process capability

metrics. Without the short-term standard deviation, it would not be possible

to establish the instantaneous reproducibility of a process. To better

understand the nature of this particular performance measure, let us consider a

rational sampling strategy consisting of g subgroups, each of which is

comprised of n sequential observations – where g is sequentially obtained and

substantially large, n is relatively small and the sampling intervals are

randomly determined. In this manner, we are able to “trap” the vast majority

capability, then 1.5σ would be added to 4.5σ, therein providing the short-term estimate of 6.0σ. Conversely, if the short-term capability was known to be 6.0σ, and we seek to approximate the long-term capability, then 1.5σ must be subtracted from 6.0σ, therein providing the long-term estimate of 4.5σ

40 In other words, it is a “best guess” in the light of ambiguity – especially in the absence of actual short-term performance information. As such, it must not be viewed as an empirical measure of inherent capability or instantaneous reproducibility – as many uninformed practitioners might falsely believe. It is simply a rational “ballpark” approximation, expectation, or projection of short-term performance – made in the absence of empirical data or experiential information.

41 Naturally, this assumes that the process of verification is perfect. This is to say that the test or inspection process is fully devoid of any type or form of error. In other words, the probability of decision error is zero, regardless of its nature -- Type I ( α ) or Type II ( β ).


of white noise (random variation) over time, while concurrently disallowing

the influence of nonrandom sources of variation.

As the ng observations are sequentially and progressively made

available over time, the short-term standard deviation will asymptotically

approach its maximum limit – reflecting only random sources of variation. Of

course, this provides a rational estimate of inherent capability (instantaneous

reproducibility). However, whenever ng is relatively small and sequentially

originated (say ng = 30), it is often not possible to trap all of the non-

deterministic sources of error (owing to the fact that some of the sources could

be time-dependent). Consequently, it would be analytically desirable to

compensate the biased estimate of error (due to the constrained sampling

strategy) so as to approximate or otherwise forecast the true magnitude of

random error naturally inherent to the defined population.

Since all of the random errors cannot be made instantaneously available

for analytical consideration and evaluation, we attempt to compensate the

estimate of instantaneous reproducibility (short-term standard deviation) by

simply expanding its relative magnitude by a rational correction. As we shall

come to understand later in this book, the specific magnitude of expansion in

the short-term standard deviation (due to un-sampled time-dependent sources

of white noise) can be expressed in the form of an equivalent mean offset (on

the order of ZShift = 1.5σ). Such an offset can be given in unilateral or

bilateral form, depending upon the application circumstances. So doing

provides an analytical model of the long-term capability, but expressed as an

equivalent short-term distribution experiencing a transient shift.
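As a minimal numeric sketch (Python, with illustrative values only), the expanded long-term standard deviation can be re-expressed as an equivalent static mean offset in units of σST, as described above.

    usl, mu = 130.0, 100.0        # illustrative specification limit and centered mean
    sigma_st = 5.0
    sigma_lt = sigma_st * 1.33    # expanded (long-term) standard deviation

    z_st = (usl - mu) / sigma_st  # 6.0
    z_lt = (usl - mu) / sigma_lt  # about 4.5

    # Re-expressing the inflation as an equivalent static mean offset, in units of sigma_ST:
    z_shift = z_st - z_lt         # about 1.5
    print(round(z_shift, 2), round(z_shift * sigma_st, 2))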

Without such a correction, it would often not be possible to

meaningfully execute certain types of engineering and producibility analyses

that are dependent on a consideration or evaluation of the full range of random

variation inherent to a system of causation. Consequently, the six sigma shift

factor should be thought of as a corrective measure for calibrating the

instantaneous reproducibility of a process for unaccounted, but natural long-

term random variations. More specifically, the shift factor is a statistically


based mechanism that is intended to adjust the short-term capability of a

performance variable for the influence of unknown (but yet anticipated) long-

term random variations.

In this regard, the unilateral correction can be implemented by

considering the quantity µ + 1.5σ, or µ −1.5σ depending on the worst-case

direction of effect. Of course, the bilateral correction is recognized as µ ±

1.5σ, but is seldom implemented as such – owing to the fact that the expanded

standard deviation is of more convenient form. Thus, the net effect of

unknown long-term sources of random error can be rationally postulated and

generally accounted for.

The ability to postulate and subsequently induce such a statistical bias is

often essential for the effective study of certain types of physical phenomena

and engineering conditions. For example, it might make more “engineering

sense” to simulate an electrical circuit or conduct a mechanical tolerance

analysis under the condition where the nominal (target) specification of each

component is temporarily repositioned to a location other than the nominal

specification. This is often done to facilitate or otherwise enhance the larger

engineering analysis. But without a corrective guideline (such as the 1.5σ

shift factor), the designer is often uncertain as to the extent of adjustment that

should be applied.

In most engineering applications, the engineer naturally recognizes the

design will be subjected to some “process perturbations and anomalies” over

time, but the relative extent to which such random errors should be considered

is often unclear or ambiguous. In other words, the practicing engineer often

does not know how to declare or define a corrective device to sufficiently

compensate for the overestimation or underestimation of process capability.

Because of this, the designer simply reverts to classical worst-case analysis so

as to be “absolutely sure.” Of course, the practice of worst-case analysis

inevitably leads to overly conservative design specifications. To this end, the

six sigma concept of an equivalent static mean shift and that of the expanded

short-term standard deviation provide the engineer with the analytical


capability and flexibility to combine the benefits of statistical reasoning with

the merits of worst-case analysis.
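To make the correction a bit more tangible, consider the following minimal Python sketch of the unilateral adjustment described above. The two components, together with their nominal values, standard deviations and worst-case drift directions, are purely hypothetical illustrations; the sketch simply shows how a designer might reposition each nominal by 1.5σ before carrying out a larger stack-up or simulation study.

    # Illustrative sketch: applying the 1.5-sigma correction to a hypothetical
    # two-component stack-up prior to a larger tolerance study. All nominal
    # values, sigmas and drift directions below are assumed for example only.

    Z_SHIFT = 1.5   # equivalent mean offset discussed in the text

    # (nominal, short-term sigma, worst-case direction of drift)
    components = [
        (25.00, 0.05, +1),   # e.g., an outside dimension drifting toward its USL
        (10.00, 0.02, -1),   # e.g., an inside dimension drifting toward its LSL
    ]

    # unilateral correction: reposition each nominal by +/- 1.5 sigma
    shifted = [mu + sign * Z_SHIFT * sigma for mu, sigma, sign in components]

    print("nominal stack :", sum(mu for mu, _, _ in components))
    print("shifted stack :", sum(shifted))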

3.8 Establishing the center

Related to the previous discussion is the issue of establishing an

operating center for a process. To illustrate some of the more prevalent

nuances associated with establishing a pre-determined process center (target

specification), let us evaluate a particular CTQ. For the sake of discussion, it

will be known that the related process is a “system of material removal,” such

as a milling operation. In this particular instance, we will say the CTQ is an

outside dimension related to a particular steel block. With respect to this

CTQ, we will also declare that the design specification sets forth a certain

centering condition called the “target value,” simply abbreviated as “T.”

However, we will further assert that after some number of process cycles, it is

economically demonstrated that another process set point would be more

beneficial. For the sake of our discussion, we shall refer to such a point as ζ.

For the most part, the manufacturing implications of ζ are fairly self-

apparent. If the process is initially centered on T such that µ = T, we surely

understand that µ will eventually “drift” toward the upper specification limit

(USL) over time – owing to the unavoidable effects of natural tool wear.42

Interestingly, at some point called “P” the tool must be replaced, regardless of

how much “useful life” might remain in the tool. However, if the initial

process centering is established at ζ such that ζ < T, the average tool life can

be significantly extended, thereby resulting in a legitimate cost savings

without sacrificing capability or quality.

Hence, the idea of “optimality” now revolves around ζ and not T, even

though the design clearly specifies T as the required nominal condition. In the

scenario at hand, the process owner is more concerned about the disparity

42 In this particular case example, the “centering drift” is biased toward the USL, owing to the fact that an

outside dimension (OD) is being considered. If an inside dimension (ID) is being considered, the drift will be biased toward the LSL.


between the natural process center (µ) and ζ, versus any discrepancy that

might exist between µ and T. From this vantage point, we recognize that

such a bias provides the process owner with a significant benefit.

To better understand the practical meaning of this discussion, let us form

another application example. Suppose we are considering the thickness of

nickel-plating on a particular engine part. In this case, we would say that

plating thickness is the CTQ of concern. For the sake of discussion, we will

also say that the design engineer established a symmetrical-bilateral plating

specification (in terms of metal thickness).

Under this condition, the process owner would seek to set ζ < T because

an under-plated part can be resubmitted to the process for additional plating,

but over-plated parts must be scrapped. Such scrap is generated because it is

not economical or practical to remove progressive layers of plating. Hence, to

set ζ < T makes more “manufacturing sense.” Given this, it is easy to see why

engineering and manufacturing often butt heads. Simply stated, the fate of

ζ is subject to negotiation because everyone has their own idea of what is

“optimum.” Again, the idea of “success” is relative.

This point is most dramatically reinforced when we consider a case that

involves the use of an asymmetrical bilateral tolerance. For purposes of

illustration, let us say that the design target T is intentionally located off-

center relative to the specification limits. In other words, T is asymmetrical

with respect to the USL and LSL. Design engineers employ this type of

performance specification to realize some form of technical benefit.

However, the process owner may recognize some other type of benefit

(usually of an operational nature) that supports a process center other than T,

say ζ. More specifically we will say that ζ is symmetrical with respect to

USL and LSL, but T is not.

Given that ζ is symmetrically located with respect to the specification

limits, and the process distribution is also symmetrical in terms of its shape

(i.e., a normal distribution), we naturally recognize that defects are inherently

minimized, especially when contrasted to the case µ = T, where T is


asymmetrical. As well, the process owner runs the risk of getting “dinged” by

engineering for not centering the process on the nominal specification T;

however, she stands to be praised by management for minimizing defects, cost

and cycle-time by the action of centering µ on ζ. Again, it is easy to see why

the idea of “success” is often relative.

The reader must fully understand that an equivalent shift factor can be

circumstantially computed for the quantities T – ζ and ζ − µ, but not in the

context of six sigma. The reason for this is quite simple – the exact form or

magnitude of these disparities cannot be statistically interrogated or studied a

priori in a generic way, as each case is unique. Consequently, any attempt

to establish a global correction or standardized constant based on such

disparities would be highly inappropriate and constitute a spurious practice,

owing to the “context sensitivity” of ζ.

While such examples certainly highlight the need for judicious design

practices, we will constrain our ensuing discussion about process centering to

the most common instance. In short, we will constrain our focus to the

symmetrical-bilateral case where T = ζ. The other cases have been

intentionally omitted to limit the length of this book without loss of

specificity. In other words, by only considering the instance T = ζ, we lose

some breadth of discussion, but realize significant depth and color.

4.0 Understanding the Shift

4.1 Identifying the expectations

Given the qualifying understandings presented thus far, we may now

proceed with our discussion about the practice of six sigma and the basis for

employing a 1.5σST shift. To begin, let us once again construct an application

scenario. For purposes of illustration, we shall consider the design of a certain

electrical system. To this end, the project design manager reported that the

initial system configuration was comprised of M = 1,000 performance

features. Of this number, V = 300 “value-centric” features were subsequently


determined to possess leverage with regard to utility, access or worth (the

basic ingredients of quality).43 Of the V = 300 value-centric features, Q = 160

were deemed essential to the realization of utilitarian value (form, fit and

function). Thus, it was established that 160 of the 1,000 design features (16

percent) were critical-to-quality. Consequently, these features were

designated as CTQs.44

With these circumstances in mind, the project manager declared the

short-term system-level confidence to be .99, or 99 percent. Of course, this

constitutes the collective confidence for all Q = 160 CTQs. Based on this, the

project manager was fairly confident about the system reproducibility,

especially since he had established the expectation based on statistical worst-

case assumptions. 45

Based on the system requirements, the instantaneous reproducibility for

each CTQ (also known as the short-term capability) was established by one of

the resident engineers. He was able to make such an estimate by statistically

normalizing the short-term system-level confidence to the opportunity level.

For the scenario at hand, the normalization was presented as CQ = Y^(1/Q) =

.99^(1/160) = .99994, or 99.994 percent. This is to say there would exist at least

43 Utility has to do with the form, fit and function of a deliverable. Access has to do with the various timing,

volume and location aspects associated with the delivery of a product or service. Worth covers the emotional value, intellectual value and economic value of any given deliverable.

44 We naturally recognize that the configuration and composition of a system’s design is unique in every case. In fact, the interactions within and between these two aspects of a design can spawn a very complex system of classification in terms of scope and depth. Owing to this, we often see the Pareto principle at play when considering a certain aspect of the design. Translated, this principle holds that a certain 15 percent of a system’s complexity will fully account for 85 percent of the value associated with a given aspect of quality (utility, access, worth). However, when the various aspects of quality are considered as a collective whole, the Pareto principle is often severely mitigated or otherwise distorted. In general, however, the Pareto Principle (85/15 rule) will emerge and become self-apparent as a given system of quality classification is hierarchically and progressively interrogated. The reader should be aware that many practitioners advocate the rule of Pareto to be 80/20. Regardless of analytical precision, the main lesson undergirding the Pareto Principle is about how the vital few often have more influence than the trivial many.

45 In the name of pragmatic communication, this author has made liberal use of the term “worst-case.” For the given context, it must not be interpreted as a mathematical absolute or engineering construct, but rather as a statistical boundary condition (much like the natural limits of a confidence interval). For example, one of the confidence bounds related to some mean (or standard deviation) can be thought of as the “statistical worst-case condition” of that parameter. In this context, the term is quite relative to such things as alpha risk and degrees-of-freedom, not to mention various distributional considerations. Nonetheless, its use carries high appeal for those not intimately involved with the inner workings of statistical methods. More will be said on this topic later on in the discussion.


99.994 percent certainty that any given utility-centric CTQ would comply

with its respective performance specification. Another interpretation would

be that 99.994 percent of the attempts to replicate or otherwise create any

given utility-centric CTQ would prove successful (with respect to the

performance specifications).

In this case, the instantaneous reproducibility (short-term confidence of

replication) statistically translated to a ±4.00σST level of bilateral capability

per CTQ (per quality opportunity). From a unilateral perspective, the

capability would be given as 3.83σST. Either way, the odds are about one out

of 15,920 that any given attempt to produce a CTQ will fail or otherwise fall

short of performance expectation (unilateral or bilateral, as the case may be).
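For readers who wish to verify these figures, the normalization and its translation into sigma levels can be sketched in a few lines of Python (the scipy library is assumed to be available; the values noted in the comments follow the case data above).

    from scipy.stats import norm

    Y_system = 0.99    # short-term system-level confidence
    Q = 160            # number of CTQs (quality opportunities)

    # normalize the system-level confidence to the opportunity level
    C_q = Y_system ** (1.0 / Q)                  # ~.99994 per CTQ

    # translate the per-CTQ confidence into sigma levels
    z_bilateral = norm.ppf(1 - (1 - C_q) / 2)    # ~4.00 (two-tailed)
    z_unilateral = norm.ppf(C_q)                 # ~3.83 (one-tailed)

    # odds that a single attempt falls outside the specification
    odds = 1 / (1 - C_q)                         # roughly one out of 15,900

    print(C_q, z_bilateral, z_unilateral, odds)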

Once the system configuration was finalized and agreed upon, the

project manager then appointed a certain quality engineer to continue the

producibility analysis and optimization exercise. At this point, several of

the engineer’s colleagues pondered: “Given the performance specifications of a

CTQ, how can someone go about analyzing the manufacturing viability

(reproducibility) of a design when a corresponding production process has yet

to be selected?”

For this case, the quality engineer (producibility analyst) decided to

begin her study by first isolating one of the most critical utilitarian-centric

design features. In this case, we will say she selected CTQ4. We will further

suppose that CTQ4 was assigned a nominal specification (target value) such

that T = 100 units. It will also be known that the tolerance bandwidth was

specified as B = ±30 units. Thus, the range of performance expectation for

CTQ4 was given as LSL < Y < USL, or simply 70 < Y < 130. At the onset of

the study, the analyst discovered that the original product designer imposed a

conventional safety margin of M = .25, or 25 percent. The analyst also

learned that this particular level of guard-banding was selected and

subsequently specified because of several reliability considerations

surrounding CTQ4. For the reader’s convenience, figure 4.1.1 provides a


graphical understanding of the design margins used in the case scenario under

consideration.

Figure 4.1.1

Visualization of the Design Margins Imposed on CTQ4

4.2 Conducting the analysis

At this point, the analyst decided it would be necessary to set forth the

short-term standard deviation that would be associated with CTQ4. Using the

short-term system-level producibility analysis as a backdrop, she computed

the quantity σA = (USL – T) / ZST = (130 – 100) / 4.0 = 7.50. Of course, this

particular standard deviation represents or otherwise constitutes the

instantaneous capability of CTQ4.46 For purposes of our discussion, we will

46 Instantaneous capability only reports on the short-term reproducibility of a characteristic. In other words,

it only considers the influence of random background variations (white noise, or pure error as some would say). In this context, the instantaneous (short-term) capability offers a moment-in-time “snapshot” of the expected performance error. An extension of this idea provides the understanding of “longitudinal capability.” The longitudinal capability (also called temporal capability) not only considers the influence


simply refer to this idealized standard deviation as the “short-term variation

model,” or SVM.47 In this case example, the specified SVM represented the

analyst’s assertion (preliminary expectation) about the instantaneous process

capability that would be needed to realize the short-term system-level

producibility goal as well as the anticipated value entitlements.48

Remember that the original design engineer established a uniform safety

margin (guard band) of 25 percent at both ends of the tolerance. With this in

mind, the analyst was able to employ a second approach for establishing a

short-term standard deviation. Thus, she computed the SVM and provided the

result as:

σA = (1 – M)(USL – T) / 3 = (1 – .25)(130 – 100) / 3 = 7.50 .

Eq.( 4.2.1 )

of black noise (nonrandom variations), but includes the influence of white noise as well. In the real world, short-term capability is always greater than long-term capability (in terms of Z) for a wide range of pragmatic reasons (e.g., the influence of tool wear, machine set-up and the like). Only when there is an absence of black noise will the two forms of capability be equal. Under this condition, the characteristic of concern is (by definition) said to be in a perfect state of “statistical control.” In other words, variation in the characteristic’s performance is free of assignable and special causes and, as a consequence, is subject to only random sources of error.

47 The short-term variation model (SVM) is offered as an analytical contrast to the long-term variation model (LVM). By definition, the SVM only reflects the influence of random variation (extraneous error of a transient nature), also called “white noise.” The LVM not only reflects random sources of error, but nonrandom sources as well. In this sense, the LVM echoes “gray noise” because it reflects the mixture of random and nonrandom sources of error. The differential between the SVM and the LVM portrays the pure effect of nonrandom variation, or “black noise” as it is often called. In general, it can be said that the influence of random error determines the bandwidth of a performance distribution, whereas the signal (central tendency) of that distribution is governed by nonrandom error. Thus, we say that T = W + B, where T is the total noise, W is the white noise and B is the black noise. Owing to this relationship, it should be apparent that the total noise can be decomposed into its component parts for independent analysis and optimization.

48 There are a number of different types and forms of variation design models (VDMs), such as that for a hypothetical mean and variance. In most cases, the VDM is a theoretical construct (or set of constructs) that is postulated so as to engage or otherwise facilitate some type of design-related analysis, simulation or optimization. Interestingly, in more progressive design organizations, the VDMs are provided in a database that consists of actual process capabilities and various types of parametric data. Such databases provide a distinct advantage when attempting to “mate” a CTQ specification to a production process. The pairing of a design specification with the performance capability of a candidate process is a key topic in the field of design for six sigma, or DFSS as it is most often called. Of course, the primary aim of such “pairing” is to optimize all value entitlements, not just those that are product performance or quality related.


Of interest, the reader should recognize that the denominator quantity of

3 was given as the analytical equivalent of unity. By providing such a

quantity, the analyst was able to rationally postulate a distribution based on

the specification limits. Thus, the SVM for CTQ4 was declared to be σA =

7.50, thereby satisfying the expectation M = .25, where M was made relative

to the upper specification limit (USL). Of course, the same would hold true

for the left side of the specification owing to its symmetrical-bilateral

character. We also recognize that the value σA = SVM = 7.50 is theoretical by

nature. As a consequence of construction, it was implicitly prescribed with

infinite degrees of freedom (df).49

As previously stated, this magnitude of planned variation (instantaneous

error) provided the analyst with a short-term process capability expectation of

±ZσA = ±4σA = ±30 units. Based on this, the ±4σA range of CTQ4 was

recognized to be 70 < Yi < 130, where Yi is the ith replication of CTQ4. A

visualization of this condition is fully illustrated in figure 4.2.1 and referenced

in the form of case “A.”

49 As a theoretical construct, the notion of degrees of freedom is fully independent of time, but not so in

practice. For example, it would take an infinite period of time to produce an infinite number of units. However, when considering the many approaches to the conduct of a producibility analysis, it should be recognized that any given VDM containing an infinite degrees of freedom can be declared as a short-term or long-term model, depending on application circumstances. For the case scenario at hand, we can say that the designer postulated the referenced VDM as a short-term construct, owing to the application context. In other words, the designer treated the given VDM as an “instantaneous model,” versus a “temporal model.” As a result, the analytical focus is on “error expansion” as compared with “error contraction.” More will be said later in this discussion about these two unique but interrelated concepts.
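Both routes to the short-term variation model can be confirmed with a brief Python sketch; the values are taken directly from the case data above (T = 100, USL = 130, ZST = 4.0, M = .25).

    # Two equivalent routes to the SVM for CTQ4, per the case data above.

    T, USL = 100.0, 130.0

    # Route 1: from the targeted short-term capability
    Z_ST = 4.0
    sigma_A_from_Z = (USL - T) / Z_ST           # = 7.50

    # Route 2: from the 25 percent design margin (Eq. 4.2.1)
    M = 0.25
    sigma_A_from_M = (1 - M) * (USL - T) / 3    # = 7.50

    print(sigma_A_from_Z, sigma_A_from_M)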


Figure 4.2.1

Theoretical Short-Term Performance Capability of CTQ4

During a brief discussion with the manufacturing manager, the analyst

discovered that, by convention, a design-process qualification procedure

(DPQ) would be executed for all of the utilitarian-centric CTQs. She also

learned that the qualification procedure would be conducted on the basis of n

= 30 random samples.50 Such a sample size is often recognized as “ideal” in

50 During the execution of a design-process qualification (DPQ) it is often not possible to obtain the

measurements by way of a random sampling plan. For example, a newly developed process might be brought “on-line” just long enough to get it qualified for full-scale production. Given this, it is likely that only a few units of product will be produced (owing to the preparation and execution costs associated with a short production run). As yet another example, the candidate process might currently exist (and have a performance history), but has been selected to produce a newly developed product (with no production history). In either case, there is no “steady stream” of measurements from which to “randomly sample”. When such constraints are at hand, such as presented in our application scenario, the performance measurements are usually taken in a sequential fashion. Owing to this, one must often assume that the resulting measurements are independent and random (for purposes of statistical analysis). The validity of this assumption can be somewhat substantiated by autocorrelation (testing the measurements at various lag conditions). If the resulting correlation coefficients are statistically insignificant (for the first several lag conditions) it is reasonable to assume that the measurements are random-even though they were sequentially obtained. Given the general absence of correlation, it would then be rational to assert their random nature and independence. In addition, we are also often forced to assume that the measurements are normally distributed. Employing a simple normal probability plot can test this assumption (to a reasonable extent). In essence, we are often forced to employ a sequential


terms of the tradeoffs between statistical precision and sampling costs. The

point of diminishing returns on sample size can be visually understood by

referencing figure 4.2.2. This illustration clearly shows why many

statisticians and quality practitioners attempt to enforce this rule of thumb. 51

Following this sampling, the measurements would be recorded and

subsequently analyzed at some point prior to the design’s release for full-scale

production. Unsurprisingly, the DPQ would be invoked so as to verify that

the selected process could (in reality) fully satisfy the SVM expectation.52

sampling strategy and then subsequently utilize a family of statistical tools that assumes the data is normal, independent, and random. Fortunately, many statistical procedures (such as those often used during a DPQ) are relatively robust to moderate violations of the aforementioned assumptions.

51 Statistically speaking, we recognize that the given sample size (n = 30) constitutes a point of “diminishing return” with respect to “precision of estimate.” To better understand this point, let us consider the standard error of the mean. This particular statistic is defined by the quantity σ/sqrt(n). Now suppose we were to plot this quantity for various cases of n under the condition σ = 1.0. Such a plot would reveal several break points, or “points of diminishing return.” The first point occurs at about n = 5, the second point at around n = 30, and the third point in the proximity of n = 100. Thus, as n is incrementally increased, the quantity σ/sqrt(n) decreases disproportionately. This is one of the reasons statisticians often say that n = 30 is the ideal sample size – it represents a rational tradeoff between statistical precision and sampling costs.

52 The reader should recognize that many manufacturing organizations “buy off” on a process (during the design phase) on the basis of only a few samples. In fact, some execute a practice called “first article inspection.” From a statistical point of view, this is a very spurious practice, since it is virtually impossible to construct meaningful (useful) confidence intervals with only a few degrees of freedom. Without proof, there are valid reasons for supporting the case of n = 30. For the purpose of process qualification, it may be necessary to form g rational subgroups consisting of n observations to realize ng = 30 samples. Rational subgrouping is often employed to block sources of black noise. In essence, such a practice enables the benefits of a larger df, but minimizes the likelihood of “black noise contamination” in the final estimate of instantaneous capability.


Figure 4.2.2

The Point of Diminishing Return on Sample Size

4.3 Considering the implications

Since the given DPQ called for n = 30 samples, the analyst recognized

that she would only have df = n – 1 = 30 – 1 = 29 degrees of freedom with

which to statistically verify the SVM during the course of process evaluation.

Given this, she reasoned to herself that such a sampling constraint might

produce a biased estimate of the “true” short-term process standard deviation.

Owing to this phenomenon, there would exist some statistical likelihood of

rejecting a candidate process that might have otherwise been fully qualified.

As we shall come to understand, the implications of this are quite profound.

For example, if the true short-term process standard deviation of a

particular “candidate process” is in reality 7.50, it is quite likely that a limited

sampling of n = 30 will reveal a biased short-term standard deviation. This is

to say that any given estimate of instantaneous reproducibility could provide a

short-term standard deviation greater than 7.50, owing to a pragmatic

constraint on the degrees of freedom made available for the process


evaluation. Of course, any given random sampling of n = 30 could just as

well provide an estimate less than 7.50.

If the resulting estimate proves to be greater than 7.50, management

would falsely reject a process that, in reality, would have otherwise been fully

qualified. In statistical lingo, the probability of such a decision error is called

“alpha risk” and is understandably designated as α. Of interest, it is also called

“producer’s risk.” From this perspective, we can treat the alpha state as if it

were a worst-case condition, but of a statistical nature. In this sense, the

worst-case condition is statistically defined by the given degrees of freedom

and the selected level of decision risk.

On the other hand, if the random sample of n = 30 revealed a short-term

standard deviation less than 7.50, management would falsely believe they

adopted a supremely qualified process when, in reality, it would prove to be

only marginal (but yet acceptable). Obviously, the alpha state is of more

concern to the project manager (and analyst) since this particular type of

decision risk is far more likely to produce negative consequences (with

respect to the realization of value entitlement). Owing to this, the alpha state

is often referred to as the “statistical worst-case condition,” or SWC in

abbreviated form.53

4.4 Constructing the worst case

Because of such reasoning, the analyst decided to compute an upper

confidence limit for the SVM so as to account for random sampling error

(under the constraint df = 29 and α = .005).54 In other words, the analyst

53 The idea of an alpha state can be applied to any type of sampling distribution (empirical or theoretical) or,

more specifically, to any or all of the parameters associated with a sampling distribution (such as measures of central tendency and variability, or mean and standard deviation, respectively). Owing to this, a “statistical worst-case distribution” is also called the “alpha sampling distribution.” As such, it constitutes a producibility risk condition that prescribes the “statistical state of affairs” in the presence of random sampling error.

54 By convention, alpha risk is often established at the .05 level. Of course, this translates to 95 percent decision confidence. Since the statistical analysis of a design almost always involves multiple decisions, a higher level of decision confidence is often required to compensate for the degradation of confidence when considering the cross-multiplication of decision probabilities. Therefore, we impose a 99.5 percent level of confidence (as a convention) for purposes of producibility analysis. This substantially improves the aggregate confidence when considering multiple decisions. For example, a .95 level of decision


wanted to compute the upper confidence bound of the SVM, but with 100(1 -

α) = 100(1 - .005) = 99.5 percent certainty, given the limitation of df = 29.

Knowing this, she set about estimating the upper confidence limit, also

known as the UCL. This was accomplished with assistance of the chi-square

distribution. Following the computations, she presented her result as:

σB = σA √( (n – 1) / χ² ) = 7.50 √( 29 / 13.12 ) = 11.15

Eq.( 4.4.1 )

Thus, she was able to compute (estimate) the worst-case condition of the

short-term standard deviation model (SVM). The analyst then concluded that

if the DPQ team isolated a process that exhibited a short-term standard

deviation of 7.50 (on the basis of n = 30 random samples), it would be

possible that the “true” standard deviation could be as large as 11.15 (worst-

case condition). Obviously, if such a magnitude of variation eventually

proved to be true (because of random error in the qualification sample), there

would be a practical (as well as statistical) discontinuity between the analyst’s

reproducibility expectation and reality.55

confidence applied to 10 decisions provides an aggregate (joint) confidence of only 60 percent, whereas a .995 level reveals the joint certainty to be about 95 percent. Of course, it is fully recognized that some circumstances might require a more stringent alpha while others might tolerate a more relaxed criterion.

55 For the reader’s edification, it should be pointed out that the general combination α = .005 and df = 29 is arguably the most “generally optimal” set of such decision criteria to employ when conducting a design-process qualification (DPQ). When considering this, and other factors, the given combination offers a standard convention from which to initiate the practice of design for six sigma (DFSS). Of course, this particular convention should give way to other combinations as DFSS becomes entrenched in an organization. Experience will naturally show the path to more optimal conditions of α and df, owing to the unique circumstances associated with each application – sampling costs, destructive testing, production volume, background knowledge, internal procedures, customer requirements and so on. Owing to the consideration of these and many other factors, the combination α = .005 and df = 29 was employed by this researcher to originally establish the first six sigma DPQ and subsequently validate the 1.5σ shift factor. Again, it must be recognized that this particular set of decision criteria was judiciously selected and practiced by this researcher for an array of theoretical and pragmatic reasons, many of which are far beyond the scope and intent of this book. Consequently, we must recognize and always bear in mind that the 1.5σ shift factor is a dynamic construct of a theoretical nature. As such, it is only retained in static form when the aforementioned decision criteria are considered and practiced as a convention.
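The chi-square computation behind Eq. (4.4.1) can be sketched in Python as follows (scipy is assumed to be available; the quantile and the resulting bound agree with the values quoted above).

    from math import sqrt
    from scipy.stats import chi2

    # Statistical worst-case (upper confidence bound) of the SVM, per Eq. 4.4.1.

    sigma_A = 7.50    # postulated short-term standard deviation (SVM)
    n = 30            # DPQ sample size
    alpha = 0.005     # decision risk

    chi2_lower = chi2.ppf(alpha, n - 1)              # ~13.12 for df = 29
    sigma_B = sigma_A * sqrt((n - 1) / chi2_lower)   # ~11.15

    print(chi2_lower, sigma_B)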


Given the level of sampling risk on the upper end (α = .005), she could

nominally expect such an adverse discrepancy (or worse) in one out of every

200 samplings. On the flip side, the analyst would have 99.5 percent certainty

that the true standard deviation would eventually prove to be less than the

worst-case condition of 11.15, given a “qualified” standard deviation of 7.50,

as estimated on the basis of n = 30 random samples.

More poignantly, there would exist a 50/50 nominal chance that the true

standard deviation would eventually prove larger than what the DPQ estimate

would suggest.56 Obviously, the odds would not be in the analyst’s favor if

such a sampling criterion was set forth (and subsequently obtained during the

qualification trial). In other words, the likelihood of being satisfied with such

a process would not be biased in the analyst’s favor. At the risk of

redundancy, this is to say that an estimate of σ = 7.50 (at the time of

qualification) could ultimately lead to the adoption of an incapable process.

4.5 Exploring the consequences

To better understand the import of the latter discussion, let us reason

from the other side of the coin. In other words, let us assume that the true

standard deviation of CTQ4 was in reality 11.15, but the sampling standard

deviation was found to be 7.50 upon execution of the DPQ. Of course, such

an estimate of the short-term standard deviation would be due to a statistically

biased sample at the time of qualification. Given this, the project manager

could unwittingly adopt a highly biased “4σ” process.

Based on such a biased DPQ result, it is reasonable to assert that the true

short-term standard deviation would eventually stabilize at its genuine value

of 11.15 (across an infinite number of observations). Of course, such an

inflationary effect would become quite evident during the course of

56 Based on an expected short-term standard deviation of 7.50, and given that α = .50 and df = 29, the 50th

percentile of the chi-square distribution reveals a theoretical short-term standard deviation of 7.68. This small discrepancy is attributable to the fact that df = 29. However, as the degrees of freedom approaches infinity, the consequential discrepancy would necessarily approach zero. Thus, we recognize the 50/50 odds that the true short-term standard deviation will be greater (or less) than 7.50.


cumulative sampling. In other words, the progressive discovery and

integration of other sources of random sampling error during ongoing

production would naturally tend to inflate the short-term standard deviation

until it stabilized at its true value.57 Without a reasonable doubt, the veritable

capability would ultimately prove to be

ZUSL = (USL – T) / σB = (130 – 100) / 11.15 = 2.69 ,

Eq.( 4.5.1 )

owing to the unknown presence of random sampling error at the time of

qualification. Of course, such a level of reproducibility would translate to a

capability ratio of Cp = (USL - LSL)/(6σ) = Z/3 = 2.69/3 = .897. Based on this,

it could be said that 1/.897 = 1.115, or about 112 percent of the design

bandwidth (USL - LSL) would be consumed by the process bandwidth

(±3σST).58 Thus, we recognize that, by establishing a short-term standard

57 Following a DPQ, it is conventional practice to continually monitor the instantaneous and longitudinal

capability of CTQs. For a continuous, high-volume performance characteristic (such as CTQ4), this task can be effectively accomplished by way of a statistical process control device called an “Xbar and S chart.” The use of such an oversight mechanism forces the white noise (extraneous variations) to appear in the S chart while the influence of black noise (nonrandom variations) is forced to emerge in the Xbar chart. The general use of such analytical tools requires the implementation of a sampling technique called “rational subgrouping.” Essentially, this method of sampling forces the random sampling errors to be retained within groups, while the nonrandom sources of error is forced to develop between groups. By virtue of the merits associated with rational subgrouping, one-way analysis of variance can be naturally employed to interrogate the root-mean-square of the error term (within-group standard deviation). As would be intuitively apparent to the seasoned practitioner, the within-group component is a direct measure of instantaneous reproducibility. As a natural consequence, the various components of error can be subsequently utilized to formulate certain other indices of short-term capability (ZST, Cp, Ppk and so on). To achieve this aim, we employ the general model SST = SSW + SSB, where SS is the sum of squares, T is the total estimate, W is the within group estimate and B is the between group estimate. In this form, the SSW term can be continually updated to obtain an ongoing estimate of the background noise (random error) without the contaminating influence of nonrandom sources of error, as this type of error is continually integrated into the SSB term. As a side benefit of this, the SSB term can be employed to establish the “equivalent mean shift.” Of course, all of these assertions can be directly verified by mathematical examination or by way of a simple Monte Carlo simulation. More will be said about this later on in the discussion.

58 From this perspective, it should be evident that a process capability ratio of Cp = 2.00 defines a six sigma level of performance. For this level of capability, only 50 percent of the design bandwidth is consumed by the process bandwidth. Of course, the remaining 50 percent of the design bandwidth is dedicated as “design margin.” Given this, it should be self-evident that a process capability ratio of Cp = 2.00 corresponds to a 50 percent design margin (M = .50). Here again, the criterion of “six sigma” would be fully satisfied.


deviation of 7.50 as a process qualification target, random sampling error will

dictate that the true capability could be worse than ±4.0σ, under the condition

n = 30 and α = .005. Of course, the same may be said when reasoning on the

opposite side of this discussion.
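The capability figures quoted in this passage can be reproduced directly with a short sketch; the case values T = 100, USL = 130, LSL = 70 and the worst-case standard deviation of 11.15 are used.

    # Capability of CTQ4 if the true short-term standard deviation later
    # proves to be the worst-case value of 11.15 (Eq. 4.5.1 and related figures).

    T, USL, LSL = 100.0, 130.0, 70.0
    sigma_B = 11.15

    Z_usl = (USL - T) / sigma_B         # ~2.69
    Cp = (USL - LSL) / (6 * sigma_B)    # ~.897
    consumed = 1 / Cp                   # ~1.12, i.e. about 112 percent

    print(Z_usl, Cp, consumed)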

Flipping the coin back to its original side, let us return to the unique

circumstances of our case example. Remember, at this point in the analysis,

the SVM was established at 7.50. Also recall that n = 30 measurements are to

be taken during the DPQ. From this point forward, we will say that the DPQ

revealed a short-term standard deviation of 7.50. As a consequence, it would

seemingly appear that the sampled variation was “in tune” with the theoretical

design expectation.

4.6 Visualizing the distributions

For purposes of visualization, the nominal design distribution (Case A)

and its corresponding worst-case sampling distribution (Case B) are presented

and contrasted in figure 4.6.1. By careful examination of this illustration, the

reader can better visualize the differences between these two cases and reason

through the implications. Doing this will produce a better appreciation for the

potential impact of random sampling error under the constrained condition n =

30 and 1 – α = .995.


Figure 4.6.1

Nominal Design Distribution for CTQ4 (Case A) Contrasted to its Worst-Case Sampling Distribution (Case B).


From this figure it is quite easy to see that the “inflationary effect” of

random sampling error (case B) can be quite profound when reasoning from

the classical engineering mindset of worst-case analysis.59 This is particularly

apparent when the nominal expectation of performance capability (case A) is

contrasted to its statistical worst-case expectation (case B).

The reader will recall that in the field of quality engineering (as well as

other technical fields) it is conventional practice to constrain the idea of unity

between the ± 3.0σ limits of a distribution. Such an understanding of unity

conventionally applies when researching process capability. When

considering the limits of unity related to case B, such a level of inflation (c =

11.15/7.50 = 1.487) is probabilistically equivalent to shifting the theoretical

design distribution (case A) by µA = T ± 1.46σA, or simply 1.5σ. In this

particular case, the equivalent condition is exemplified by cases A1 and A2,

respectively. When considering the upper limit of unity for case A2 (given as

+3σA2), notice the exact coincidence to the upper limit of unity for case B

(given as +3σB). This is also true when comparing case B to case A1, but on

the left-hand side of the distribution.

59 The reader is admonished to recognize that the general practice of worst-case analysis is, in and of itself,

generally not a good thing. Such a position is rational because the statistical probability of such events often proves to be quite remote. For example, if the probability of failure for a single characteristic is 10 percent and there are only five such interdependent characteristics, the likelihood of worst-case would be .10^5, or .00001. Of course, this translates to one in 100,000. Obviously, this probabilistic circumstance would imply an “overly conservative” design. Although the aim of worst-case design is to secure a “guarantee” of conformance to performance standards, it usually does nothing more than suboptimize the total value chain. However, when descriptive and inferential statistics are integrated into the general practice of worst-case analysis, we are able to scientifically secure a “conditional guarantee” of sorts, but without absorbing any of the principal drawbacks. In other words, the application of modern statistics to worst-case analysis provides us a distinct opportunity to significantly enhance the ways and means by which the producibility of a product design can be assessed and subsequently optimized. From this perspective, it is easy to understand why this six sigma practitioner advocates the use of applied statistics when establishing performance specifications (nominal values and tolerances) during the course of product and process design.


5.0 Examining the Shift

5.1 Establishing the equality

Although we can easily visualize the equivalent mean shift with the aid

of an illustration, such as that presented in figure 4.6.1, it is often more

convenient to formulate a mathematical understanding of the situation. To

this end, we seek to mathematically equate the statistical cases of A2 and B,

but only at their respective upper limits of unity. In order to develop an

equivalent mean shift, the analyst decided that she should begin by

establishing a fundamental equality. In other words, the analyst recognized

that a “shifted distribution” must equate to an “inflated distribution.” Given

this, she offered such equality in the form

T + 3σA + Zshift σA = T + 3σB .

Eq. ( 5.1.1 )

Applying this equality to the case data, the analyst reaffirmed that T =

100, σA = 7.50, Zshift = 1.46, σB = 11.15. In addition, she used the conventional

value of Z = 3.0 as a constant to prescribe the upper limit of unity. By

substitution, the analyst computed the equality as 100 + (3 * 7.5) + (1.46 *

7.5) = 100 + (3 * 11.15) = 133.45.

Recognizing the equality of these two quantities, the analyst was able to

successfully establish that the upper limit of unity related to case A exactly

coincided with the worst-case condition given by case B. Simply aligning the

elements of Eq. (5.1.1) with the corresponding elements provided in figure

4.6.1 provides even greater insight into the stated equality. To further such

insight, she determined that the standardized equivalent mean offset (Zshift)


could be isolated by simple algebraic rearrangement of Eq. (5.1.1). Doing so

provided her the solution:

Zshift = ( 3σB – 3σA ) / σA = ( 3 * 11.15 – 3 * 7.50 ) / 7.50 = 1.46 .

Eq. ( 5.1.2 )

Thus, she was able to recognize that the quantity Zshift describes the

relative differential between the mean of case A2 and that of case B, but

scaled by the nominal standard deviation associated with case A. Because of

the nature of Zshift, the analyst clearly understood that it could not be

technically referenced as a “mean shift” in the purest and most classical sense.

However, she did come to understand that it could be declared as an

“equivalent mean shift,” but only in a theoretical sense. From another

perspective, she recognized that the quantity Zshift provided her with yet

another benchmark from which to gain deeper insight into the “statistical

worst-case” condition of the design, but only with respect to the temporal

reproducibility of CTQ4 set in the context of a DPQ.

Owing to this line of reasoning, the analyst concluded that the quantity

Zshift is simply a compensatory static (stationary) off-set in the mean of a

theoretical performance distribution (TPD) reflecting the potential influence

of dynamic sampling error (of a random nature) that would otherwise inflate

the postulated short-term standard deviation of the TPD (at the time of

performance validation). In light of this understanding, the analyst noted to

herself that Zshift cannot be statistically described as a “true” standard normal

deviate, simply because its existence is fully dependent upon the chi-square

distribution (owing to the theoretical composition of σB).

Stated in more pragmatic terms, the analyst recognized that Zshift does

not infer a “naturally occurring” shift in a distribution mean (in an absolute or

classical sense). Rather, it is an “equivalent and compensatory shift”

employed to statistically emulate or otherwise account for the long-term


sampling uncertainties that could potentially cause an initial estimate of short-

term process capability to be unfavorably biased in its worst-case direction.
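A brief numerical check of the equality in Eq. (5.1.1), and of the isolation of Zshift in Eq. (5.1.2), may help cement the idea; the sketch below simply reuses the case data (T = 100, σA = 7.50, σB = 11.15).

    # Numerical check of Eq. 5.1.1 and Eq. 5.1.2 using the case data.

    T = 100.0
    sigma_A, sigma_B = 7.50, 11.15

    Z_shift = (3 * sigma_B - 3 * sigma_A) / sigma_A   # ~1.46

    lhs = T + 3 * sigma_A + Z_shift * sigma_A         # shifted distribution
    rhs = T + 3 * sigma_B                             # inflated distribution

    print(Z_shift, lhs, rhs)                          # both sides ~133.45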

5.2 Developing the correction

As an addendum to the analyst’s reasoning, she made brief recollection

of the dynamic correction called “c.” As presented earlier on in the case, the

analyst reasoned that c is a compensatory measure used to adjust or otherwise

correct for the influence of dynamic random sampling error over a protracted

period of time or many cycles of operation. Given this, she reconciled that σB

= σAc. By simple algebraic rearrangement, the analyst was able to solve for c

and presented her results as

c = σB / σA = 11.15 / 7.50 = 1.488 ≅ 1.49 ,

Eq. ( 5.2.1 )

where the resultant 1.49 indicated that the SVM (σA = 7.50) should be

artificially inflated or “expanded” to about 149 percent of its value to account for the

potential effect of statistical worst-case sampling error.

At this point, the analyst decided to transform σA to the standard normal

case. Thus, she was able to declare that σA = 1.0 and thereby constitute a unit-

less quantity. For this case, she then rationalized that c² = ( n – 1 ) / χ², where

χ² is the chi-square value corresponding to the selected α and df. Thus, the

analyst was able to establish the theoretical connection between c and the chi-

square distribution. Given this, she then formulated Zshift from a rather unique

perspective by providing the standardized equivalent mean shift in the form:

Zshift = 3( c – 1 ) = 3( 1.488 – 1 ) = 1.46 .

Eq. ( 5.2.2 )


Based on the case data, the analyst discovered that the standardized

equivalent shift should be given as Zshift = 1.46, or approximately 1.5σA. She

also noted the same result when considering the left-hand side of Case B in

relation to A1. Hence, the analyst now better understood the theoretical basis

for the proverbial “1.5σ shift factor” commonly employed in six sigma work.

To a large extent, this answered her colleague’s initial question: “Where does

the 1.5σ shift factor come from – and why 1.5 versus some other magnitude?”

However, the analyst went on to reason that Zshift will vary somewhat,

depending on the selection of α and n. Later on in the day, she was informed

by another practitioner that such a combination of α and n is fairly typical

when conducting a six sigma DPQ. The more experienced practitioner

informed her that if the aggregate decision confidence (C) was to be

established such that C = 100(1 - α) = 95 percent, and there were k = 10

independent CTQs to be concurrently but independently qualified, then each

SVM should maintain a statistical confidence of C^(1/k) = .95^(1/10) = .995, or

about 99.5 percent. Hence, the rationale for the six sigma convention of

setting n = 30 and 1 - α = .995 for each CTQ during the course of a DPQ.
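Taken together, these relationships allow the shift factor to be written directly as a function of the DPQ decision criteria. The sketch below (scipy assumed) packages that dependence in a small helper, zshift_from_dpq, a name introduced here purely for illustration, and also echoes the aggregate-confidence reasoning behind the .995 per-CTQ convention.

    from math import sqrt
    from scipy.stats import chi2

    def zshift_from_dpq(n, alpha):
        """Equivalent mean offset implied by a DPQ with n samples and risk alpha."""
        c = sqrt((n - 1) / chi2.ppf(alpha, n - 1))   # expansion factor, c^2 = (n - 1)/chi-square
        return 3 * (c - 1)

    print(zshift_from_dpq(30, 0.005))   # ~1.46, i.e. roughly 1.5

    # per-CTQ confidence needed so that k = 10 concurrent qualifications
    # retain an aggregate confidence of 95 percent
    C, k = 0.95, 10
    print(C ** (1.0 / k))               # ~.995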

Following these insights, the analyst (and one of her colleagues)

reflected on their experiences and realized that a great many design

configurations (perhaps the vast majority) are predicated on the assumption

that each CTQ will possess a short-term capability of 4.0σST. They fully

recognized that such a level of capability is generally inferred by virtue of the

conventional reliance on 25 percent design margins. Because such a level of

capability is often targeted when a process is first brought into service, they

believed it was reasonable to assert that the long-term, statistically-based,

worst-case capability expectation (per opportunity) could be rationally

approximated as 4.0σST − 1.5σST = 2.5σLT. Naturally, they both understood

that such a level of capability translates to a long-term first-time yield

expectation of YFT = .99379, or 99.38 percent. As a consequence, they began

to discuss the implications of this for the design of their system.


5.3 Advancing the concepts

Of sidebar interest to the analyst, her colleague continued his dialogue

by reminding her that the ratio of two variances can often be fully described

by the F distribution. He then called upon the classical F distribution to define

yet another approach for establishing the equivalent static mean off-set (Zshift).

Without further consideration, he demonstrated that

c = σB / σA = √( 1 / F ) .

Eq. ( 5.3.1 )

As may be apparent, from Eq. 5.2.2 and Eq. 5.3.1 the analyst was able

to formulate the quantity Zshift = 3( sqrt(1 / F ) - 1 ). Using this particular

relationship as a backdrop, she then computed Zshift = 3( sqrt( 1/ .4525) - 1) =

1.46, or 1.50 in its rounded form. Of course, she referenced the F distribution

with the appropriate degrees of freedom. In this case, she utilized df = ( n – 1 ) = ( 30

– 1 ) = 29 degrees of freedom in the numerator term and declared infinite

degrees of freedom for the denominator term. In addition to this, she

referenced the F distribution with a decision confidence of C = (1 - α ) = ( 1 -

.005) = .995, or 99.5 percent. Given these criteria, the analyst discovered that

F = .4525.
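The F-distribution route can likewise be checked numerically. Because an F variable with an infinite denominator df reduces to a chi-square divided by its df, the sketch below (scipy assumed) computes the quantile both ways, using a very large denominator df as a stand-in for infinity.

    from math import sqrt
    from scipy.stats import chi2, f

    n, alpha = 30, 0.005

    F_via_chi2 = chi2.ppf(alpha, n - 1) / (n - 1)   # ~.4525
    F_via_f = f.ppf(alpha, n - 1, 10**6)            # essentially the same value

    Z_shift = 3 * (sqrt(1 / F_via_chi2) - 1)        # ~1.46
    print(F_via_chi2, F_via_f, Z_shift)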

5.4 Analyzing the system

At this point in our discussion, recall that the system was known to

contain Q = 160 independent critical-to-quality characteristics of a utilitarian

nature. Given this, the analyst reasoned that if each of the CTQs exhibited a

long-term, first-time yield expectation of YFT = 99.38 percent (also equivalent

to ZLT = 2.5), then the aggregate success rate at the system level could be

rationally projected as YSYS = YFT^Q = .99379^160 = .3691, or about 37 percent.

In other words, there would exist a 37 percent likelihood of realizing a system

without exceeding the specification limits related to any of the Q = 160 CTQs.


In essence, the long-term first-time yield projection of 37 percent

represented the statistical probability of realizing a system with zero utility-

related defects. Consequently, the analyst determined that the defects-per-

unit metric should be computed. Relying on certain theoretical properties of

the Poisson function, she knew that the long-term defects-per-unit could be

given as dpu = –ln(YSYS) = –ln(.3691) = .9966, or about 1.0. Of course, she

also recognized that this result (dpu = 1.0) assumed that all Q = 160

utilitarian-centric characteristics were postulated in their alpha state.

Without regard to the statistical worst-case condition, she again

judiciously interrogated the total system confidence. This time around, she

declared all Q = 160 critical-to-quality characteristics to be at their respective

nominal states of reproducibility (ZST = 4.0), and then estimated the system

throughput yield expectation. She presented the results of this analysis as YSYS = (YFT)^Q = .999968^160 = .9949, or about 99.5 percent. Given such a high level

of system confidence, she immediately recognized that the corresponding

defects-per-unit would be quite favorable. Again relying on the Poisson

function, she computed the system-level defect rate as dpuSYS = –ln( YSYS ) = -

ln(.995) = .005. Given this computational outcome, she reasoned that, in the

short-term, only one out of about every 200 electrical systems would contain a

CTQ that was not in conformance to its respective specification.
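The two system-level roll-ups discussed above can be confirmed with a few lines of arithmetic. The sketch below assumes Python as the computational vehicle and uses only the per-characteristic yields quoted in the text.

```python
# System roll-up for Q = 160 independent CTQs: rolled first-time yield and
# the corresponding Poisson defects-per-unit, dpu = -ln(Y).
from math import log

Q = 160
for y_ft in (0.99379, 0.999968):      # worst-case (2.5 sigma) and nominal (4.0 sigma) per-CTQ yields
    y_sys = y_ft ** Q                 # probability of a system with zero such defects
    dpu = -log(y_sys)                 # long-term defects per unit
    print(round(y_sys, 4), round(dpu, 3))
# Worst case: about 0.369 and dpu of about 1.0; nominal: about 0.995 and dpu of about 0.005.
```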

6.0 Validating the Shift

6.1 Conducting the simulation

To validate the statistical assertions presented thus far in our discussion

and highlight our case example, a simple Monte Carlo simulation was easily

designed and executed by this researcher. The simulation-based study was

undertaken to empirically demonstrate that the influence of random sampling

error (in and of itself) is often much larger and more profound than most

quality professionals believe it to be, simply because the nature of its


influence is felt across the full bandwidth of unity, not just in the dispersion

parameter called the “standard deviation.”

In other words, the random error assignable to a single sampling

standard deviation must be multiplied by 3 when estimating any of the

common indices of process capability (e.g., ZUSL), owing to the fact that unity

is declared at the ±3.0σ limits of a distribution. Due to such an accumulation

of error across the total bandwidth of unity, the resulting long-term

inflationary effect is statistically equivalent to bilaterally shifting the short-

term sampling distribution by approximately 1.5σST.

In this instance, several thousand cases (each consisting of n = 30

randomly selected observations) were compiled under the generating

condition µ = 100, SVM = σST = 7.50. For each sampling case (subgroup),

the standard deviation was estimated and made relative to the upper semi-

tolerance zone of the performance specification (USL - T) by expressing the

ratio in the form ZUSL. In turn, each unique estimate of short-term capability

(ZUSL) was then contrasted to the theoretical design expectation ZUSL = 4.0,

therein noting the simple difference as Zshift. The result of this author’s Monte

Carlo simulation is presented in figure 6.1.1.


Figure 6.1.1

Results of the Monte Carlo Simulation
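The exact construction behind figure 6.1.1 is not reproduced here. The sketch below is offered in the same spirit only, to show how widely subgroup capability estimates scatter under nothing but random sampling error; the case count and random seed are arbitrary choices.

```python
# Draw many subgroups of n = 30 from N(100, 7.5) and estimate ZUSL for each,
# using the upper semi-tolerance (USL - T) = 30 from the case example.
import numpy as np

rng = np.random.default_rng(7)
mu, sigma_st, usl, n, cases = 100.0, 7.5, 130.0, 30, 5000

s = rng.normal(mu, sigma_st, (cases, n)).std(axis=1, ddof=1)   # subgroup standard deviations
z_usl = (usl - mu) / s                                         # estimated short-term capability
print(np.round(np.percentile(z_usl, [0.5, 50, 99.5]), 2))
# Roughly 3.0, 4.0 and 5.9: sampling error alone moves the apparent capability
# of a 4-sigma process by more than a full sigma in either direction.
```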

Not surprisingly, a rudimentary analysis of the Monte Carlo simulation

confirmed this author’s theoretical assertion – the most central condition for

the Zshift metric should be given as approximately 1.5σST. Again, the reader is

reminded that this particular magnitude of equivalent shift is a theoretical

construct based on the influence of random sampling error over a great many

sampling opportunities.

Owing to the equations previously set forth, and given the results of the

aforementioned simulation, this researcher and long-time practitioner strongly

asserts the following points. First, it is scientifically rational and operationally

prudent to contend that a ±4σST model of instantaneous reproducibility can be

meaningfully shifted off its target condition (in a worst-case direction) on the

order of ±1.50σST so as to analytically consider or otherwise compensate for

the influence of long-term random effects when examining or attempting to

optimize the producibility of a design. Second, such a corrective device


should be methodically applied when attempting to establish a short-term

qualification standard deviation during the course of a DPQ.

It should also be recognized that the exact magnitude of compensation

(contraction) will vary somewhat depending on the selection of α and df.

Third, the compensatory mean offset provides an excellent way to assess the

short-term and long-term reproducibility of a new design prior to its release

for full-scale production, or for conscientiously trouble-shooting an existing

design. Thus, a designer can rationally and judiciously study the

instantaneous and temporal reproducibility of a design without many of the

intellectual encumbrances often associated with direct application of

mathematical statistics.

In further considering our case example, we can say that if the

theoretical short-term standard deviation (SVM = 7.50) was to be utilized as a

decision threshold for process adoption (under the constraint n = 30), the

designer would have at least 99.5 percent confidence that the adopted process

(based on the alpha sampling distribution) would ultimately prove to be less

than 11.15 – thereby offering a potential violation of the 25 percent bilateral

safety margin. Expressed another way, it could be said that, if the “buy-off”

sampling distribution revealed a short-term standard deviation of 7.50, and the

candidate process was subsequently adopted on the basis of this result, there

would be 99.5 percent confidence that the terminal capability would prove to

be greater than 2.7σ. Obviously, such a level of "worst-case" capability is not

very appealing.
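A minimal sketch of the confidence-bound arithmetic behind the 11.15 and 2.7σ figures follows; scipy is assumed only for the chi-square quantile.

```python
# Upper 99.5 percent confidence bound on sigma given a buy-off sample of
# s = 7.50 with n = 30, and the worst-case capability it implies.
from math import sqrt
from scipy.stats import chi2

n, alpha = 30, 0.005
s_buyoff, semi_tol = 7.50, 130.0 - 100.0
sigma_upper = s_buyoff * sqrt((n - 1) / chi2.ppf(alpha, n - 1))   # -> about 11.15
z_worst = semi_tol / sigma_upper                                  # -> about 2.69, i.e., 2.7 sigma
print(round(sigma_upper, 2), round(z_worst, 2))
```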

6.2 Generalizing the results

As we have already stated, the expansion (inflation) factor c describes

the statistical impact of worst-case sampling error during the course of a DPQ.


Based on the example design qualification set forth in our case example, we

demonstrated that c = 11.15 / 7.5 ≅ 1.49, such that Zshift = 3( c - 1 ) ≅ 1.46. As our example calculations

revealed, such a magnitude of long-term expansion is equivalent to bi-

directionally “shifting” the mean of a short-term theoretical process

distribution by approximately ±1.5σST from its natural center. Given such an

equivalent shift in a short-term theoretical process distribution, the resulting

limits of unity will exactly coincide with the corresponding limits of the

inflated long-term sampling distribution, given that the six sigma design

qualification guidelines were followed.

Given these discussions, the reader should now be on familiar terms

with the idea that the 1.5σ shift factor (as associated with the practice of six

sigma) is not a process centering issue, although it can be applied in this

context in a highly restricted sense. In a theoretical sense, it is very much

intertwined with the mechanics of certain statistical process control methods;

however, it is not a leading indicator or measure of how much a process will

shift and drift over time due to assignable causes. It is simply a mathematical

reconstruction of the natural but random error associated with a theoretical

performance distribution.

The reconfigured form of such sampling error (1.5σ shift) helps the

uninformed practitioner better comprehend the consequential relationship

between a four sigma theoretical design model and its corresponding alpha

sampling distribution in the context of a DPQ. Use of the 1.5σ shift factor is

an effective means to assure that the influence of long-term random sampling

error is accounted for when assigning tolerances and specifying processes.

We should also understand that the 1.5σ shift factor has many other six sigma

applications where producibility analysis, risk management and benchmarking

are concerned.


6.3 Pondering the issues

Statistically speaking, the 1.5σ shift factor represents the extent to which

a theoretical ±4σ design model (based on infinite degrees of freedom) should

be “shifted” from the target value of its performance specification in order to

study the potential influence of long-term random effects. In this sense, it

provides an equivalent first-order description of the expected “worst-case”

sampling distribution under the condition df = 29 and 1 - α = .995.

Again, the general magnitude of expected random error (across many

periods of rational sampling) is sufficient to support an equivalent 1.5σ shift

in the theoretical design model, when considering the upper confidence limit

of the SVM. As previously discussed, this statement is fully constrained to

the case n = 30 and α = .005, and is generally limited to the purpose of design

qualification and producibility analysis, as well as several other common

types of engineering assessments.

As with any compensatory measure, the given correction factor can be

inappropriately applied or otherwise improperly utilized. However, it is fairly

robust and can be confidently employed when a) there are many CTQs being

simultaneously considered, b) there is an absence of empirical data from

which to estimate the true long-term capability expectation and/or c) the need

for a high level of statistical precision is questionable or unnecessary.

7.0 Contracting the Error

7.1 Conducting the analysis

To accomplish the overall design goal related to our case example, we

may now consider working the problem “in reverse.” This is to say the

analyst would start with the initial product performance expectations related to

CTQ4 and then “reverse compute” the target short-term sampling standard

deviation that would be necessary to “qualify” a candidate process with 99.5

percent certainty, but under the limitation of n = 30. In this manner, she

would be able to account for the potential influence of sampling bias by

“deflating” or otherwise contracting the SVM (short-term variation model).


In an effort to rationally establish a contracted short-term standard

deviation, the analyst once again called upon the merits of the chi-square

distribution. With this distribution, she could “reverse compute” a short-term

sampling standard deviation that could be used as a qualification criterion

during the course of a DPQ. After some careful study, the analyst formulated

an approach and presented her results as

σ̂C = σ̂A * sqrt( χ²1-α / ( n - 1 ) ) = 7.5 * sqrt( 13.12 / ( 30 - 1 ) ) ≅ 5.045 .

Eq.( 7.1.1 )

She then informed one of her colleagues that the given value of 5.045

represents the criterion short-term sampling standard deviation that must be

acquired by the production manager upon execution of the DPQ in order for

the related process to “qualify” as a viable candidate for full-scale production.

By isolating a process with a short-term sampling standard deviation of 5.045

or less (at the time of the DPQ), the production manager would be at least

99.5 percent certain that the true short-term standard deviation would not be

greater than σA = 7.50, given df = n – 1 = 30 –1 = 29 at the time of sampling.

As should be evident, such a “target” standard deviation will virtually ensure

that the design engineer’s minimum producibility expectation will be met.
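A short computational sketch of this “reverse” qualification criterion follows; it simply evaluates the relation presented as Eq. (7.1.1), with scipy assumed for the chi-square quantile.

```python
# Contract the design standard deviation (SVM = 7.50) so that a qualifying
# sample of n = 30 gives 99.5 percent assurance on the true sigma.
from math import sqrt
from scipy.stats import chi2

n, alpha, sigma_design = 30, 0.005, 7.50
sigma_crit = sigma_design * sqrt(chi2.ppf(alpha, n - 1) / (n - 1))
print(round(sigma_crit, 3))   # -> 5.045, the criterion (qualification) standard deviation
```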

Merging the statistical mechanics of a confidence interval with the idea

of design margins and specification limits, the analyst computed the same

result by interrogating the relation

σ̂C = ( ( 1 - M )( USL - T ) / 3 ) * sqrt( χ²1-α / ( n - 1 ) ) = ( ( 1 - .25 )( 130 - 100 ) / 3 ) * sqrt( 13.12 / ( 30 - 1 ) ) ≅ 5.045 .

Eq.( 7.1.2 )


After some algebraic rearrangement of this equation, she established a

more convenient and general model in the form

σ̂C = sqrt( χ²1-α ( 1 - M )² ( USL - T )² / ( 9( n - 1 ) ) ) = sqrt( 13.12 * ( 1 - .25 )² * ( 130 - 100 )² / ( 9( 30 - 1 ) ) ) ≅ 5.045 .

Eq.( 7.1.3 )

Based on the model constraint µ = T, it was then evident the production

manager would have to isolate a prospective (candidate) process with a short-

term sampling standard deviation of 5.045 or less in order to qualify as a

viable process (in terms of reproducibility). This represented a 32.8 percent

reduction in the theoretical design expectation of SVM = 7.50.

7.2 Drawing the conclusions

Since it was resolved that the criterion short-term sampling standard

deviation of 5.045 could support a ±4σLT level of reproducibility with at least

99.5 percent certainty under the assumption of statistical worst-case sampling

conditions, she then concluded that the short-term capability would

necessarily need to be

ZUSL = ( USL - T ) / σ̂C = ( 130 - 100 ) / 5.045 = 5.947 ≅ 6.0 .

Eq.( 7.2.1 )

Following this calculation, the analyst suddenly realized that, if a 4σLT

(long-term) level of capability is to be statistically assured (with a rational

level of confidence) over many cycles of production, a 6σST (short-term) level


of capability must be “targeted” as the prerequisite condition for process

adoption (during the course of a DPQ). Thus, she indirectly answered the

pervasive and sometimes perplexing question: “Why did Motorola adopt a

six sigma level of capability as a performance standard and not some other

level?”

Based on the outcome of Eq. 7.2.1, the analyst decided to express the

given level of capability in the form of Cp. That particular estimate was

presented as

Cp = ( USL - T ) / ( 3σ̂C ) = ( 130 - 100 ) / ( 3 * 5.045 ) = 1.982 ≅ 2.0 .

Eq.( 7.2.2 )

Since the analyst’s goal was to net Cp = 1.33 with at least 99.5 percent

certainty, she determined that a process capability of Cp = 2.0 or greater would

have to be discovered upon execution of the DPQ (based on df = 29). Of

course, this was to say that the ±3.0σST limits of such a process would

naturally consume only one-half of the specification bandwidth (tolerance

zone), owing to the understanding that 1 / Cp = .50, or 50 percent. Of course,

it is fully recognized that this quantity is just another form of design margin.

If such a process could be isolated and subsequently qualified, there would be

a very small risk (α = .005) of inappropriately accepting that process as a

candidate for adoption and implementation. Figure 7.2.1 graphically

summarizes all of the fundamental assertions related to our discussion thus

far.
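The capability figures above follow directly from the criterion standard deviation, as the brief sketch below illustrates; only plain arithmetic is involved.

```python
# Capability implied by the criterion standard deviation of 5.045.
usl, target, sigma_crit = 130.0, 100.0, 5.045

z_usl = (usl - target) / sigma_crit        # short-term capability to be targeted
cp = (usl - target) / (3.0 * sigma_crit)   # equivalent Cp for the symmetric tolerance
print(round(z_usl, 2), round(cp, 2))       # -> 5.95 and 1.98, i.e., about 6 sigma and Cp of about 2.0
```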


Figure 7.2.1

A Comparison of the Theoretical Design Distribution (Case A) and the Corresponding Alpha Sampling Distribution (Case C)


Here again, it is relatively easy to visualize the driving need for an initial

capability of ±6σST when the “statistical worst-case” goal is ±4σLT. As

previously stated, this portion of the book has sufficiently answered the two

key questions that have been puzzling the world of quality for many years.

Almost without saying, those two central questions are: “Why six sigma and

not some other value?” and “Where does the 1.5σ shift come from?”

As the reader should now understand, a short-term sampling capability

of ±6σST must be realized (during qualification) in order to be at least 99.5

percent confident that the net effect of random sampling error (as accumulated

over an extended period of time) will not compromise or otherwise degrade

the idealized design capability of ±4σLT, given that the DPQ was founded on n

= 30 random samples collected over a relatively short period of time.

Extending this reasoning, we naturally come to understand that the 1.5σST shift

factor is merely a linear offset in the idealized process center that compensates

for “statistical worst-case” random sampling error over the long haul, given

that certain statistical criteria have been satisfied. In this sense, the 1.5σ shift

factor represents an equivalent short-term statistical condition that is

otherwise manifested in the form of long-term “inflated” standard deviation.

7.3 Verifying the conclusions

Confirmation of these assertions was realized by the construction of a

Monte Carlo simulation under the condition of a random normal distribution

defined by the parameters µ = 100 and σST = 5.045. The reader will recognize

the specified standard deviation as the “target” index of variation discussed in

our case example. For this case, g = 3,000 subgroups were randomly

assembled, each consisting of n = 30 random observations. Following this,

the capability of each subgroup was computed and subsequently graphed for

the purpose of visual and statistical interrogation. For the reader’s

convenience, the results of this simulation have been summarized in the form

of a histogram and located in figure 7.3.1.


Figure 7.3.1

Histogram of Process Capabilities Related to the Monte Carlo Simulation

From this figure, it is apparent that the general range of capability

extends from approximately +4σST to a little over +10σST, while maintaining a

central condition of about +6σST. Focusing our attention on the minimum

capability (left side of the distribution), it is virtually certain that the analyst’s

original long-term producibility goal of ±4σLT can be realized if a ±6σST

process capability can be isolated – even in light of worst-case sampling error.
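A minimal re-creation of this verification exercise is sketched below; the subgroup structure matches the text, while the random seed is an arbitrary choice.

```python
# g = 3000 subgroups of n = 30 drawn from N(100, 5.045); capability of each
# subgroup estimated against the upper semi-tolerance of 30.
import numpy as np

rng = np.random.default_rng(11)
mu, sigma, usl, n, g = 100.0, 5.045, 130.0, 30, 3000

s = rng.normal(mu, sigma, (g, n)).std(axis=1, ddof=1)   # subgroup standard deviations
z = (usl - mu) / s                                      # subgroup capability estimates
print(round(float(z.min()), 1), round(float(np.median(z)), 1), round(float(z.max()), 1))
# The median lands near 6 sigma, and the extremes typically span roughly
# 4 to 10 sigma, in line with the range reported for figure 7.3.1.
```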

7.4 Establishing the shortcut

Given that the DPQ criteria are sufficiently recognized and understood,

we may now establish a “shortcut” method for specifying a candidate process


based solely on the design specifications. In general, that method may be

summarized in the context of our example case and presented as

σ̂spec = ( SL - T ) / 6 = ( 130 - 100 ) / 6 = 5.0 .

Eq. ( 7.4.1 )

Thus, we have the theoretical basis for a design-related rule of thumb.

This rule advocates that, if the design goal is to realize a long-term capability

of 4σLT, then the semi-tolerance zone of a symmetrical, bilateral performance

specification should be divided by a quantity generally known as the “six

sigma constant,” where the value of that constant is 6. Naturally, this constant

is indicative of ZST = 6.0. Of course, when considering the full range of a

bilateral tolerance, the constant should be given as 6 * 2 = 12. Essentially, the

rule of thumb says that a “six sigma specification” is defined or otherwise

constituted when a bilateral performance requirement is “mated” with a

corresponding process performance distribution whose standard deviation

consumes no more than 8.33 percent of the total specification bandwidth.

As may be reasoned, this rule of thumb can be translated into a design

margin. In other words, if a designer seeks to “net” a 4σLT level of long-term

reproducibility for a given performance characteristic under the assumption of

worst-case random sampling error, then it would be necessary to assign a

“gross” design margin of 50 percent. This means that, in the unlikely event

that worst-case sampling error is present at the time of process qualification,

the design margins will not drop below 25 percent. In comparison, this is

twice the margin advocated by conventional engineering practice.
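A small sketch of the “six sigma constant” rule of thumb follows. The lower specification limit is assumed to be symmetric about the target, consistent with the bilateral specification described above.

```python
# Shortcut: divide the semi-tolerance by 6 to obtain the target standard
# deviation, which then consumes 1/12 (about 8.33 percent) of the bandwidth.
usl, lsl, target = 130.0, 70.0, 100.0      # LSL assumed symmetric about T = 100

sigma_spec = (usl - target) / 6.0          # -> 5.0, the shortcut target standard deviation
share = sigma_spec / (usl - lsl)           # fraction of the total tolerance consumed by one sigma
print(sigma_spec, round(share * 100, 2))   # -> 5.0 and 8.33
```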


8.0 Partitioning the Error

8.1 Separating the noise

For many years, practitioners of quality methods have directly or

indirectly utilized the idea of “rational subgroups.” The reader should

recognize that this particular sampling strategy is inherently germane to a

wide array of statistical applications. We see this practice commonly

employed whenever such tools as control charts and design of experiments are

selected as the instruments of analysis.

At the theoretical core of many statistical tools and methods is the idea

of “error partitioning.” Essentially, an error partition is a “screen” that

excludes certain types of variation from contaminating or otherwise biasing

one or more of the other partitions. In this manner, a rational sampling

strategy can block certain types of error so as to ensure that the effects of

interest are not confounded or statistically biased. In addition, such a strategy

ensures that the response measurements remain fully independent.60 For

example, when using an Xbar and S chart, we seek to separate the background

noise (reflected in the S chart) from the signal effect (displayed in the Xbar

chart). Doing so provides us various types of “reports” on how well the

process center is controlled over time – relative to the extent of inherent

background noise (random error).

When designing a statistically based test plan (experiment) we have the

same ultimate goal – the deliberate partitioning of “random noises” from the

various signal effects induced by the experimental factors. In this manner, we

can examine the inherent repeatability of the response characteristic while

60 The idea of independence is essential to the existence of modern statistics and the practice of six sigma.

To illustrate, consider a cupcake pan. If we prepared n = 8 cupcakes from the same “mix” and then put them in a standard 8-hole pan for purposes of baking, we could subsequently measure the “rise height” of each cupcake once removed from the oven. In this scenario, all n = 8 cupcakes would have likely experienced very little difference in the baking conditions during preparation. In other words, each hole would have simultaneously experienced the same causal conditions during preparation and baking. As a consequence, we could not consider the “within pan” measurements to be independent of each other. It is interesting to notice that by preparing all n = 8 cupcakes at the same time, we would likely have “blocked” the influence of many variables (of a random and nonrandom nature).


concurrently evaluating any changes in the central condition (mean) that may

result from changes in the competing settings among the test variables.

Only in this manner can the signal effects of a given test variable be

separated from the other signal effects that may be present during the

experiment. At the same time, the signal effects of concern must be made

relative to the observed level of background noise (emanating from all of the

other variables not germane to the experiment or otherwise controlled). Only

by virtue of the experimental test plan can the nonrandom noise (variable

effects) be separated from and subsequently contrasted to the random noise

(experimental error). From this perspective, it is easy to see why the idea of

rational sub-grouping is so important.

As previously stated, rational sub-grouping is a sampling strategy that

has the principal aim of separating white noise from black noise. Regardless

of the basis for such sampling (sequential or random), the overall aim is the

same.61 When the data are to be sequentially gathered, a rational sub-

grouping strategy would necessarily seek to exclude any special effects (also

called assignable causes or nonrandom sources of error) that may be present

across the total range or interval of sampling. Essentially, this has the effect

of forcing any assignable causes (nonrandom effects) that might be present to

appear between subgroups (partitions) rather than within subgroups

(partitions). Only if this intent is sufficiently satisfied in a statistical and

pragmatic way can the experimental test plan or control chart “do what it was

designed to do.”

Process characterization studies are not exempt from the discussion at

hand – at least where rational sub-grouping is concerned. Naturally, our

terminal goal is to estimate how capable the process actually is. Most

generally, we seek an estimate of the “short-term” capability, as well as the

61 The reader is kindly asked to remember that throughout this portion of this book the term “process

center” is used without regard to the nominal (target) specification. It simply implies the central location of a normal distribution relative to some continuous scale of measure. Furthermore, it must always be remembered that the employment of a rational sampling strategy gracefully supports the study of autocorrelations and time-series phenomenon during the course of a process characterization and optimization initiative. Of course, this is another discussion in and of itself – perhaps at some other time.


“long-term.” An estimate of the short-term capability reports on the inherent

repeatability (reproducibility) of the process without regard to the overall

process mean. In this context, the random errors are “trapped” within

subgroups. Thus, when the quadratic mean deviations of all the individual

subgroups are “pooled” and subsequently normalized to the subgroup level,

the resulting variance only reflects the influence of random error – assuming

that the basic intents of rational sub-grouping have been fully satisfied. By

taking the square root of this variance, we are left with an instantaneous

measure of error – the short-term standard deviation.

As the informed reader might have surmised, the influences of both

random (un-assignable causes) and nonrandom errors (assignable causes) are

reflected in the long-term standard deviation – also recognized as a

longitudinal measure of reproducibility. Where this particular type of

standard deviation is concerned, the individual quadratic mean deviations are

based on the grand mean rather than the individual subgroup means. Owing

to this, the long-term estimate of the standard deviation reflects all sources of

error, not just those of the random variety. By contrasting the long-term

standard deviation to its short-term counterpart, we are able to surmise how

well the short-term measure of repeatability can be sustained over time.

Hence, the simple ratio of these two types of estimates provides great insight

into how well a process is controlled over time (in terms of process centering).

In one way or another, virtually all of the commonly used indices of

capability are predicated on the short-term standard deviation. Expressed

differently, it can be said that most indices that claim to report on process

repeatability (capability) assume that the underlying measure of variation

(standard deviation) only reflects the influence of white noise (un-assignable

causes). Only when this assumption is fully and rationally satisfied can we

reliably proclaim a certain level of instantaneous repeatability

(reproducibility), or momentary capability as some would say. However, by

formulating the given index of capability to include the corresponding long-

term standard deviation, the resulting performance measure reports on the


“sustainable” reproducibility of the process, or longitudinal capability as it is

often called.

All too frequently, this architect of six sigma has observed that such

performance metrics (indices of capability) are often improperly formulated,

incorrectly applied or inappropriately reported in such a way that the terminal

result is counterproductive to the aims of management. This is to say that the

“analytical pieces” of a capability metric are often found to be inappropriate

or somehow deficient in their construction or compilation. The net effect is a

performance metric that does not have the contextual ability to report on what

it purports to measure or otherwise assess.

For example, this architect of six sigma has been told (on numerous

occasions) that the “Cp” of a particular process is of a certain magnitude, only

to later discover that the data used to compute the underpinning short-term

standard deviation were gathered over a relatively lengthy period of time

without regard to the fundamental principles of rational sub-grouping. On

such occasions, it is usually discovered that the standard deviation should

have been classified as “long-term” in nature.

Obviously, when this occurs, management is presented with a

misleading indicator of instantaneous reproducibility. Without elaboration it

should be fairly obvious how such an understatement of entitlement capability

(owing to improper or insufficient partitioning) could easily mislead someone

attempting to make use of that metric for purposes of decision-making.62

So as to fully avoid such problems, the “rationality” of each subgroup

should always be ensured before the onset of a DPQ or process

characterization study. Generally speaking, the subgroup sampling interval

should be large enough to trap the primary sources of white noise that may

exist, but not so large that nonrandom influences are captured or otherwise

62 Numerous times, this practitioner of six sigma has witnessed (after the fact) precious resources

squandered on new capital-intensive technology because the true capability of the existing technology was not properly estimated, or was improperly computed. It is professionally shameful that, virtually every day across the world, many key quality and financial decisions are founded upon highly biased indices of capability. Arguably, the most common error in the use of Cp is the inclusion of a standard deviation that, unknown to the analyst, was confounded with or otherwise contaminated by sources of nonrandom error (black noise) – thereby providing a less favorable estimate of short-term capability.


“trapped” within the subgroup data. Unfortunately, there is no conventional

or standardized guidance on how this should be accomplished.63 Only when

these principles are theoretically and pragmatically understood, and then

linked to an intimate knowledge of the process, can the practitioner properly

prescribe, compute, report and subsequently interpret a given measure of

capability, such as ZST, ZLT, CP, CPK, Pp, PPK, and the like.

As many of us are painfully aware, the relative economic and functional

vitality of commercial and industrial products, processes and systems is often

positively correlated to the extent of replication error that is present in or

otherwise inherent to a situation. In general, the relative capability of a

deliverable (or the ways and means of realizing a deliverable) is directly

proportional to one’s capacity and capability to repeat a set of conditions that

yields success.

63 So as to facilitate the execution of a process characterization study, this six sigma practitioner has often

employed a rational sampling strategy that is “open ended” with respect to subgroup size. In other words, the sample size of any given subgroup is undefined at the onset of sampling. The performance measurements are progressively and sequentially accumulated until there is a distinct ”slope change” in the plotted cumulative sums-of-squares. As a broad and pragmatic generality, the cumulative sums-of-squares will aggregate as a relatively straight line on a time series plot-but only if the progressive variations are inherently random. It can generally be said that as the sampling progresses, a pragmatic change in process centering will reveal itself in the form of a change in slope. Naturally, the source of such a change in slope is attributable to the sudden introduction of nonrandom variation. Of course, the point at which the slope change originated is also a declaration for terminating the interval of subgroup sampling. Although it can be argued that this particular procedure is somewhat inefficient, it does help to ensure that virtually all of the key sources of random error have had the opportunity to influence the subgroup measurements. At the same time, the sampling disallows the aggregate mean square (variance) from being biased or otherwise contaminated by the influence of assignable causes. After defining the first subgroup, the second subgroup is formed in the same manner, but not necessarily with the same sample size. This process continues until the “cumulative pooled variance” has reached a rational plateau (in terms of its cumulative magnitude). Naturally, the square root of this quantity constitutes the composite within-group error and is presented as the “short-term” standard deviation. As such, it constitutes the instantaneous capability of the process and constitutes the pragmatic limit of reproducibility. However, it is often necessary to continue the formation of such subgroups until all principal sources of nonrandom variation have been accounted for. This objective is usually achieved once the between-group mean square has reached its zenith (in terms of its cumulative magnitude). At this point, the composite sums-of-squares can be formed and the total error variance estimated. Of course, the square root of this quantity constitutes the overall error and is known as the “long-term” standard deviation. As such, it constitutes the sustainable capability of the process. The relative differential between these two indices constitutes the extent of “centering control” that has been exerted during the interval of sampling. To facilitate the computational mechanics of such analyses, several years ago this researcher provided the necessary statistical methods to MiniTab. In turn, they created the “add-on” analytics (and reports) now known as the “Six Sigma Module.” This particular module has been specifically designed to facilitate the execution of a six sigma process characterization study.


8.2 Aggregating the error

As the sources of replication error vary in number and strength, one’s

ability to deliver a product or service changes accordingly. In other words,

the error associated with the underlying determinants (input error) is

propagated (or otherwise transmitted) from the system of causation through

the transformational function, ultimately manifesting itself in the resultant

(outcome). Expressed differently, we may state

σ²T = f( σ²1 , σ²2 , ... , σ²N ) ,

Eq.( 8.2.1 )

where σ²T is the total performance error exhibited by a system, product, process, service, event or activity. Of course, N is the Nth source of error inherent to the characteristic of concern. If we assume that the sources of error are independent such that

2ρij σi σj = 0 ,

Eq.( 8.2.2 )

the total error may be fully described by

σ²T = σ²1 + σ²2 + ... + σ²N .

Eq. ( 8.2.3 )

Obviously, as the leverage or vital few sources of replication error are

discovered and subsequently reduced in strength and number, our capability

and capacity to replicate a set of “success conditions” will improve

accordingly.

When reporting the performance of an industrial or commercial product,

process or service, it is customary and recommended practice to prepare three

separate but related measures of capability. The first performance measure

reflects the minimum capability or "longitudinal reproducibility" of the

characteristic under investigation. In this case, the performance measure is

given by σ²T. As previously indicated, the total error accounts for all sources


of variation (error) that occur over time, regardless of their type or nature.

The second type of performance measure reports the maximum capability or "instantaneous reproducibility" of the process and is designated as σ²W. It should be understood that σ²W reflects only random sources of error and is exclusive to those variations that occur within sampling subgroups.64 As may be apparent, σ²B must only reflect those sources of variation (error) that occur

between sampling subgroups.

By virtue of the latter definitions, we may conceptually consolidate the

right-hand side of Eq. (8.2.3) in a rational manner so as to reveal

σ²T = σ²B + σ²W .

Eq. ( 8.2.4 )

In a great many process scenarios, it would be considered quite desirable

to obtain an estimate of the long-term error component in the form σ²T and the short-term error in the form σ²W. As previously indicated, such estimates of

repeatability are most conveniently assisted via a rational sampling strategy.65

With such estimates at hand, we can rearrange Eq. (8.2.4) so as to solve for

the between-group error component σ²B. Such a rearrangement would

disclose that

64 The instantaneous reproducibility of a manufacturing process reflects the state of affairs when the

underlying system of causation is entirely free of nonrandom sources of error (i.e., free of variations attributable to special or "assignable" causes). Given this condition, the related process would be operating at the upper limit of its capability. Hence, this particular category of error cannot be further reduced (in a practical or economical sense) without a change in technology, materials or design.

65 For purposes of process characterization, such a sampling strategy has been thoroughly described by Harry and Lawson (1990). The theoretical principles and practical considerations that underpin the issue of "rational sub-grouping" have also been addressed by Juran (1979), as well as Grant and Leavenworth (1980). In the context of this book, it should be recognized that the intent of a rational sampling strategy is to allow only random effects within groups. This is often accomplished by “blocking” on the variable called "time." When this is done, the nonrandom or assignable causes will tend to occur between groups. Hence, one-way ANOVA can be readily employed to decompose the variations into their respective components. With this accomplished, the practitioner is free to make an unbiased estimate of the instantaneous reproducibility of the process under investigation. With the same data set, an estimate of the sustained reproducibility may also be made, independent of existing background noise.


σ²B = σ²T - σ²W .

Eq. ( 8.2.5 )

In this manner the signal effects (if any) can be ascertained by

subtraction since the total variations are reflected by σ²T and the random influences are forced into the σ²W term. At great risk of redundancy, the author

again points out that, for a wide range of processes, the effect of nonrandom

and random perturbations can be estimated and subsequently separated for

independent consideration. As previously stated, the concept of “rational sub-

grouping” must be employed to realize this aim.
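The partitioning logic of Eq. (8.2.4) and Eq. (8.2.5) can be illustrated with a short sketch. The data below are purely illustrative; a drifting center stands in for the nonrandom (black noise) component.

```python
# Pooled within-subgroup variance as the short-term (white noise) component,
# total variance as the long-term component, and the between-group component
# recovered by subtraction.
import numpy as np

rng = np.random.default_rng(3)
g, n = 200, 5
centers = rng.normal(100.0, 2.0, g)                 # center drifts between subgroups
data = rng.normal(centers[:, None], 5.0, (g, n))    # white noise within subgroups

var_w = data.var(axis=1, ddof=1).mean()             # pooled within-group (short-term) variance
var_t = data.var(ddof=1)                            # total (long-term) variance
var_b = var_t - var_w                               # between-group component by subtraction
print(round(float(var_w), 1), round(float(var_b), 1), round(float(var_t), 1))
```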

8.3 Rationalizing the sample

As many practitioners of process improvement already know, it is often

the case that the influence of certain background variables must be

significantly reduced or eliminated so as to render “statistically valid”

estimates of process capability. Of course, this goal is often achieved or

greatly facilitated by the deliberate and conscientious design of a sampling

plan.

Through such a plan, the influence of certain causative variables can be

effectively and efficiently “blocked” or otherwise neutralized. For example, it

is possible to block the effect of an independent variable by controlling its

operative condition to a specific level. When this principle is linked to certain

analytical tools, it is fully possible to ensure that one or more causative

variables do not “contaminate” or otherwise unduly bias the extent of natural

error inherent to the response characteristic under consideration.

As a given sampling strategy is able to block the influence of more and

more independent variables, the total observed replication error σ²T is

progressively decreased in magnitude. In other words, as each of the

independent variables within a system of causation are progressively blocked,

it is theoretically possible to reach a point where it is not possible to observe


any type, form or magnitude of replication error. At this point, only one

measurement value could be repeatedly realized during the course of

sampling. For such a theoretical case, we would observe σ²T = σ²B = σ²W = 0.

In short, the system of classification would be so stringent (as prescribed by

the sampling plan) that no more than one measurement value would be

possible at any given moment in time.

Should such a highly theoretical sampling plan be activated, the same

response measurement would be observed upon each cycle of the process –

over and over again it would be the same measurement, and the replication

error would be zero. However, for any given sampling circumstance, there

does exist a rational combination of blocking variables and corresponding

control settings that will allow only random errors to be made observable.

However, the pragmatic pursuit of such an idealized combination would be

considered highly infeasible or impractical in most circumstances, to say the

least. For this reason, we simply elect to block on the variable called “time.”

In this manner, we are able to indirectly and artificially “scale” the system of

blocking to such an extent that only random errors are made known or

measurable.

If the window of sampling time is made too small, the terminal estimate

of pure error σ²W is underestimated, owing to the forced exclusion of too

many sources of random error. On the other hand, if the window of time is too

large, the terminal estimate of σ²W is overestimated, owing to the natural

inclusion of nonrandom effects. However, by the age-old method of trial-and-

error, it is possible to define a window size (sampling time frame) that

captures the “true” magnitude of σ²W yet necessarily precludes nonrandom

sources of error from joining the mix. In this instance, the nonrandom errors

would be assigned to the σ²B term.

In short, it is pragmatically feasible to discover a sampling interval (in

terms of time) that will separate or otherwise partition the primary signal

effect of a response characteristic from the background noise (random

variation) that surrounds that effect. Only when this has been rationally


accomplished can the instantaneous reproducibility of a performance

characteristic be assessed in a valid manner. Such a sampling plan is also

called a rational sampling strategy, as execution of the plan rationally

(sensibly and judiciously) partitions the array of signal effects from the mix of

indiscernible background noises. In this sense, a rational sampling strategy

can effectively unmask or otherwise make known the array of signal effects,

thereby allowing σ²B and σ²W to be statistically contrasted in a meaningful and

rational manner.
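To make the window-size trade-off concrete, the sketch below simulates a process whose center steps to new levels over time; the drift pattern and magnitudes are illustrative assumptions only.

```python
# When the subgroup window straddles a shift in the process center, the
# "within-group" estimate absorbs that nonrandom effect and overstates the
# white noise; a short window recovers the pure error.
import numpy as np

rng = np.random.default_rng(5)
sigma_w, n_total = 5.0, 600
centers = np.repeat([100.0, 108.0, 96.0], n_total // 3)   # step changes in the center
x = rng.normal(centers, sigma_w)                          # observations in time order

for window in (5, 300):                                   # a short and a very long subgroup window
    groups = x.reshape(-1, window)
    pooled_sd = float(np.sqrt(groups.var(axis=1, ddof=1).mean()))
    print(window, round(pooled_sd, 2))
# The short window recovers roughly 5.0; the long window is inflated well above it.
```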

9.0 Analyzing the Partitions

9.1 Defining the components

The analytical separation of nonrandom and random process variations

may be readily accomplished via the application of one-way analysis-of-

variance, herein referred to as "one-way ANOVA." In the context of a rational

sampling strategy, one-way ANOVA provides a means to decompose the total

observed variation (SST) into two unique parts, namely SSB and SSW.

The first component of the total variation (SST) is the between group

sums-of-squares (SSB). This particular component (partition) describes the

cumulative effect of the between-group variations attributable to common

causes, often called "assignable causes." The second component (partition) is

the within group sums-of squares (SSW,). This partition contains the

variations that cannot be assigned. As previously indicated, any error

(variation) that cannot be statistically described or assigned is generally

referred to as "white noise," or random error. As such, the within-group

component (partition) reports on the extent of instantaneous reproducibility

naturally inherent to the underlying system of causation (process). In this

context, the experienced practitioner will attest that the total variation can be

expressed as

SST = Σ( j = 1 to g ) Σ( i = 1 to nj ) ( Xij - X̄ )²

Eq. ( 9.1.1 )


where Xij is the ith variate associated with the jth subgroup, X̄ is the grand

average and nj is the total number of variates within the jth subgroup. In order

to utilize the information given by the g subgroups, we must decompose Eq.

(9.1.1) into its component parts. The first component of variation to be

considered is the within-group sum of squares. This particular partition is

given as

SSW = Σ( j = 1 to g ) Σ( i = 1 to nj ) ( Xij - X̄j )²

Eq. ( 9.1.2 )

where X̄j is the average of the jth subgroup. The second component to be examined is the between-groups sum of squares. Considering the general case, this partition may be described by

SSB = Σ( j = 1 to g ) nj ( X̄j - X̄ )²

Eq. ( 9.1.3 )

or, for the special case where n1 = n2 = ... = ng, the between-group sum of

squares can be presented in the form

SSB = n Σ( j = 1 to g ) ( X̄j - X̄ )² .

Eq. ( 9.1.4 )

Thus, we may now say that

SST = SSB + SSW.

Eq. ( 9.1.5 )

In expanded form, the special case is expressed as


Σ( j = 1 to g ) Σ( i = 1 to n ) ( Xij - X̄ )² = n Σ( j = 1 to g ) ( X̄j - X̄ )² + Σ( j = 1 to g ) Σ( i = 1 to n ) ( Xij - X̄j )² .

Eq. ( 9.1.6 )

Recognizing the quantities n and g, it is naturally understood that the total

degrees of freedom (dfT) must also be rationally evaluated. In view of Eq. (

9.1.5 ), we note that

dfT = dfB + dfW

Eq. ( 9.1.7 )

which expands to reveal

ng - 1 = ( g - 1 ) + g( n - 1 ).

Eq. ( 9.1.8 )
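The decomposition expressed in Eq. (9.1.5) through Eq. (9.1.8) can be confirmed numerically with a short sketch; the balanced data set below is purely illustrative.

```python
# One-way decomposition of the total sum of squares for g balanced subgroups.
import numpy as np

rng = np.random.default_rng(2)
g, n = 6, 5
x = rng.normal(100.0, 5.0, (g, n))               # rows are subgroups

grand = x.mean()
means = x.mean(axis=1)
ss_t = ((x - grand) ** 2).sum()                  # total sum of squares
ss_w = ((x - means[:, None]) ** 2).sum()         # within-group sum of squares
ss_b = n * ((means - grand) ** 2).sum()          # between-group sum of squares (equal n)
print(round(float(ss_t), 3), round(float(ss_b + ss_w), 3))   # the two agree: SST = SSB + SSW
print(g * n - 1, (g - 1) + g * (n - 1))                      # degrees of freedom agree as well
```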

9.2 Analyzing the variances

Relating the sums-of-squares of each partition to the corresponding

degrees of freedom provides the mean square ratio, or variance as some would

say. Of course, the variance reports on the “typical” quadratic mean deviation

of a given error component (partition). Making such a formulation for the

mean square ratios commonly associated with a process characterization study

reveals that

MST = SST / ( ng - 1 )

Eq. ( 9.2.1 )

and

MSB = SSB / ( g - 1 ) ,

Eq. ( 9.2.2 )


as well as

MSW = SSW / ( g( n - 1 ) ) .

Eq. ( 9.2.3 )

To further our understanding, let us turn our attention to the F statistic

and one-way ANOVA. As most six sigma practitioners are aware, F = MSB /

MSW. Both the F statistic and ANOVA seek to highlight the presence of a

signal effect (as represented by the MSB term) and then contrast the absolute

magnitude of that effect to the absolute magnitude of random background

noise (recognized as the MSW term). In other words, the F statistic is

designed to “pull” a signal out of the total variations – but only if that signal is

large enough to be statistically discernable from the variations that constitute

the bandwidth of random error.

Expressed in another way, we may say that if a signal effect can be

statistically differentiated from the pure error (white noise), we would have

sufficient scientific evidence to conclude that the signal is “real,” with some

degree of statistical confidence (1 - α). On the other hand, if the signal was

not sufficiently large, we would then conclude that it was just an artifact of

naturally occurring random variations. In such cases, it would be statistically

inappropriate to further interrogate the between group sums-of-squares.
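A minimal sketch of these mean-square and F computations follows, using an illustrative balanced data set; scipy is assumed only for the critical F value.

```python
# Mean squares, the observed F ratio and the decision threshold at confidence 1 - alpha.
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(4)
g, n, alpha = 6, 5, 0.05
x = rng.normal(100.0, 5.0, (g, n))

grand, means = x.mean(), x.mean(axis=1)
ms_b = n * ((means - grand) ** 2).sum() / (g - 1)          # between-group mean square
ms_w = ((x - means[:, None]) ** 2).sum() / (g * (n - 1))   # within-group mean square
F_obs = ms_b / ms_w
F_crit = f.ppf(1 - alpha, g - 1, g * (n - 1))              # critical F for the chosen alpha
print(round(float(F_obs), 2), round(float(F_crit), 2), bool(F_obs > F_crit))
# The signal is declared "real" only when F_obs exceeds F_crit.
```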

By all means, we must recognize that the square root of Eq. ( 9.2.3 )

constitutes a short-term standard deviation (estimate of instantaneous process

reproducibility). As such, it represents the typical mean deviation that can be

expected at any moment in time, without any nonrandom bias. Expressed

differently, the square root of Eq. ( 9.2.3 ) is a relative measure of dispersion

that reflects the “momentary capability” of a performance characteristic, fully


independent of any nonrandom influences that would otherwise be considered

“assignable.”66

From this vantage point, the within-group errors can be fully evaluated

as if the process center never varied. Only then can the “true” instantaneous

reproducibility of the process be rightfully declared, given that all the normal

statistical assumptions have been fully satisfied or adequately rationalized. In

this manner, we are able to discover the true extent of entitlement capability

(in terms of instantaneous reproducibility). Of course, such an estimate of

short-term capability would constitute the “best possible performance

scenario” that could be expected, given the prevailing system of causation.67

Within this context, we often say the short-term standard deviation only

reflects the influence of pure error (effects that have no deterministic origin).

At the risk of unnecessary iteration, we employ a rational sampling strategy to

ensure the within-group errors are fully independent and unpredictable

(random). In this regard, the short-term standard deviation is not analytically

biased by any momentary centering condition, or unduly influenced by any

assignable cause that might have been present at some point during or across

the total interval of sampling.

By retaining the between-group partition, the reproducibility of process

centering is not considered when developing the short-term standard

deviation. In other words, retention of this partition ensures that the errors

occurring between groups (random or otherwise) do not become confounded

or manifested within the short-term standard deviation. 68

66 The reader must bear in mind that a temporal source of error requires the passage of time before it can

develop or otherwise realize its full contributory influence – such as the effects of machine set-up, tool wear, environmental changes and so on.

67 From a process engineering perspective, the system of causation relates directly to the technologies employed to create the performance characteristic. Thus, the measure of instantaneous reproducibility (short-term standard deviation) reports on how well the implemented technologies could potentially function in a world that is characterized by only random variations (error), and where the process is always postulated to be centered on its nominal specification (target value).

68 For example, consider a high-speed punch press. Natural wear in the die is not generally discernable within a small subgroup (say n=5), but can be made to appear between subgroups in the context of a rational sampling strategy. With this understanding as a backdrop, we can say that the influence of die wear will be manifested over time in the form of a “drifting process center.” To some extent, wear in the die does exert an influence during the short periods of subgroup sampling, but that effect is so miniscule


Thus, the root mean square of the between group partition (once

corrected for the influence of n) can be thought of as the “typical” difference

that could be expected between any two subgroup means (over a protracted

period of time). As we shall come to understand later on in the discussion,

such a difference can be reduced to an equivalent mean shift. In this context,

the equivalent mean shift is a subtle but direct measure of process control, but

only as a measure of “centering reproducibility.”

With this in mind, we will now turn our attention to the “longitudinal

state of affairs” by further considering the total sums-of-squares and its related

degrees of freedom. To this end, we must recognize that the square root of

MST represents the long-term standard deviation. In this form, the long-term

standard deviation is a direct measure of sustainable capability, or longitudinal

reproducibility as some would say. By nature, it can be thought of as the

“typical” mean deviation that can be expected over a protracted period of

time. Because the long-term standard deviation is composed of both random

error (white noise) and nonrandom error (black noise), it is quite sensitive to

time-centric sources of error, especially when the subgroup sampling is

protracted over a long period.69

it would not be practical or economical to analytically separate its unique contributory effect for independent consideration. With respect to the within-group partition, the influence of die wear should remain confounded with the many other sources of background noise. Consequently, its momentary influence should be reflected in the MSW term, but its temporal influence should be reflected in the MSB term. Thus, the long-term standard deviation would likely be considerably larger than the short-term standard deviation. Given this line of reasoning, it is easy to understand why nonrandom sources of temporal error generally have a destabilizing effect on the subgroup averages rather than their respective standard deviations. Based on this line of reasoning, we must recognize that a rational sampling strategy can greatly facilitate “bouncing” nonrandom variations into the between-group partition while restraining the inherent random variations within groups. Only in this manner can the short-term standard deviation (instantaneous capability) of a given performance characteristic be contrasted or otherwise made relative to the long-term standard deviation (temporal capability).

69 A temporal source of error generally exerts its unique and often interactive influence over a relatively protracted period of time. Although such errors can be fully independent or interactive by nature, their aggregate (net) effect generally tends to "inflate" or otherwise "expand" the long-term standard deviation relative to the short-term standard deviation. Naturally, a statistically significant differential between the temporal and instantaneous estimates of reproducibility error necessarily constitutes the relative extent to which the process center is "controlled" over time. This understanding cannot be overstated, as it is at the core of six sigma practice, from a process as well as a design point of view.


9.3 Rearranging the model

As may be apparent, the aforementioned statistics are excellent tools with which to assess short-term and long-term process capability (reproducibility). Given this perspective, it is not a stretch to compare the estimate of long-term reproducibility error (as reflected by the RMST term) to the estimate of short-term reproducibility error (as revealed by the RMSW term). As stated earlier in this book, such a contrast can be expressed in the form of a ratio, otherwise known as "c." Recall that c is a multiplication factor used to inflate the short-term standard deviation so as to account for the influence of temporal sources of error. In this context, c is somewhat analogous to a signal-to-noise ratio.

But if we consider the ratio MSB / MSW, we discover the F statistic. In this context, the F statistic fully constitutes a "true" signal-to-noise ratio. Furthermore, this classic (if not cornerstone) statistic answers a question where process characterization work is concerned: "How big is the signal effect relative to the inherent background noise?" With this question answered, we may now ask: "How big can the typical between-group deviation become before it can no longer be dismissed as statistically insignificant?" As we shall come to see, some simple algebraic manipulation of the one-way ANOVA model will reveal the answer.

By simple rearrangement of the one-way ANOVA model, we quickly take note that SSB = SST – SSW. By conducting a longitudinal capability study in the context of a rational sampling plan, we can fully estimate the total variation, as well as the inherent extent of random variation, and then uncover the relative impact of nonrandom variation by subtraction. In this manner, we are able to gain insight into the long-term performance of the process, as well as the short-term performance.

It is well recognized that one-way ANOVA has the ability to fully decompose the total variations into two fundamental sources: the random error component (which reflects the relative extent of white noise) and the between-group component (which reflects the relative extent of black noise).
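To make the decomposition concrete, the following minimal sketch (Python; the function name, random seed, and drift magnitudes are illustrative assumptions rather than anything prescribed in this book) shows how a table of g subgroups of size n yields SSW, SSB, and SST, and from them the short-term and long-term standard deviations and the ratio c.

```python
import numpy as np

def decompose(data):
    """One-way ANOVA decomposition of subgrouped data.

    data: 2-D array with g rows (subgroups) and n columns (observations).
    Returns SSW, SSB, SST plus the short-term and long-term standard
    deviations and the inflation ratio c = sigma_LT / sigma_ST.
    """
    g, n = data.shape
    grand_mean = data.mean()
    group_means = data.mean(axis=1)

    ssw = ((data - group_means[:, None]) ** 2).sum()    # within-group (white noise)
    ssb = n * ((group_means - grand_mean) ** 2).sum()   # between-group (black noise)
    sst = ((data - grand_mean) ** 2).sum()              # total; equals ssw + ssb

    sigma_st = np.sqrt(ssw / (g * (n - 1)))   # instantaneous (short-term) estimate
    sigma_lt = np.sqrt(sst / (n * g - 1))     # longitudinal (long-term) estimate
    return ssw, ssb, sst, sigma_st, sigma_lt, sigma_lt / sigma_st

# Illustration: g = 50 subgroups of n = 5 with a drifting process center
rng = np.random.default_rng(0)
drift = rng.normal(0.0, 1.0, size=(50, 1))              # nonrandom subgroup-to-subgroup error
samples = drift + rng.normal(0.0, 1.0, size=(50, 5))    # plus random error within subgroups
print(decompose(samples))
```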


With a good rational sampling plan and one-way ANOVA, we have the analytical power to say that "if it wiggles, we can trap it, slice it and dice it any way we want." In other words, we can look at the reproducibility of a performance characteristic from any angle. Without this power, it is doubtful that a process or design characterization and optimization study could be rationally planned and executed in a meaningful way.

To underscore the latter point, let us further our understanding of the power behind one-way ANOVA by way of some simple algebra. By subtracting the observed within-group variations from the total variations comprising the capability study, we are left with the between-group variations. In closed form, we can express this operation by the relation SSB = SST − SSW. In this case, the SSB term is an absolute measure that, when normalized by the related degrees of freedom (g − 1), reports on the extent to which we can successfully replicate the overall centering condition of the process. In other words, the SSB term is an aggregation of process centering errors that, when normalized for the given degrees of freedom, reports on our efforts to maintain a stable location parameter over time, irrespective of the target specification (if given by the design).

9.3 Examining the entitlement

So as to expand our discussion, recall that SST/(ng − 1) is an estimate of the long-term variance and SSW/[g(n − 1)] is an estimate of the short-term variance – for reasons previously stated. Of course, the square roots of these variances respectively reveal the long-term and short-term performance of the process in the form of a root mean square (standard deviation). The reader will also recall that these two types of root mean squares are merged in the form of a ratio called "c" and estimated as c = σLT / σST. From this perspective, c is a hybrid performance metric that provides us with an insight into how much "dynamic centering error" is occurring in the process over the total period of sampling.


When the ratio is such that c = σLT / σST = 1.0, then SSB = 0, but only after adjusting for differences in degrees-of-freedom (between the numerator and denominator terms). On the other hand, we note that when σLT > σST, then SSB > 0. Thus, we assert that c is a general measure of capability related to process centering.70 Given the circumstance that SSB = 0, we naturally recognize that all of the subgroup means would necessarily be identical in value. Under this condition, one could proclaim a "state of perfection" in the "centering repeatability" of the given process.

On the flip side, we recognize that as SSB approaches its theoretical limit, our confidence of repeating a given centering condition approaches zero. Naturally, there would be some point far short of infinity at which one would conclude, based on pragmatic argument, that the potential of replicating any given centering condition would become nonexistent. Of course, there would also exist a "point of pragmatism" on the bottom end as well – a point where the subgroup-to-subgroup variations are not of any statistical or practical significance, but merely "noise." In this circumstance, we would conclude that such variations were segmented from the SSW term and then inappropriately assigned to the SSB term. Under this condition, it is quite possible that the subgroup sampling interval was too small.

Given the previous discussions, it should now be fairly easy to see that the SSW term is a foundational measure from which to launch a better understanding of what is meant by the phrase "instantaneous reproducibility." In this context, the terminal measure of instantaneous process capability (ZST or Cp) would be unbiased and uncontaminated in terms of perturbing influences of a nonrandom nature (free of black noise influences). In other words, the SSW term (once normalized to the form of a standard deviation) reflects the relative "upper limit of capability" inherent to a given process. More to the point, it represents the "best-case capability scenario" given the current operational settings and technological circumstances related to that process.

70 As a sidebar of interest, it should be recognized that there is a special mathematical relationship between "c" and the "F" statistic. More specifically, we may mathematically define the dynamic expansion factor c and its interrelationship with the F statistic by way of the equation c = sqrt(1 + ((F − 1)(g − 1)) / (ng − 1)). As you investigate this relationship through algebraic manipulation and Monte Carlo simulation, you will likely gain many new insights into the field of statistical process control (SPC).

In this sense, Cp can be thought of as a measure of "entitlement capability," only if the underlying standard deviation is a true and full measure of white noise. Of course, such noise would stem from the effects due to uncontrolled background variables. If the sampling strategy allowed sources of nonrandom or systematic errors of some form (black noise) to influence the data, the resulting calculation of SSW would necessarily be inflated, thereby forcing the estimate of Cp to be worse than should be rightfully known. The decision-making consequences related to such an understatement of inherent capability need not be stated or expounded upon, as they are intuitively evident.

Since the idea of "entitlement" is defined as a "rightful level of expectation," the term "entitlement capability" would seem to make pragmatic and theoretical sense, given that the variations associated with the SSW term would only reflect random influences, assuming sufficient and appropriate blocking. Regardless of terminology, the SSW term reports on the short-term repeatability of the process, but without any regard to performance specifications or degrees of freedom. In this sense, the short-term standard deviation is, in its own right, an absolute measure of repeatability and, therefore, an absolute measure of instantaneous capability.
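As a quick illustration of the relationship noted in footnote 70, the minimal sketch below (Python; the F value, n, and g are arbitrary example inputs) converts an observed F ratio into the implied dynamic expansion factor c.

```python
import math

def c_from_F(F, n, g):
    """Footnote 70 relation: the dynamic expansion factor c implied by an
    observed one-way ANOVA F ratio with g subgroups of size n."""
    return math.sqrt(1.0 + (F - 1.0) * (g - 1.0) / (n * g - 1.0))

# Example: F = 5 observed with n = 5 and g = 50 implies c of roughly 1.34
print(round(c_from_F(5.0, 5, 50), 3))
```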

10.0 Computing the Correction

10.1 Computing the shift

At this point in our discussion, we must turn our attention to the research of Bender (1975), Gilson (1951), and Evans (1975). In essence, their work focused on the problems associated with establishing engineering tolerances in light of certain manufacturing variations that assume the form of mean shifts and drifts. In synopsis, Evans pointed out that


... shifts and drifts in the mean of a component occur for a number of reasons ... for example, tool wear is one source of a gradual drift ... which can cause shifts in the distribution. Except in special cases, it is almost impossible to predict quantitatively the changes in the distribution of a component value that will occur, but the knowledge that they will occur enables us to cope with the difficulty. A solution proposed by Bender ... allows for [nonrandom] shifts and drifts. Bender suggests that one should use

V = 1.5 √Var(X)

as the standard deviation of the response ... [so as] to relate the component tolerances and the response tolerance.

Of particular interest is the idea that we cannot forecast any or all of the random or nonrandom errors at any given moment in time, but mere knowledge that they will occur over time provides us with an engineering edge, so to speak. In view of this research, we may redefine Bender's correction in the form

$$ \sigma_T^2 = c^2\,\sigma_W^2 \qquad \text{Eq. ( 10.1.1 )} $$

or

$$ c = \frac{\sigma_T}{\sigma_W} \qquad \text{Eq. ( 10.1.2 )} $$

where c is the relative magnitude of inflation imposed or otherwise overlaid on the measure of instantaneous reproducibility.

In this context, c is a corrective measure used to adjust the instantaneous reproducibility of a performance characteristic. This corrective device is intended to generally account for the influence of random temporal error, and an array of transient effects that periodically emerge over extended periods of process operation. Again, such a compensatory measure is used to inflate or otherwise expand the instantaneous estimate of reproducibility (short-term standard deviation). As discussed throughout this book, the correction c can be established in different theoretical and empirical forms, then subsequently employed for a variety of purposes.

In general, the correction is most often employed to better project the first-time yield of a performance characteristic or product. To this end, the correction facilitates a consideration of unknown (but yet expected) sources of error that, ultimately, have the effect of upsetting the momentary condition of a process center. Of course, such a correction provides us with a more realistic basis for assigning and subsequently analyzing performance specifications.

Calling upon the previously mentioned research, we would discover that the range of c for "typical manufacturing processes" can be confidently given as

1.4 ≤ c ≤ 1.8 .   Eq. ( 10.1.3 )

Recognize that this particular range of c is considered "normal" for the general case – per the previously mentioned research. The author personally conducted additional investigations into these phenomena over a three-year period during the mid 1980s. The results reaffirmed the aforementioned conclusion – empirically and theoretically. Several key elements of this supporting research have been set forth in this book.

By algebraic manipulation of the one-way ANOVA model, it is possible to isolate and declare the between-group root mean square (RMSB). In turn, the RMSB term can be transformed to reveal the "typical shift" occurring between subgroups, but expressed in the form of an equivalent Z value. This is to say that the absolute mean deviation can be normalized by the extent of extraneous error inherent to the system of causation (short-term standard deviation). The final result of these manipulations provides an estimate of the "typical" momentary mean offset, but expressed in the form of a standard normal deviate and designated as ZShift.

Now, if we were to consider the case c = 1.8, n = 5, and g = 25, the resulting calculations would reveal that ZShiftσST = 1.498σST. Thus, we have the "shift expectation" of 1.5σ, as often discussed in the six sigma literature. This would be to say that the "normalized and standardized" mean deviation for a typical subgroup, as related to a common process, is likely to be about 1.5σST, given no other knowledge about the process or prevailing circumstances. In other words, when the amount of long-term dynamic expansion is such that c = 1.8, we would mathematically determine that the typical-but-equivalent subgroup mean shift expectation would be about 1.5σST. To better understand the derivation of this quantity, it is necessary to expand the correction c and then solve for the "typical" momentary mean offset.

To accomplish the latter aim, we will first consider the total and within-group error components obtained by way of a rational sampling strategy. Given this, we may rewrite Eq. (10.1.2) as

$$ c = \sqrt{\frac{SS_T/(ng-1)}{SS_W/\left[g(n-1)\right]}}\,. \qquad \text{Eq. ( 10.1.4 )} $$

By virtue of the additive properties associated with Eq. (9.1.6), it can be shown that

$$ c^2 = \frac{(SS_B + SS_W)/(ng-1)}{SS_W/\left[g(n-1)\right]} = \frac{SS_B + SS_W}{SS_W}\cdot\frac{g(n-1)}{ng-1}\,. \qquad \text{Eq. ( 10.1.5 )} $$

Further rearrangement reveals

$$ \frac{SS_B}{SS_W} + 1 = c^2\,\frac{ng-1}{g(n-1)}\,. \qquad \text{Eq. ( 10.1.6 )} $$

We may now solve for SSB and present the result as

$$ SS_B = \frac{SS_W}{g(n-1)}\left[c^2(ng-1) - g(n-1)\right]. \qquad \text{Eq. ( 10.1.7 )} $$

By expanding the left side of Eq. (10.1.7), we observe that

$$ n\sum_{j=1}^{g}\left(\bar{X}_j-\bar{\bar{X}}\right)^2 = \hat{\sigma}_W^2\left[c^2(ng-1)-g(n-1)\right] \qquad \text{Eq. ( 10.1.8 )} $$

and dividing by n reveals

$$ \sum_{j=1}^{g}\left(\bar{X}_j-\bar{\bar{X}}\right)^2 = \frac{\hat{\sigma}_W^2\left[c^2(ng-1)-g(n-1)\right]}{n}\,. \qquad \text{Eq. ( 10.1.9 )} $$


To define the average quadratic deviation, we divide both sides by g, therein providing the relation

$$ \frac{\sum_{j=1}^{g}\left(\bar{X}_j-\bar{\bar{X}}\right)^2}{g} = \frac{\hat{\sigma}_W^2\left[c^2(ng-1)-g(n-1)\right]}{ng}\,. \qquad \text{Eq. ( 10.1.10 )} $$

Taking the square root of both sides, we are left with the "typical" absolute mean deviation (shift). Of course, this is provided in the form

$$ \sqrt{\frac{\sum_{j=1}^{g}\left(\bar{X}_j-\bar{\bar{X}}\right)^2}{g}} = \hat{\sigma}_W\sqrt{\frac{c^2(ng-1)-g(n-1)}{ng}}\,. \qquad \text{Eq. ( 10.1.11 )} $$

By standardizing to the normalized case NID(0,1), we observe that

$$ Z_{Shift.Typ} = \sqrt{\frac{c^2(ng-1)-g(n-1)}{ng}}\,. \qquad \text{Eq. ( 10.1.12 )} $$

However, for the case c = 1, Eq. (10.1.12) reduces to

$$ Z_{Shift.Typ} = \sqrt{\frac{(ng-1)-g(n-1)}{ng}} = \sqrt{\frac{g-1}{ng}}\,. \qquad \text{Eq. ( 10.1.13 )} $$

For purposes of application, it would be highly desirable to set ZShift.Typ = 0 for the case c = 1 so as to maintain computational appeal. To accomplish this, we correct Eq. (10.1.12) by subtraction of Eq. (10.1.13).71 The result of this algebraic operation is given as

$$ Z_{Shift} = \sqrt{\frac{(c^2-1)(ng-1)}{ng}}\,. \qquad \text{Eq. ( 10.1.14 )} $$

For the reader's convenience, a case-specific comparison of Eq. (10.1.12) and Eq. (10.1.14) is presented in figure 10.1.1.

Figure 10.1.1
The Absolute Difference Between Eq. ( 10.1.12 ) and Eq. ( 10.1.14 ) for the case n = 5, g = 50.

71 The mathematically inclined reader will quickly recognize that such a proposal is somewhat spurious from a theoretical point of view. However, the author asserts that the practical benefits tied to the proposal far outweigh the theoretical constraints. For example, when c = 1, the uninformed practitioner would intuitively reason that ZShift = 0 since the variances are equal. However, from Eq. (10.1.13) it is apparent that ZShift would necessarily prove to be greater than zero, owing to a differential in the degrees of freedom. Needless to say, this would present the uninformed practitioner with a point of major contention or confusion. Although less precise, Eq. (10.1.14) provides a more intuitive result over the theoretical range of c. It should also be noted that this operation has a negligible effect on ZShift for typical combinations of n and g. As a consequence of these characteristics, the author believes the application of Eq. (10.1.13) as a corrective device or compensatory measure is justified, particularly in the spirit of many conventional design engineering practices and forms of producibility analysis.

Of particular interest, figures 10.1.2 and 10.1.3 display the relation between ZShift and c for a selected range of ng. The most prominent conclusion resulting from these graphs is that the relationship is reasonably robust to sample size and sub-grouping constraints, as well as their product. It should also be noted that Z²Shift asymptotically approaches the quantity c² − 1 as ng approaches infinity.

Figure 10.1.2
The Effect of ng on ZShift for c = 1 to c = 3 (ZShift plotted against c for ng = 10, 100, and 1000).


Figure 10.1.3

The Effect of ng on ZShift for c=1.5 to c=2.0.
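The behavior summarized in figures 10.1.2 and 10.1.3 can also be checked numerically. The minimal sketch below (Python; the sampled n, g combinations are arbitrary examples) evaluates Eq. ( 10.1.12 ) and Eq. ( 10.1.14 ), illustrating both the modest difference between them and the near-1.5σST result obtained at c = 1.8.

```python
import math

def z_shift_typ(c, n, g):
    """Eq. (10.1.12): typical subgroup mean shift expressed in sigma_ST units."""
    return math.sqrt((c ** 2 * (n * g - 1) - g * (n - 1)) / (n * g))

def z_shift(c, n, g):
    """Eq. (10.1.14): the same quantity after forcing Z_shift = 0 at c = 1."""
    return math.sqrt((c ** 2 - 1) * (n * g - 1) / (n * g))

# For c = 1.8 the corrected shift stays close to 1.5 across common n, g choices
for n, g in [(5, 25), (5, 50), (4, 100), (6, 100)]:
    corrected = z_shift(1.8, n, g)
    uncorrected = z_shift_typ(1.8, n, g)
    print(n, g, round(corrected, 3), round(uncorrected - corrected, 3))
```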

In the spirit of establishing a relative but standard mean correction (shift factor), let us consider the general range of conventional sampling practice. Arguably, such a range is given by the combinations formed under the constraint that

4 ≤ n ≤ 6

and

25 ≤ g ≤ 100.

Perhaps the most commonly employed combination is that of n = 5 and g = 50. Under this combination, the total sample size is given as ng = 250. Other practitioners writing on the topic of process capability studies have often recommended such a sample size as a general guideline, particularly when a statistical process control (SPC) chart for variables data is employed to facilitate the study.

10.2 Resolving the shift

When presented with the case c = 1.8 in the context of a common rational sampling strategy (n = 5 and g = 50), we compute ZShiftσST = 1.49σST. Notice that the given value of c corresponds to the worst-case expectation as per the research. As with many engineering conventions, the worst-case condition is most often utilized as a critical threshold. Thus, we have rationally established that δ = 1.5σST can be considered as a general correction that accounts for "typical" momentary disturbances to process centering (but only with regard to µ and not T).

Based on these findings and other discoveries, the range of generalization for the "shift expectation" is given as 0.50 < ZShift < 2.00, retaining a modal condition of ZShift = 1.50. Of course, this assertion is constrained to "normal and typical" sampling strategies involving the practice of rational sub-grouping. It is also generally constrained to those processes exhibiting a short-term capability in the range of 3.0 < ZST < 5.0, with the modal condition of ZST = 4.0. The aforementioned range of short-term capability should be considered most reasonable owing to its consistency with the stated research and conclusions resulting from extensive empirical benchmarking studies.72

Essentially, the argument should not be whether or not a compensatory shift of 1.5σST is valid for each and every CTQ. We know that every CTQ will exhibit its own unique shift value; however, when considering a great many such CTQs, it is a safe bet that the typical shift will be about 1.5σST.

72 The vast majority of benchmarking data gathered by this researcher and practitioner (since 1984) has revealed the process capability associated with a great many products and services to exist in the range of 3.5σ to 4.5σ, with the trailing edges dropping off at 3.0σ and 5.0σ respectively. Obviously, this tends to imply that the typical CTQ of a product or service will exhibit a performance capability of about 4.0σ. This is to be generally expected, given the conventional practice of establishing 25 percent design margins. Of course, 4.0σ is the equivalent form of such a safety margin.


This author would, therefore, suggest that any such debate on this topic should be focused on how to get people using such "research-based" rules of thumb. After all, it is far more rational to evaluate the reproducibility of a design under the assumption of an unfavorably vectored 1.5σST shift (in all of the critical components of a system) than it is to simply set those parts at their respective nominal condition and then perform the evaluation. Obviously, the latter type of analysis is unrealistic and will only reveal the best-case performance condition. At the other extreme, the probability of a "worst-case stack" is virtually zero – even when evaluating designs of relatively low complexity. Therefore, the six sigma practice of imposing a 1.5σST shift on each critical performance opportunity represents a rational method for analyzing the robustness of a design.

10.3 Calculating the minimum

In accord with the one-way ANOVA model, we readily recognize that the between-group mean square can be contrasted to the within-group mean square so as to form a type of signal-to-noise ratio. As theoretically known in the field of mathematical statistics, such a ratio can be evaluated via the F distribution in the form

$$ F = \frac{MS_B}{MS_W}\,. \qquad \text{Eq. ( 10.3.1 )} $$

By standardizing to the case NID(0,1), it will be recognized that

$$ F = MS_B \qquad \text{Eq. ( 10.3.2 )} $$

since MSW = 1.0. In expanded form, Eq. (10.3.2) is given as

$$ F = \frac{SS_B}{g-1} \qquad \text{Eq. ( 10.3.3 )} $$

or

$$ SS_B = F(g-1)\,. \qquad \text{Eq. ( 10.3.4 )} $$

Expanding the between-group sums-of-squares yields

$$ n\sum_{j=1}^{g}\left(\bar{X}_j-\bar{\bar{X}}\right)^2 = F(g-1) \qquad \text{Eq. ( 10.3.5 )} $$

and dividing both sides by ng gives

$$ \frac{\sum_{j=1}^{g}\left(\bar{X}_j-\bar{\bar{X}}\right)^2}{g} = \frac{F(g-1)}{ng}\,. \qquad \text{Eq. ( 10.3.6 )} $$

After correcting for degrees of freedom, we have

$$ \frac{\sum_{j=1}^{g}\left(\bar{X}_j-\bar{\bar{X}}\right)^2}{g} - \frac{g-1}{ng} = \frac{F(g-1)}{ng} - \frac{g-1}{ng}\,. \qquad \text{Eq. ( 10.3.7 )} $$

By standardizing and some simple rearrangement, we are left with the quantity

$$ Z_{Shift} = \sqrt{\frac{(F-1)(g-1)}{ng}}\,. \qquad \text{Eq. ( 10.3.8 )} $$

Drawing upon the merits of our previous discussion, we may now state the equality

$$ (c^2-1)(ng-1) = (F-1)(g-1) \qquad \text{Eq. ( 10.3.9 )} $$

from which we obtain

$$ c = \sqrt{1 + \frac{(F-1)(g-1)}{ng-1}}\,. \qquad \text{Eq. ( 10.3.10 )} $$

Thus, we may compute the minimum expected ZShift or corresponding value of c at the critical threshold of H0 for any combination of α, n and g. Of course, such estimation is predicated on application of the one-way ANOVA model. As may be apparent, the equations presented in this portion of the book have many implications for the practice of six sigma.
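For illustration, the minimal sketch below (Python, using scipy's F quantile function; the α, n, and g values are arbitrary examples) computes the critical F at the threshold of H0 and the corresponding minimum detectable ZShift and c per Eq. ( 10.3.8 ) and Eq. ( 10.3.10 ).

```python
import math
from scipy.stats import f

def min_detectable_shift(alpha, n, g):
    """Smallest Z_shift (in sigma_ST units) that just clears the critical F
    threshold of the one-way ANOVA test, per Eqs. (10.3.8) and (10.3.10)."""
    F_crit = f.ppf(1.0 - alpha, dfn=g - 1, dfd=g * (n - 1))
    z_shift = math.sqrt((F_crit - 1.0) * (g - 1.0) / (n * g))
    c_min = math.sqrt(1.0 + (F_crit - 1.0) * (g - 1.0) / (n * g - 1.0))
    return F_crit, z_shift, c_min

print(min_detectable_shift(alpha=0.05, n=5, g=50))
```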

10.4 Connecting the capability

We shall now turn our focus to the various indices of capability by building upon the analytical foundation constructed thus far. The informed reader will recall the basic Z transformation as

$$ Z = \frac{X - \mu}{\sigma} \qquad \text{Eq. ( 10.4.1 )} $$


where µ is the population mean, σ is the population standard deviation, and X is a random normal measurement obtained from the corresponding population. In terms of Z, we may now estimate the short-term performance of a process as

$$ Z_{ST} = \frac{T - SL}{\hat{\sigma}_{ST}} \qquad \text{Eq. ( 10.4.2 )} $$

where T is the nominal or target specification, SL is the specification limit of interest, and σ̂ST is an estimator of the short-term population standard deviation σST. Notice that Eq. (10.4.2) assumes µ = T. However, when the mean does not coincide with the target value, we may calculate

$$ Z_1 = \frac{\bar{X} - SL}{\hat{\sigma}_{ST}} \qquad \text{Eq. ( 10.4.3 )} $$

where X̄ is the grand mean of the sample data, or the estimator of µ.

Due to dynamic perturbations of a transient or temporal nature, we often witness an inflation of the initial short-term standard deviation that, over many cycles of a process, will degrade the value of Z. To compensate for this phenomenon, we calculate the quantity

$$ Z_2 = \frac{T - SL}{\sigma_{LT}} = \frac{T - SL}{c\,\sigma_{ST}}\,. \qquad \text{Eq. ( 10.4.4 )} $$

When considering the simultaneous occurrence of static and dynamic sources of error, the resultant Z value is expressed as


$$ Z_3 = \frac{\bar{X} - SL}{\hat{\sigma}_{LT}} = \frac{\bar{X} - SL}{c\,\hat{\sigma}_{ST}}\,. \qquad \text{Eq. ( 10.4.5 )} $$

By convention, we recognize that unity is often defined as existing between the three sigma limits of a performance distribution. Given this, we may describe the short-term process capability ratio as

$$ C_P = \frac{1}{3}\cdot\frac{T - SL}{\hat{\sigma}_{ST}} = \frac{Z_{ST}}{3}\,. \qquad \text{Eq. ( 10.4.6 )} $$

In accordance with existing literature, we may account for the effect of a static mean offset by computing the ratio

$$ k_1 = \frac{T - \bar{X}}{T - SL} \qquad \text{Eq. ( 10.4.7 )} $$

which may be restated in the form

$$ 1 - k_1 = \frac{\bar{X} - SL}{T - SL}\,. \qquad \text{Eq. ( 10.4.8 )} $$

Finally, the cross multiplication of Eq. (10.4.6) and Eq. (10.4.8) reveals

$$ C_{PK1} = C_P\,(1 - k_1) = \frac{T - SL}{3\,\hat{\sigma}_{ST}}\cdot\frac{\bar{X} - SL}{T - SL} = \frac{1}{3}\cdot\frac{\bar{X} - SL}{\hat{\sigma}_{ST}} = \frac{Z_1}{3}\,. \qquad \text{Eq. ( 10.4.9 )} $$

Hence, CPK1 may be expressed as an equivalent Z value. As a consequence, we may calculate the quantity


$$ \frac{Z_1}{3} = \frac{1}{3}\,Z_{ST}(1 - k_1) = \frac{Z_{ST}(1 - k_1)}{3} \qquad \text{Eq. ( 10.4.10 )} $$

or simply

$$ \frac{Z_1}{Z_{ST}} = 1 - k_1\,. \qquad \text{Eq. ( 10.4.11 )} $$

By analogy, we write the equation

$$ \frac{Z_2}{Z_{ST}} = 1 - k_2 \qquad \text{Eq. ( 10.4.12 )} $$

and

$$ \frac{Z_3}{Z_{ST}} = 1 - k_3\,. \qquad \text{Eq. ( 10.4.13 )} $$

By the manipulation of Eq. (10.4.4) we discover that

$$ \frac{\hat{\sigma}_{LT}}{\hat{\sigma}_{ST}} = c = \frac{1}{1 - k_2} \qquad \text{Eq. ( 10.4.14 )} $$

from which we observe

$$ k_2 = 1 - \frac{1}{c}\,. \qquad \text{Eq. ( 10.4.15 )} $$

By substitution, we recognize that

$$ \frac{1 - k_1}{c} = 1 - k_3 \qquad \text{Eq. ( 10.4.16 )} $$

which may be rearranged to reveal

$$ k_3 = 1 - \frac{1 - k_1}{c}\,. \qquad \text{Eq. ( 10.4.17 )} $$

Thus, from the latter arguments, it follows that

$$ 1 - k_3 = (1 - k_1)\cdot\frac{1}{c} = (1 - k_1)(1 - k_2) \qquad \text{Eq. ( 10.4.18 )} $$

from which we obtain

$$ k_3 = k_1 + k_2 - k_1 k_2\,. \qquad \text{Eq. ( 10.4.19 )} $$

Based on this, we may conclude that the joint occurrence of static and dynamic error is additive by nature, but their interaction (the product term k1k2) must be accounted for. If the product is zero, Eq. (10.4.19) reduces to k3 = k1 + k2.

In summary, we can express the long-term Cpk's as

$$ C_{pk2} = C_p(1 - k_2) = \frac{C_p}{c} \qquad \text{Eq. ( 10.4.20 )} $$

and

$$ C_{pk3} = C_p(1 - k_3) = C_p\,\frac{1 - k_1}{c} = \frac{C_{pk1}}{c}\,. \qquad \text{Eq. ( 10.4.21 )} $$


11.0 Harnessing the Chaos

11.1 Setting the course

During the last several decades, we have experienced an enormous explosion of technology. As a result, a large proportion of today's product designers have been liberated from many types of material and component related constraints. Because of this technological liberation, we see many relatively familiar products that operate faster, have more features, occupy less space, and in some instances, even cost less than their predecessors.

The overriding design implication of the so-called "technology boom" is quite clear -- for every incremental increase in design complexity and sophistication there must be a corresponding increase in producibility; otherwise, the manufacturer will not remain competitive. In many factories, producibility has become a major business issue. It is often the key to economic success or catastrophe.

The purpose of this portion of the book is to provide the reader with a novel method to enhance the study of complex designs using a dynamic simulation method based upon chaos theory and fractal geometry. During the course of discussion, it will be demonstrated how such an approach can significantly enhance our engineering understanding of manufacturing behavior and its consequential impact on performance projections.

11.2 Framing the approach

Classical product design often employs Monte Carlo simulation to study the resultant distribution and stochastic properties of certain functions of several variables. This is accomplished by specifying a cumulative distribution function (c.d.f.) for each of the c independent variables and then randomly selecting r members from each c.d.f., thus forming a matrix, X, consisting of r rows and c columns. The X matrix is given by

$$ X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1c} \\ x_{21} & x_{22} & \cdots & x_{2c} \\ \vdots & \vdots & & \vdots \\ x_{r1} & x_{r2} & \cdots & x_{rc} \end{bmatrix} $$


and the response column c+1 would then be constructed subsequent to the application of an appropriate transfer function to each of the r rows.

To illustrate the aforementioned method, let us assume the case NID(µ,σ²) for each of the c independent variables. In this case, we might be interested in studying the resultant distributional form, D, of their sum. In other words, D would be obtained by summing the elements in each row of X. The response vector, V, would then be located in column c+1. Following this, we would compose a histogram of V and compute its germane indices. In turn, the D vector and its associated indices would serve as the basis for drawing various conclusions. In some instances, it may be desirable to study certain unique segments of D. One method for achieving this involves a stratification of the c.d.f. This is accomplished by dividing the c.d.f. into k intervals of equal probability. For example, if k = 5, the c.d.f. intervals are from 0–.2, ... , .8–1, respectively. From each of the k intervals, a random selection is made. Here again, D and its associated properties would be used to make certain decisions.
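A minimal sketch of the classical (static) procedure is given below in Python; the transfer function (a simple row sum), the NID parameters, and the row count are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(42)

def static_monte_carlo(mus, sigmas, transfer, r=2500):
    """Classical static Monte Carlo: draw r rows for the c independent
    NID(mu, sigma^2) variables, apply the transfer function row by row,
    and return the response vector (column c + 1 of the X matrix)."""
    c = len(mus)
    X = rng.normal(loc=mus, scale=sigmas, size=(r, c))   # r x c matrix of inputs
    V = np.apply_along_axis(transfer, 1, X)              # response column
    return V

# Example: the response is the row sum of three NID(0, 1) variables
V = static_monte_carlo([0, 0, 0], [1, 1, 1], transfer=np.sum)
print(V.mean(), V.std(ddof=1))   # approximately 0 and sqrt(3)
```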

11.3 Limiting the history

The classical form of the Monte Carlo method is a very powerful and commonly used simulation tool; however, in the instance of product design, D frequently suffers a major limitation -- the approach assumes a static universe for each selected c.d.f. In most industrial applications, the µ and σ² associated with any given c.d.f. are not necessarily immobile. In fact, it is almost an idealization to make such an assumption. For example, such nonrandom phenomena as tool wear, supplier selection, personnel differences, equipment calibration, etc. will synergistically contribute to nonrandom parametric perturbations. For many varieties of nonlinear transfer functions, the resulting pattern of D and V will appear random. In fact, many analytical methods would substantiate such a qualitative assertion.

As may be apparent, the previously mentioned limitation can adversely influence the decision-making process during the course of product configuration (design). Therefore, it is here postulated that to account for these seemingly random perturbations, it becomes mandatory to sample from multiple distribution functions corresponding to a set of dynamic population parameters, S, where

$$ S = \left\{\,\mu_{ij},\ \sigma_{ij}^{2}\,\right\}, \quad i = 1,\ldots,r; \;\; j = 1,\ldots,c\,. \qquad \text{Eq. ( 11.3.1 )} $$

Hence, the resulting response distribution, D', will more realistically reflect the "true" manufacturing state-of-affairs. Intrinsically, the paradigm of S follows certain natural rules of mathematical order. To explore such rules and the consequential impact of S on D, we shall undertake a study of chaos theory and fractal geometry.

11.4 Understanding the chaos

To begin our discussion on the use of chaos theory and fractal geometry, let us qualitatively define what is meant by the term "chaos." According to Gleick (1987), the phenomenon of chaos may be described by its unique properties. In a mathematical sense, chaos is " ... the complicated, aperiodic, attracting orbits of certain (usually low-dimensional) dynamical systems." It is also described by " ... the irregular unpredictable behavior of deterministic, nonlinear dynamical systems." Yet another description is " ... dynamics with positive, but finite, metric entropy ... the translation from math-ease: behavior that produces information (amplifies small uncertainties), but is not utterly unpredictable."

Obviously, the latter descriptions may prove somewhat bewildering to the uninformed reader, to say the least. So that we may better understand the unique properties associated with the chaos phenomenon, let us consider a simple example. Suppose that we have some experimental space given by Q, where Q is a quadrilateral. We shall label the northwest corner as A, the southwest corner as B, the southeast corner as C, and the northeast corner as D.

With the task of labeling accomplished, we must locate a randomly selected point within the confines of Q. This is a starting point and shall be referred to as τ0. Next, we shall select a random number, r, between 0 and 1. Based on the value of r, we follow one of three simple rules:

Rule 1: If r <= .333, then move one-half the distance to vertex A


Rule 2: If r >= .667, then move one-half the distance to vertex B

Rule 3: If .333 < r < .667, then move one-half the distance to vertex C

The selection process would be iterated a substantial number of times. The astonishing discovery is that this process of iteration reveals a distinct pattern often referred to as a fractal shape or "mosaic." For the given rule set, the resulting mosaic has been displayed in figure 11.4.1.

Figure 11.4.1

Fractal Mosaic Created by Successive Iteration of a Rule Set

Such a fractal phenomenon has been observed and studied by many mathematicians, perhaps most notably by Mandelbrot (1982). In particular, Barnsley (1979) was able to make the following conclusion: "fractal shapes, though properly viewed as the outcome of a deterministic process, have a second equally valid existence as the limit of a random process." The implications of this conclusion are quite profound for product design simulation and process control.
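The three-rule game is easy to reproduce. The minimal sketch below (Python) assumes the unit placement of the vertices described in section 11.5 (A one unit above the origin, B at the origin, C one unit to the right) and iterates Rules 1 through 3.

```python
import numpy as np

rng = np.random.default_rng(7)

# Assumed vertex coordinates: A northwest, B southwest (origin), C southeast
A, B, C = np.array([0.0, 1.0]), np.array([0.0, 0.0]), np.array([1.0, 0.0])

def chaos_game(iterations=20000):
    """Iterate the three-rule chaos game: from the current point, move
    one-half the distance toward A, B, or C according to a uniform r."""
    tau = rng.random(2)                      # tau_0: arbitrary starting point in Q
    points = np.empty((iterations, 2))
    for i in range(iterations):
        r = rng.random()
        if r <= 0.333:
            target = A                       # Rule 1
        elif r >= 0.667:
            target = B                       # Rule 2
        else:
            target = C                       # Rule 3
        tau = tau + 0.5 * (target - tau)     # move one-half the distance
        points[i] = tau
    return points

pts = chaos_game()
print(pts[:3])   # plotting pts[:, 0] against pts[:, 1] reveals the fractal mosaic
```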


11.5 Evolving the heuristics

In order to evolve the aforementioned phenomenon into a practical design simulation methodology, we must set forth a generalization of the mathematics that underlie the fractal mosaic displayed in figure 11.4.1.

So as to translate the concept of rule-based geometric reduction, we will incorporate the Cartesian coordinates (x,y) in a Euclidean plane, ψ. In this instance, the boundary constraints may be given by α, β, and γ. Hence, we will now say that ψ = f(α,β,γ). Without loss of generality, we shall assume that the vertices α and γ are 1 unit removed from the origin, β = (0,0). Recognize that the resulting x,y intersect is denoted as τi. Therefore, we may say that τi = f(x,y) at the ith generation. Furthermore, the reader must remain cognizant of the fact that τ0 is initially established as an arbitrary location within ψ.

A simple algebraic manipulation of the arbitrary rules pertaining to the chaos game yields the three-point fractal generator. In general form, the generator may be described by

$$ x_{t+1} = \phi x_t + \theta_1(\phi - x_t) \qquad \text{Eq. ( 11.5.1 )} $$

and

$$ y_{t+1} = \phi y_t + \theta_2(\phi - y_t) \qquad \text{Eq. ( 11.5.2 )} $$

given a decision such that θi = 0 or 1. Naturally, the decisions are based on a random number, r, such that if

r ≤ ξ, then θ1 = θ2 = 0    Rule (11.5.1)

or, if

r ≥ 1 − ξ, then θ1 = 1, θ2 = 0    Rule (11.5.2)


otherwise,

θ1 = 0, θ2 = 1    Rule (11.5.3)

where r is a uniform random number between the limits 0 ≤ r ≤ 1, φ is a constant which is always less than 1, θi assumes the value of 0 or 1, and ξ is a constant such that 0 < ξ ≤ .5. For the sake of reading ease, we shall employ the notation GRS to represent any given set of rules.73

11.6 Timing the geometry

In order to make use of the fractal described in figure 11.4.1, we must project τi ... τn into a domain representative of time, T. To perform this task, we will construct an axis (Z) such that it is perpendicular to the defining axes of ψ. In two-dimensional space, the latter condition can be portrayed as a straight line (λ) at an angle of π/4 radians with the X axis. The projection of τi on λ is given by

$$ \omega_i = \tau_i \sin\theta \qquad \text{Eq. ( 11.6.1 )} $$

where θ is defined as

$$ \theta = \alpha - \frac{\pi}{4} \qquad \text{Eq. ( 11.6.2 )} $$

and

$$ \alpha = \tan^{-1}(\tau_i)\,. \qquad \text{Eq. ( 11.6.3 )} $$

73 The reader should recognize that the mosaic displayed in figure 11.4.1 was generated by letting φ = .5, ξ = .333 and then plotting the resulting observations τi ... τr.

Notice that the resultant projections of ωi on λ are not restricted to positive values, owing to the fact that θ can also be negative.

We may now generate the set {ω1, ω2, ..., ωn}, where the indices 1, ..., n correspond to progressive equally spaced points in time (t1 ... tn). Under this condition, ω1 would occur at time t1, ω2 at time t2, and so on. Thus, we are able to project τi ... τn into the domain T, as displayed in figure 11.6.1.

Figure 11.6.1
Transformation of the Fractal Mosaic into a Time Series (the mosaic of figure 11.4.1, with vertices α = (0,1), β = (0,0), γ = (1,0), projected onto the time axis; the projections ωi span roughly −.3535 to +.3535).
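A minimal sketch of the projection (Python; the three sample points are arbitrary) shows how Eqs. ( 11.6.1 ) through ( 11.6.3 ) reduce to the signed offset of a point from the line λ.

```python
import numpy as np

def project_to_time_axis(points):
    """Project each mosaic point tau_i = (x_i, y_i) onto the line lambda drawn
    at pi/4 to the X axis, per Eqs. (11.6.1)-(11.6.3): omega_i = |tau_i|*sin(theta),
    theta = alpha - pi/4, alpha = arctan(y_i / x_i). Algebraically this reduces
    to the signed offset (y_i - x_i) / sqrt(2)."""
    x, y = points[:, 0], points[:, 1]
    alpha = np.arctan2(y, x)
    theta = alpha - np.pi / 4.0
    return np.hypot(x, y) * np.sin(theta)

# Arbitrary sample points inside the unit quadrilateral
sample = np.array([[0.25, 0.25], [0.10, 0.60], [0.60, 0.10]])
print(project_to_time_axis(sample))   # [0.0, +0.3536, -0.3536]
```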


Based on Eq. (11.5.1) and Eq. (11.5.2), it can be demonstrated that any xi, yi is derived from xi−1, yi−1. This phenomenon can be expressed in the form of an autoregressive model given as

$$ z_t = \phi z_{t-1} \qquad \text{Eq. ( 11.6.4 )} $$

where z is an observation at time t and φ is a constant that ranges between −1 and 1. The reader will recognize that time is invariant under the τ and ω transformation schemes. As a result, the pattern in figure 11.6.1 will be the deterministic portion of the general AR(1) model.

From the preceding arguments, the reader may have gleaned the close association between the previously defined fractal rule set and a standard time series model of an autoregressive nature. In other words, the association is related to the deterministic aspects of the model. To demonstrate the aforementioned association, we shall consider the x coordinate of the Cartesian system. In this case, we will subtract xt from xt+1. This operation will yield

$$ x_{t+1} = (1 + \phi)x_t - \phi x_{t-1} + \theta_1(x_{t-1} - x_t)\,. \qquad \text{Eq. ( 11.6.5 )} $$

When θ1 = 0, we discover

$$ x_{t+1} = (1 + \phi)x_t - \phi x_{t-1} \qquad \text{Eq. ( 11.6.6 )} $$

and for the case θ1 = 1, we find that

$$ x_{t+1} = \phi x_t + (1 - \phi)x_{t-1}\,. \qquad \text{Eq. ( 11.6.7 )} $$

Naturally, the y coordinate of the Cartesian system would reflect the same mathematical constructs. From either perspective, Eq. (11.6.6) and Eq. (11.6.7) will be recognized as the deterministic portion of an AR(2) time series model. Of general interest, the expression of a stationary autoregressive model of order p is given by

$$ X_t = \sum_{i=1}^{p} \phi_i X_{t-i} + a_t \qquad \text{Eq. ( 11.6.8 )} $$

where at is the shock at time t. Notice that the shock is also referred to as "white noise." It is imperative to understand that the stochastic nature of Eq. (11.6.8) manifests itself in the distribution of at. This particular distribution is described by g(at) = N(0, σat). Of course, it has been well established that a great many industrial processes display a time dependency of some form. It is interesting that the time series phenomenon is also displayed by the chaotic pattern described in this book. The reader is directed to Box and Jenkins (1976) for a more thorough discussion on the nature of autoregressive models.

11.7 Exemplifying the fractal

Now that we have generated the set {ω1, ω2, ..., ωn}, each ωi may be employed as a new universe mean (µi). This is done for purposes of enhancing the Monte Carlo simulation; e.g., it provides a mechanism for introducing dynamic perturbations in process centering during the course of simulation. The same logic and methodology may be applied to the distribution variance (σ2). However, for the sake of illustration, we shall constrain the ensuing example by only perturbing µ.

Let us suppose that we are concerned with the likelihood of assembly pertaining to a certain product design, say a widget such as displayed in figure 11.7.1.


Figure 11.7.1
Illustration of the Widget Product Example (Parts 1–4 stack inside the Part 5 envelope; P1 ... P4 = 1.240 in. ± .003 in.; envelope = 4.976 in. ± .003 in.).

In this case, we are concerned with predicting the probability of assembly; i.e., the likelihood that P1 ... P4 will fit into the envelope (P5), given the specified design tolerance (∆ = .003 in.). Given this, we will assume that the process capability for all Pj is Cp = 1.0, where

$$ C_p = \frac{\Delta}{3\sigma}\,, \qquad \text{Eq. ( 11.7.1 )} $$

and ∆ is the design tolerance about T, the nominal or "target" specification.74 Based on the process capability ratio we may estimate the process standard deviation as σ = ∆/3 = .001 in. If we let Xij be the measured length of any given manufactured part, the assembly gap (G) may be computed as

$$ G_i = X_{i5} - \sum_{j=1}^{4} X_{ij}\,. \qquad \text{Eq. ( 11.7.2 )} $$

74 Recognize that such information is obtained from a process characterization study. The reader is directed to Harry and Lawson (1988) for additional information on this topic.

With the aforementioned ingredients and the assumption Xij ~ NID(µ,σ), as well as the constraint µ = T, we are fully prepared to conduct a static Monte Carlo simulation as outlined in section 11.2. With an additional process parameter, k, and the aforementioned methodology, we are postured to significantly enhance simulation accuracy. For the reader's convenience, we shall define the parameter k as

$$ k = \frac{\mu - T}{\Delta}\,. \qquad \text{Eq. ( 11.7.3 )} $$

As may be apparent, k constitutes the proportion of ∆ consumed by a given amount of offset in µ.

To begin, we must first establish the functional limits for perturbing µ, relative to T, during the course of dynamic Monte Carlo simulation. To do this, we must set α = γ = 1 so that the maximum hypotenuse of ψ is equal to .5. As we shall see, this is done to simplify subsequent calculations. In turn, this leads us to the limiting projection that is given by

$$ \omega_{max} = \frac{1}{2\sqrt{2}}\,. \qquad \text{Eq. ( 11.7.4 )} $$

Since the maximum mean shift is k∆, it can be demonstrated that the scaling factor is

$$ \rho = 2\sqrt{2}\,k\Delta\,. \qquad \text{Eq. ( 11.7.5 )} $$

Thus, the corrected universe mean is given by


µi = T + ρωi. Eq.( 11.7.6 )

Given the previously mentioned product parameters, we shall now evaluate the Monte Carlo outcomes under two different process centering conditions; namely, when k=.00 and k=.75. We shall say that the latter condition was established on the basis of empirical evidence resulting from a process characterization study. In this case, the simulation was conducted across N = 2,500 iterations for both values of k. Recognize that a direct comparison is possible since both simulations were conducted using the same seed (originating number). The overall results of the simulations are displayed in figure 11.7.2.

Figure 11.7.2

Effects of Dynamic Mean Perturbations on the Widget Monte Carlo Simulation (Note: ordinates not to scale)


From this figure, the inflationary effect of chaotic perturbations in µ is quite apparent. It is also evident that the introduction of mean shifts during the course of simulation expanded the variances over many sampling intervals. Obviously, this constitutes a more realistic picture of expected performance. A more detailed comparative view of the assembly gap conditions is given in figure 11.7.3. This figure reveals that the critical Z value changed from 7.1σ to 4.5σ as a result of the dynamic simulation. Tables 11.7.1 and 11.7.2 present the summary statistics related to the simulation.
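The inflation reported in the tables can be approximated with the minimal sketch below (Python). The uniform stand-in drift series, the assignment of an independent drift stream to each of the five processes, and the random seed are assumptions made here for illustration; a faithful run would feed in the chaos-game projections ωi.

```python
import numpy as np

rng = np.random.default_rng(2024)

T_PART, T_ENV, DELTA = 1.240, 4.976, 0.003    # widget nominals and tolerance (in.)
SIGMA = DELTA / 3.0                           # Cp = 1.0 implies sigma = .001 in.

def widget_gap_simulation(k, omegas):
    """Dynamic Monte Carlo of the widget assembly gap, Eq. (11.7.2).
    omegas: array of shape (iterations, 5), one drift series per process.
    Each column perturbs its process mean per Eqs. (11.7.5)-(11.7.6):
    mu = T + rho * omega, with rho = 2*sqrt(2)*k*DELTA."""
    rho = 2.0 * np.sqrt(2.0) * k * DELTA
    nominals = np.array([T_PART] * 4 + [T_ENV])
    mus = nominals + rho * omegas                    # drifting means, one row per cycle
    draws = rng.normal(mus, SIGMA)                   # one measurement per part per cycle
    return draws[:, 4] - draws[:, :4].sum(axis=1)    # G = X_5 - (X_1 + ... + X_4)

omegas = rng.uniform(-0.3535, 0.3535, size=(2500, 5))   # stand-in for the fractal series
print(widget_gap_simulation(k=0.00, omegas=omegas).std(ddof=1))   # about .0022 (static)
print(widget_gap_simulation(k=0.75, omegas=omegas).std(ddof=1))   # inflated under drift
```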

Table 11.7.1. Summary Statistics for the Widget Monte Carlo Simulation Under the Condition k = .00

Index       Part 1     Part 2     Part 3     Part 4     Part 5     Gap
Frequency   2500       2500       2500       2500       2500       2500
Mean        -1.24000   -1.24000   -1.24000   -1.24000   4.97600    0.01604
Median      -1.24000   -1.24000   -1.24000   -1.23990   4.97600    0.01605
Std Dev     0.00101    0.00100    0.00099    0.00101    0.00102    0.00226
Range       0.00761    0.00649    0.00728    0.00665    0.00726    0.01662
Variance    0.00000    0.00000    0.00000    0.00000    0.00000    0.00001
Minimum     -1.24380   -1.24300   -1.24340   -1.24330   4.97250    0.00682
Maximum     -1.23610   -1.23660   -1.23610   -1.23670   4.97980    0.02344
Skewness    0.06587    0.03926    -0.04197   -0.01383   -0.09580   -0.03210
Kurtosis    0.10720    -0.01050   0.02323    -0.14992   0.23387    -0.08135

Table 11.7.2. Summary Statistics for the Widget Monte Carlo Simulation Under the Condition k = .75

Index       Part 1     Part 2     Part 3     Part 4     Part 5     Gap
Frequency   2500       2500       2500       2500       2500       2500
Mean        -1.24000   -1.24000   -1.23990   -1.23990   4.97600    0.01617
Median      -1.24010   -1.24000   -1.24000   -1.23980   4.97610    0.01617
Std Dev     0.00156    0.00154    0.00154    0.00159    0.00157    0.00355
Range       0.00990    0.00992    0.01004    0.00956    0.00951    0.02327
Variance    0.00000    0.00000    0.00000    0.00000    0.00000    0.00001
Minimum     -1.24530   -1.24460   -1.24480   -1.24480   4.97160    0.00456
Maximum     -1.23540   -1.23470   -1.23470   -1.23520   4.98110    0.02783
Skewness    0.02861    -0.02511   -0.01509   -0.15426   -0.05947   -0.02460
Kurtosis    -0.40932   -0.35181   -0.38077   -0.40191   -0.45821   -0.02434


The autocorrelation of the assembly gap is displayed in figures 11.7.4 and 11.7.5 for both values of k. It is interesting to note that the first-order autoregressive model revealed a lack of fit for the case k = .00. For the case k = .75, a highly significant fit was observed, in spite of the fact that the terminal response underwent several unique nonlinear transformations. It is generally believed that this observation supports the assertion that the pattern of simulated means will follow a time series model.

Figure 11.7.3
Effect of Dynamic Mean Perturbations on the Widget Assembly Gap (gap histograms for Cp = 1.0 with k = .00 and k = .75; No Shift: Z = 7.1, Shifted: Z = 4.5).


Figure 11.7.4
Autocorrelation (Lag = 1) of the Widget Assembly Gap Under the Condition k = .00 (fitted line: y = .003585x + .015979, r² = .000013).

Figure 11.7.5
Autocorrelation (Lag = 1) of the Widget Assembly Gap Under the Condition k = .75 (fitted line: y = .485063x + .008325, r² = .235286).


11.8 Synthesizing the journey

As discussed in this portion of the book, the study of complex designs can be significantly enhanced using dynamic Monte Carlo simulation. In particular, it was concluded that the use of chaos theory and fractal geometry has the potential to shed light on the structure of engineering problems and manufacturing behavior. It is also generally believed the suggested approach has the potential to constitute a new paradigm in engineering simulation methodology.

As demonstrated, the approach consists of projecting the intersect of k sets of Cartesian coordinates (resulting from a fractal rule set) into the time domain. The resulting pattern emulates the behavior of a dynamic process mean. Interestingly, it was noted that such behavior may be described by an autoregressive model. Once the projections are made, the means are used as a basis for dynamic Monte Carlo simulation. After application of the transfer function, the resulting autocorrelated response vector is formed into a histogram for subsequent study.

The net effect of such dynamic simulation results in an expanded variance when compared to that of the classical Monte Carlo method. From this perspective, the suggested methodology provides a better basis for the a priori study of producibility during the product design cycle.

12.0 Concluding the Discussion

Where the theory and practice of six sigma is concerned, there has been much debate since its inception in 1984. Central to this debate is the idea of reproducibility, often discussed in the form of process capability – a contrast of the actual operating bandwidth of a response characteristic to some established, theoretical or expected performance bandwidth. It is widely recognized among quality professionals that there exist many types of performance metrics to report on process capability, and that all such valid measures are interrelated by a web of well-established statistics and application practices.

In order to better understand the pragmatic logic of six sigma, we have underscored the key ideas, tenets and statistical concepts that form its core. Specifically, this book has presented and interrogated the theoretical underpinnings, analytical rationale and supporting practices that facilitate the assessment of configuration reproducibility, from both a design and a process point of view. We have also set forth the arguments necessary to develop and support a technically sound design qualification procedure. Throughout the related discussion, we have established that the valid assessment of process capability is highly dependent upon a solid understanding of a) the inherent nature of variation, b) the idea of "rational" sampling, c) analysis of variance, and d) basic control chart theory – just to name a few key concepts, tools and practices.

Of course, many of these concepts underlie the field often called "quality science" and should be relatively familiar to most practitioners of six sigma. However, embedded within the discussion of such ideas, we have set forth and explored several unique twists and turns that depart from conventional theory and practice. To this end, we considered conventional wisdom, then constructed the unconventional arguments that differentiate the practice of six sigma from other well-known (but less effective) initiatives.

At this point in time, the world generally acknowledges that six sigma has many practical applications and economic benefits, some of which are inclusive of but not at all limited to a) benchmarking, b) parameter characterization, c) parameter optimization, d) system design, e) detail design, f) reliability analysis, g) product simulation, and h) process simulation. The ideas, methods and practices set forth in this book can greatly extend the reach of such applications by providing the bedrock upon which a network of unique knowledge can be built.

In turn, new knowledge spawns new insights, which foster new questions. Naturally, the process of investigation drives the discovery of answers. As a consequence, ambiguity diminishes and new direction becomes clear. Only with clear direction can people be mobilized toward a super-ordinate goal: six sigma. Thus, the intellectual empowerment of this goal represents the ultimate aim of six sigma.


Appendix A: Guidelines for the Mean Shift

Guideline 1: If an opportunity-level metric is computed on the basis of discrete data gathered over many cycles or time intervals, the equivalent Z transform should be regarded as a long-term measure of performance. If we seek to forecast short-term performance (Z_ST), we must add a shift factor (Z_Shift) to Z_LT so as to remove the time-related sources of error that tend to degrade process capability. Recognize that the actual value of Z_Shift is seldom known in practice when the measurements are discrete (pass-fail) in nature. Therefore, it may be necessary to apply the accepted convention and set Z_Shift at 1.50. As a consequence of this linear transformation, the resultant Z value is merely a projection (approximation) of short-term performance. Thus, we are able to approximate the effect of temporal influences (i.e., normal process centering errors) and remove this influence from the analysis via the transform Z_ST = Z_LT + Z_Shift.
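A minimal numerical sketch of this guideline, in Python with scipy (not part of the original text; the DPMO figure is hypothetical):

    # Sketch of Guideline 1: a long-term, discrete-data metric (DPMO) is
    # converted to Z_LT, then the 1.5 sigma convention is added to project
    # short-term capability.
    from scipy.stats import norm

    dpmo_lt = 3467.0                    # observed long-term defects per million (hypothetical)
    z_lt = norm.ppf(1 - dpmo_lt / 1e6)  # equivalent long-term Z
    z_shift = 1.5                       # accepted convention when the true shift is unknown
    z_st = z_lt + z_shift               # projected short-term capability

    print(f"Z_LT = {z_lt:.2f}  projected Z_ST = {z_st:.2f}")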

Guideline 2: If a metric is computed on the basis of continuous data gathered over a very limited number of cycles or time intervals, the resultant Z value should be regarded as a short-term measure of performance. Naturally, the short-term metric Z_ST must be converted to a probability by way of a table of areas under the normal curve, or any acceptable computational device. If we seek to forecast long-term performance, we must subtract Z_Shift from Z_ST so as to approximate the long-term capability. Recognize that the actual value of Z_Shift is seldom known in practice. Therefore, it may be necessary to apply the accepted convention and set Z_Shift = 1.50. As a consequence of this linear transformation, the resulting Z value is a projection of long-term performance. Thus, we are able to artificially induce the effect of temporal influences into the analysis by way of Z_LT = Z_ST - Z_Shift.
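A corresponding sketch of this guideline (again in Python with scipy, not part of the original text; the study results are hypothetical):

    # Sketch of Guideline 2: a short-term Z estimated from continuous data is
    # de-rated by the 1.5 sigma convention to forecast long-term performance,
    # then expressed as a probability (here reported as DPMO).
    from scipy.stats import norm

    usl, mean, s_st = 10.0, 7.3, 0.75    # hypothetical short-term study results
    z_st = (usl - mean) / s_st           # short-term Z against the upper limit
    z_lt = z_st - 1.5                    # Z_LT = Z_ST - Z_Shift
    dpmo_forecast = norm.sf(z_lt) * 1e6  # area beyond Z_LT, scaled to ppm

    print(f"Z_ST = {z_st:.2f}  forecast Z_LT = {z_lt:.2f}  forecast DPMO = {dpmo_forecast:.0f}")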

Guideline 3: In general, if the originating data are discrete by nature, the resulting Z transform should be regarded as long-term. The logic of this guideline is simple: a fairly large number of cycles or time intervals is often required to generate enough nonconformities from which to form a relatively stable estimate of Z. Hence, it is reasonable to conclude that both random and nonrandom influences (of a transient nature) are reflected in the data. In this instance, Guideline 1 would be applied.

Guideline 4: In general, if the originating data are continuous by nature and were gathered under the constraint of sequential or random sampling across a very limited number of cycles or time intervals, the resulting Z value should be regarded as short-term. The logic of this guideline is simple: data gathered over a very limited number of cycles or time intervals reflect only random influences (white noise) and, as a consequence, tend to exclude temporal sources of variation.

Guideline 5: Whenever it is desirable to report the corresponding “sigma” of a given performance metric, the short-term Z must be used. For example, let us suppose that we find 6210 ppm defective. In this instance, we must translate 6210 ppm into its corresponding sigma value. Doing so reveals Z_LT = 2.50. Since the originating data were long-term by nature, Guidelines 1 and 3 apply. In this case, Z_ST = Z_LT + Z_Shift = 2.5 + 1.5 = 4.0. Since no other estimate of Z_Shift was available, the convention of 1.5 was employed.
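The arithmetic of this example can be checked with a few lines of Python (scipy assumed; not part of the original text):

    # Check of the worked example: 6210 ppm defective corresponds to a long-term
    # Z of about 2.50; adding the 1.5 convention reports the familiar 4 sigma.
    from scipy.stats import norm

    ppm = 6210.0
    z_lt = norm.ppf(1 - ppm / 1e6)   # approximately 2.50
    sigma_level = z_lt + 1.5         # reported short-term "sigma" level

    print(f"Z_LT = {z_lt:.2f}  reported sigma level = {sigma_level:.1f}")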


References and Bibliography

Barnsley, M. and Demko, S. (1986). Chaotic Dynamics and Fractals. Academic Press, San Diego, California.
Bender, A. (1975). "Statistical Tolerancing as it Relates to Quality Control and the Designer." Automotive Division Newsletter of ASQC.
Briggs, J. and Peat, F. (1989). Turbulent Mirror. Harper and Row, New York, New York.
Ekeland, I. (1988). Mathematics and the Unexpected. University of Chicago Press, Chicago, Illinois.
Evans, D.H. (1974). "Statistical Tolerancing: The State of the Art, Part I: Background," Journal of Quality Technology, 6 (4), pp. 188-195.
Evans, D.H. (1975). "Statistical Tolerancing: The State of the Art, Part II: Methods for Estimating Moments," Journal of Quality Technology, 7 (1), pp. 1-12.
Evans, D.H. (1975). "Statistical Tolerancing: The State of the Art, Part III: Shifts and Drifts," Journal of Quality Technology, 7 (2), pp. 72-76.
Gilson, J. (1951). A New Approach to Engineering Tolerances. Machinery Publishing Co., Ltd., London.
Gleick, J. (1987). Chaos: Making a New Science. Penguin Books, New York, New York.
Grant, E.L. and Leavenworth, R.S. (1972). Statistical Quality Control (4th Edition). McGraw-Hill Book Company, New York.
Harry, M.J. (1986). The Nature of Six Sigma Quality. Motorola University Press, Motorola Inc., Schaumburg, Illinois.
Harry, M.J. and Lawson, R.J. (1988). Six Sigma Producibility Analysis and Process Characterization. Publication Number 6σ-3-03/88. Motorola University Press, Motorola Inc., Schaumburg, Illinois.
Harry, M.J. and Stewart, R. (1988). Six Sigma Mechanical Design Tolerancing. Publication Number 6σ-2-10/88. Motorola University Press, Motorola Inc., Schaumburg, Illinois.
Harry, M.J. and Prins, J. (1991). The Vision of Six Sigma: Mathematical Constructs Related to Process Centering. Publication pending. Motorola University Press, Motorola Inc., Schaumburg, Illinois.
Harry, M.J. and Schroeder, R. (2000). Six Sigma: The Breakthrough Management Strategy Revolutionizing the World's Top Corporations. Doubleday, New York, New York.
Juran, J.M., Gryna, F.M. and Bingham, R.S. (1979). Quality Control Handbook. McGraw-Hill Book Co., New York, New York.
Krasner, S. (1990). The Ubiquity of Chaos. American Association for the Advancement of Science, Washington, D.C.
Mandelbrot, B. (1982). The Fractal Geometry of Nature. W.H. Freeman, San Francisco.
Mood, A. and Graybill, F. (1963). Introduction to the Theory of Statistics (2nd Edition). McGraw-Hill Book Co., New York.
Motorola Inc. (1986). Design for Manufacturability: Eng 123 (Participant Guide). Motorola Training and Education Center, Motorola Inc., Schaumburg, Illinois.
Pearson, E.S. and Hartley, H.O. (1972). Biometrika Tables for Statisticians, Vol. 2. Cambridge University Press, Cambridge.
Shewhart, W.A. (1931). Economic Control of Quality of Manufactured Product. D. Van Nostrand Company, Inc.
