Nonparametrics on Minitab 11.11

Version 1

Stat 313 2000
By Julian Visch & Irene Hudson

Department of Mathematics and Statistics
University of Canterbury

Nonparametrics on Minitab
Based on Tony Davidson’s
A Students’ Introduction to Minitab

created on June 21, 2000
using AMS-LaTeX

Contents

1 Introduction
  1.1 Terminology Used
  1.2 General Advice
    1.2.1 Where to find the “Help Desk”

2 Getting Started
  2.1 Finding the Computer Rooms
  2.2 Getting into the Buildings
  2.3 Finding a Computer

3 Using Minitab
  3.1 Getting into Minitab
  3.2 The Minitab Window
  3.3 The Minitab Environment (as detailed in the help menu)
  3.4 Minitab Commands

4 The Introductory Minitab Session
  4.1 Introduction
  4.2 Reading a Command
  4.3 Information Command
  4.4 Print Command
  4.5 Name Command
  4.6 Describe Command
  4.7 Histogram Command
  4.8 Table Command
  4.9 Count and Tally Commands
  4.10 Copy Command
  4.11 LET Command. Arithmetic in MINITAB
  4.12 Data Entry
  4.13 Saving your data
  4.14 Retrieving Data

5 Nonparametrics
  5.1 Stest
  5.2 Sinterval
  5.3 Wtest
  5.4 Winterval
  5.5 Mann-Whitney
  5.6 Kruskal Wallis
  5.7 Friedman
  5.8 Walsh or Pairwise Averages
  5.9 Pairwise Differences
  5.10 Wslope
  5.11 Rregress

A Students’ Introduction to MINITAB

1 Introduction

This booklet has been compiled to introduce students to using the Minitab package on the University’s undergraduate computer system for statistical computations.

1.1 Terminology Used

<key> means press the key shown in bold inside the < >; for instance, <Enter> means press the Enter key. Text to be typed is shown in bold; for instance, 72<Enter> means type 72 and then press the Enter key. Some commands require a key to be held down while another is pressed; this is shown with ‘+’ between the keys, e.g. <Ctrl>+<z> means hold down the Ctrl key and press z.¹ Sometimes user-supplied input is needed; this will be shown in italics.

Note: While following the instructions, read ahead. This will help your understanding and guard against doing the next step wrongly and getting confused.

1.2 General Advice

It is essential that you understand clearly what each statistical procedure does. Writing down the result of the procedures is one way to achieve this. In future work, you will have to decide for yourself the following:

• What information do I require (from a MINITAB worksheet for a particular purpose)?

• How will I obtain this information? (i.e. what MINITAB commands should be used to obtain the information required?)

At any time, personal help may be obtained from the Help Desk on Level 2 of the Computer Services Centre (phone 6060). Questions to the Help Desk should be of a general nature. For questions specific to this course, see your tutor or contact the Statistics secretary in order to be pointed in the right direction.

¹ Usually in such cases upper or lower case is acceptable.

1.2.1 Where to find the “Help Desk”

2 Getting Started

2.1 Finding the Computer Rooms

There are several clusters of IBM Personal Computers (PCs) available for your use. One cluster, “The Vault”, is located in the basement of the Commerce building and consists of five individual rooms labelled Vault 1-5, each containing 23 PCs and a laser printer. Another cluster, “The Cave”, consists of 48 PCs and is in the Engineering Library Extension Building on the north side of campus. A third cluster, “The Loft”, has PCs and Macintoshes available for individual use at all times and is located on the 5th floor of the Central Library (James Hight). Lastly, “The Crypt” consists of two labs, Crypt1 and Crypt2, and is in the new Maths and Computer Science building. See below for a map showing these locations on campus.

[Campus map. Labelled locations: Information Technology Services, Cave, Crypt, Loft, Math/Comp Building, Vault.]

2.2 Getting into the Buildings

Access to either computer facility is by plastic card, which you should have obtained when you enrolled. If you have not received a card, or the card does not operate both of the doors, see the people at Registry. Please ensure that you have your card and that it works before the commencement of the Computer Familiarisation Class. To enter the building, swipe the card through the thin slot, type in your security number, then press the “in” button. You should then be able to pull open the door and go in.

2.3 Finding a Computer

When entering the James Hight, go in through the main doors opposite Registry and take the lift (or the stairs) up to the fifth floor. In the Cave the computers are on the right, at the end of the short corridor. Vault 1 is in the Commerce building basement, to the left when exiting the lifts. Vaults 2-4 are located along the same side as Vault 1, and Vault 5 is located near Vault 4 but towards the center of the basement. The Loft is upstairs in the James Hight, and Crypt1 is in the Math/Comp building, down the central stairs and then to your left.

3 Using Minitab

3.1 Getting into Minitab

To get into Minitab, click the left mouse button on the Start menu, move the mouse² to Programs, then along to Math & Stats, and then along to Minitab and select it.

3.2 The Minitab Window

At this stage you should be in Minitab, with a screen similar to that shown below.

[Screenshot of the Minitab window, with these annotations:
Input and output are viewed in the Session window.
Data can be entered into the columns of the Data window.
Titles can be inserted along the indicated row only.]

² You do not need to grip the mouse hard to do this.

3.3 The Minitab Environment (as detailed in the help menu)

1. A Worksheet that contains your data. This includes constants, vectors (columns) and matrices.

2. A Data window that shows columns of data as shown on the previous page. In the Data window you can

• enter columns of data into the worksheet

• name, resize, and format columns

• move quickly to different cell locations

• cut, copy, or paste cells to and from the Clipboard

Although the Data window has rows and columns, it is not a spreadsheet like Microsoft Excel or Lotus 1-2-3. In Minitab, cells contain values that you type or generate with commands. Cells do not contain formulas that update based on other cells. For example, if you want column C3 to equal the values in C1 plus the values in C2, you would use the calculator (Calc > Calculator) to generate the values for C3. If you change the values in C1, C3 does not change until you use the calculator again or use some other command to change C3’s contents.

3. Menus to issue commands for statistical analysis, data manipulation, and data transformation. Menu items can directly execute a command, or open a dialog box. These menus lie at the top of your Minitab window. As frequent menu use contributes to R.S.I. or O.O.S. and tends to be slower, we recommend that you use the Session window wherever possible.

4. A Session window, where you can type in your commands and which displays your results, as shown on the previous page.

5. Graph windows for high-resolution graphs. See the help facility (help menu) for details.

6. An Info window that displays a summary of your worksheet. See the help facility (help menu) for details.

7. A History window that lists commands you have used in your session. You can re-execute commands by copying them from the History window and pasting them into the Command Line Editor. See the help facility (help menu) for details.

8. Session commands are alternatives to menu commands that you can type in the

• Session window. To type commands in the Session window, you must first select Enable Command Language from the Editor menu (note: not the Edit menu). To undo this, select Disable Command Language from the Editor menu.

• Command Line Editor, which allows you to quickly edit and re-execute session commands. The Command Line Editor can be found on the Edit menu or can be activated using <Ctrl>+L.

You can intersperse menu commands and session commands throughout your session if you wish.

9. Context-sensitive Help for dialog boxes, Session window commands, and overview information.

10. A complete macro language that lets you automate repetitive tasks, extend Minitab’s functionality, or even design your own session commands. See the help facility (help menu) for details.

3.4 Minitab Commands

1. MINITAB commands may make reference to a column using the number of the column or its name (if it has been given one). If a name is used, it must be put between single quotes in the command,

e.g. histogram 'item'

2. The first four letters of any command are sufficient to identify it, e.g. hist 'item' would have done in the example above.

3. If you make a mistake in typing a command, simply re-type it. This overwrites the previous statement.

4. Some commands have subcommands. To invoke a subcommand, type a semicolon at the end of the main command. This results in the prompt SUBC> appearing; then type in the subcommand. A subcommand must conclude with either a semicolon (if there is another subcommand you wish to issue) or a full stop (if there is not another subcommand).

4 The Introductory Minitab Session

4.1 Introduction

I will assume that you have already logged onto the computer, clicked on the Minitab icon, and have the Minitab window open before you. Now select Enable Command Language from the Editor menu (not the Edit menu), and then select Save Preferences from the Edit menu. This will allow you to enter your commands in the upper half of your Minitab window.

4.2 Reading a Command

You will now need to read in the inventory file. To do this, type

read 'k:parts.dat' c1-c6

Note:

1. There is a single (forward) quote before the K: and after the filename. (The single forward quote is on the same key of the keyboard as the double quote.)

2. K: means that the file is being read from drive K (the class drive; note that the class drive may differ for you).

3. The filename is parts.dat.

4. The file is a data file, hence the extension .dat as part of the filename.

5. All six columns of the file are being read. We could have read any columns we wanted, e.g.

c1 or c1 c3 or c1 c3-c6

You may need to wait while the file is being read. You will be advised on-screen when this has been done, and the Minitab prompt will appear again.

4.3 Information Command

Type info to obtain a summary of your worksheet.

What does the info command give?

4.4 Print Command

To see a portion of your worksheet, type PRINT C1-C6 (or just PRIN C1-C6). Note the prompt at the bottom of the first screen will require a response, Y for yes or N for no. Pressing the Return key has the same effect as Y.

Type PRIN C1 to see column 1 only. Note the difference between printing a single column and printing more than one column.

4.5 Name Command

You can name each column as follows. Type name c1 'partno'. You can use any name instead of partno, of course, but it is clearly advantageous to use a name indicative of what it represents. A name may have up to eight characters, may not start or end with a blank, and cannot contain the forward quote (') or hash (#). Further, the name must be put between single quotes.

Several columns can be named at once, e.g. name c1 ' ' c2 ' '. Once named, a column may be referred to in a Minitab command by its name, but the name must be in single quotes in the command. Now you can name the other columns if you wish.

4.6 Describe Command

1. Type
   desc c4.

What information does the describe command give? You can type help desc to find out the meaning of any of the things appearing with the desc command.

2. Now type
   desc c4;
   by c6.

What is the effect of the subcommand?

4.7 Histogram Command

1. Type hist c2.

2. Type hist c2 c6
In particular, note the heading MIDPOINT and compare the differences between the two histograms.

3. Now type
   hist c2;
   by c6.

What is the effect of the by subcommand?

4. Now try the start and increment subcommands by typing
   HIST C2;
   STAR 15.

What is the effect here?

   HIST C2;
   INCR 10.

What is the effect here?

5. Investigate the subcommands with other values if you wish.

6. The STEM-AND-LEAF command has the subcommands INCREMENT and BY. The BOXPLOT command has the subcommands START, INCREMENT and BY. You may wish to investigate these commands and their subcommands.

4.8 Table Command

1. Type TABLE C5
What does this command give?

2. Type TABLE C5 C6
What does this command give?

3. Type TABLE C6 C5
What does this command give? Compare with the previous case.

4. Type
   TABLE C6 C5;
   ROWP.

What does ROWP stand for? What output is produced?

5. Type
   TABLE C6 C5;
   TOTP.

4.9 Count and Tally Commands

1. Type COUNT C5
What output is produced?

2. Type TALLY C5 C6
What output is produced, i.e. what does the TALLY command do?

3. Some of the subcommands TALLY has are

COUNT
CUMCNT (cumulative count)
PERCENT
CUMPCT (cumulative percent)

Investigate these subcommands to find out what output they give.

4.10 Copy Command

1. Type COPY C6 C7 (the full command is COPY C6 INTO C7)
Type PRIN C1-C7 to see the effect of this. What has happened?

2. Type COPY C1-C4 C7-C10 (or COPY C1-C4 INTO C7-C10)
Describe what happens here. (In particular, note what has now happened to C7, which was created in (1) above.)

3. Now delete the extra columns you have created by typing ERASE C7-C10 (or just ERAS C7-C10). Suppose we wanted the part numbers of those items which are supplied from source 3. We can find this by typing

   COPY C1 C7;
   USE C6=3.

Print C1, C6 and C7 to see the effect of this. Investigate the following to see its effect.

   COPY C1 C7;
   OMIT C6=1:2.

4. Suppose you wanted to know the part numbers of those items which were ordered in month 5. Write the appropriate MINITAB commands to find this, then try the commands.

5. Write the appropriate MINITAB commands to find how many parts cost more than $10.50, then try the commands.

Note: The USE subcommand can be combined with the OMIT subcommand, e.g.

   COPY C5 C8;
   USE C5=0:4;
   OMIT C4=3.

However, two USE subcommands, or two OMIT subcommands, cannot be combined in the same command, as in:

   COPY C5 C8;
   USE C5=0:4;
   USE C4=2.

nor

   COPY C5 C8;
   OMIT C5=3;
   OMIT C4=2.

4.11 LET Command. Arithmetic in MINITAB

1. Suppose the prices in C2 did not include G.S.T., and that we wished to calculate this and show the price including G.S.T. We use the following commands (for simplicity, we have taken the G.S.T. rate to be 10%):

LET C7=0.1*C2 (the values in C2 are multiplied by 0.1, and entered in C7)

LET C8=C2+C7 (the values in C2 and C7 are added, and entered in C8).

Try these, and then print C2, C7 and C8 to see the effect. (The prices in C7 and C8 will not have been rounded to the nearest cent.)

Note that if we simply wanted the price including G.S.T. and didn’t want the G.S.T. itself, we could have used the command LET C8=0.1*C2+C2 or simply LET C8=1.1*C2.

The LET command is used for arithmetic operations, and uses the following where needed.

+ for addition
- for subtraction
* for multiplication
/ for division
** for exponentiation (raising to a power).

2. The LET command may also be used as follows:

LET K1=MEAN(C2)

which will assign the mean price (column 2) to a constant, K1. Typing PRIN K1 will allow you to see the result. Similarly, the MEDIAN or STDEV of a column could be found (they should be assigned to different constants, say K2 and K3). These constants could be used as follows:

LET C9=(C2-K1)/K3.
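For readers who want to check the column arithmetic by hand, the same calculations can be sketched outside Minitab. The following is a small Python illustration only, not Minitab code: the prices are made up for the example, and the variable names simply mirror the column and constant names used above.

```python
from statistics import mean, stdev

# Hypothetical prices standing in for column C2 (not the values in parts.dat).
c2 = [4.50, 10.00, 12.25, 8.75]

c7 = [0.1 * price for price in c2]                 # LET C7=0.1*C2
c8 = [price + gst for price, gst in zip(c2, c7)]   # LET C8=C2+C7

k1 = mean(c2)    # LET K1=MEAN(C2)
k3 = stdev(c2)   # LET K3=STDEV(C2), the sample standard deviation

# LET C9=(C2-K1)/K3: each price minus the mean, divided by the standard deviation.
c9 = [(price - k1) / k3 for price in c2]

print(c8)
print(c9)
```

The values in c9 are the standardised prices: by construction they have mean 0 and standard deviation 1.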

4.12 Data Entry

You can enter data into Minitab either directly into the cells of the Data window or through the Session window.

e.g. 1. Type
   set c1
   1 2 3
   4 5 6
   7 8 9
   end

e.g. 2. Type
   set c2
   4(1 2 3)
   end

What happened to the data in c2?

e.g. 3. Type
   set c3
   (1 2 3)4
   end

What happened to the data in c3?

Enter the following data into c4

3 3 3 3 3 5 5 5 5 5 4 4 4 4 4 2 2 2 2 2 6 6 6 6 6
3 3 3 3 3 5 5 5 5 5 4 4 4 4 4 2 2 2 2 2 6 6 6 6 6
3 3 3 3 3 5 5 5 5 5 4 4 4 4 4 2 2 2 2 2 6 6 6 6 6

What did you type?

4.13 Saving your data

Command Syntax

SAVE [in file in "filename" or K]

After using SAVE, the file will contain all data in the worksheet, all stored constants, matrices, column names, and missing value information. After the file is RETRIEVED, the worksheet, stored constants, matrices, and column names will be exactly as when they were SAVED. You may specify the filename as either the name of the file in double quotes, or a stored text constant.

A SAVED worksheet can be used only by Minitab’s RETRIEVE command. You cannot edit it with an editor or even look at it. Unless you use the subcommand PORTABLE, you can RETRIEVE it only on the same type of computer on which it was SAVED. For most applications, however, SAVED worksheets are the most efficient and convenient way to store data for use in Minitab.

If a file name is not given, the default name MINITAB.MTW is used. If you use RETRIEVE without a file name, this default file is retrieved. The default file is useful for saving temporary copies of your worksheet, throughout your session, as a backup in case you accidentally destroy the worksheet.

The default file extension for SAVE without any subcommands is MTW. By default, if you SAVE "filename" when the file already exists, Minitab asks you whether or not you want to replace the file before proceeding. If you SAVE without a file name or you are in BATCH mode, Minitab automatically replaces the file. You can use the subcommands REPLACE and NOREPLACE to override Minitab’s default behavior.

4.14 Retrieving Data

Command Syntax

RETRIEVE [file in "filename" or K]

Retrieves a saved worksheet from the specified file into the current worksheet. You may specify the filename as either the name of the file in double quotes, or a stored text constant.

Following this command, the worksheet will contain the same numbers, column names, stored constants, and matrices as when the command SAVE was last used to save them all. If there is any data in the current worksheet, RETRIEVE erases that data and replaces it with the specified saved worksheet data. To add data to the current worksheet without replacing it, use READ or INSERT.

If you omit the file name, Minitab looks for a file in your current directory named MINITAB.MTW.

The menu command File > Open Worksheet also opens Minitab saved worksheets and Lotus files (and many other types of files as well). It also provides several useful options not available with RETRIEVE. See File > Open Worksheet for details.

For information on open data sets that come with Minitab, see Retrieving Sample Data Sets.

5 Nonparametrics

5.1 Stest

Command Syntax

STEST sign test [median = K] on C...C
   ALTERNATIVE = K.

Example: Suppose we take a sample of 29 observations from a population with median = 115.

  0  50  56  72  80  80  80  99 101 110
110 110 120 140 144 145 150 180 201 210
220 240 290 309 320 325 400 500 507

First let’s test
Ho: median = 115
H1: median > 115

Statistical Rationale

Let X be the number of observations over 115. Under Ho, X has a binomial distribution with n = 29 and p = 0.5. Here X = 17. The probability of getting 17 or more observations over 115 is 0.2291, which is the p-value for this test.
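The arithmetic behind this p-value can be checked outside Minitab. The following is a minimal Python sketch, not Minitab’s own implementation; the function name sign_test_upper is hypothetical, and dropping observations equal to the hypothesised median is the usual sign-test convention, assumed here (no observation in this sample equals 115).

```python
from math import comb

# The 29 observations from the example in section 5.1.
data = [0, 50, 56, 72, 80, 80, 80, 99, 101, 110,
        110, 110, 120, 140, 144, 145, 150, 180, 201, 210,
        220, 240, 290, 309, 320, 325, 400, 500, 507]

def sign_test_upper(values, median0):
    """Return (x, n, p) for Ho: median = median0 against H1: median > median0."""
    # Assumption: values equal to median0 carry no sign and are dropped.
    kept = [v for v in values if v != median0]
    n = len(kept)
    x = sum(v > median0 for v in kept)   # number of observations above median0
    # Upper tail of Binomial(n, 0.5): P(X >= x)
    p = sum(comb(n, k) for k in range(x, n + 1)) / 2 ** n
    return x, n, p

x, n, p_value = sign_test_upper(data, 115)
print(x, n, round(p_value, 4))   # prints: 17 29 0.2291
```

The printed values agree with X = 17 and the p-value of 0.2291 quoted above.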

To do this analysis using Minitab you will first have to read the data set stest.dat, then type

MTB> stest 115 C1;
SUBC> Alternative 1.

Note: 115 is the median you wish to test, and specifying the alternative to be 1 informs Minitab that the alternative is median > 115. If you want the alternative to be median < 115, replace 1 with -1. For a two-sided test, omit the alternative subcommand.

Exercise 1.
Now suppose we wish to test
Ho: median = 115
H1: median < 115

What is X in this case?

What is the p-value in this case?

Exercise 2.
Now suppose we wish to test
Ho: median = 115
H1: median ≠ 115

What is X in this case?

What is the p-value in this case?

5.2 Sinterval

Command Syntax

SINTERVAL sign CI [K% confidence] on C...C

Calculates a sign confidence interval, separately for each column. You can specify a confidence level on the command line. For example, if you enter the command SINTERVAL .90 C1, Minitab calculates 90% confidence intervals. If you do not specify a confidence level, SINTERVAL gives a 95% confidence interval.

Example: Suppose we take a sample of 29 observations from a population and wish to calculate the sign confidence interval for its median.

  0  50  56  72  80  80  80  99 101 110
110 110 120 140 144 145 150 180 201 210
220 240 290 309 320 325 400 500 507

To do this analysis using Minitab you will first have to read the data set stest.dat, if you haven’t already done so, then type

MTB> sinterval C1.

Statistical Rationale

Minitab calculates three intervals. The first gives the achievable confidence just below K, and the third the achievable confidence just above K. Only rarely can you achieve confidence K using the standard procedure. The middle confidence interval is found by a nonlinear interpolation procedure, and gives an interval with approximate confidence K.

The three confidence intervals are found as follows. Let M be the true, unknown median. Suppose we take a sample of n observations. Let X be the number of observations which are less than M. X has a binomial distribution with parameters n and p = 0.5. To calculate a sign confidence interval, first rank the n observations. The interval that goes from the dth smallest observation to the dth largest observation has confidence 1 - 2P(X < d). The value of d is given under POSITION on the output, for the first and third confidence interval.

The middle confidence interval is found by a nonlinear interpolation formula (denoted by NLI in the POSITION column). This method has the following properties: (a) the actual confidence level is between the confidence levels for the bounding intervals, (b) the interpolation is a very good approximation for a wide variety of symmetric distributions including the normal distribution, the Cauchy distribution, and the uniform distribution, and (c) examples of nonsymmetric distributions studied show fairly reasonable results, always much more accurate than linear interpolation.
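The rule for the first and third intervals can be checked by hand or with a short script. The sketch below is in Python rather than Minitab and is only illustrative: sign_interval is a hypothetical helper, the values of d are supplied by the reader rather than searched for, and the nonlinear interpolation used for the middle interval is not attempted.

```python
from math import comb

# The 29 observations from the example above.
data = [0, 50, 56, 72, 80, 80, 80, 99, 101, 110,
        110, 110, 120, 140, 144, 145, 150, 180, 201, 210,
        220, 240, 290, 309, 320, 325, 400, 500, 507]

def sign_interval(values, d):
    """Interval from the d-th smallest to the d-th largest value, with its confidence."""
    ordered = sorted(values)
    n = len(ordered)
    # P(X < d) for X ~ Binomial(n, 0.5)
    lower_tail = sum(comb(n, k) for k in range(d)) / 2 ** n
    confidence = 1 - 2 * lower_tail
    return ordered[d - 1], ordered[n - d], confidence

for d in (10, 9):
    low, high, conf = sign_interval(data, d)
    print(d, low, high, round(conf, 4))
# prints:
# 10 110 210 0.9386
# 9 101 220 0.9759
```

For these data, d = 10 and d = 9 give the achievable confidences on either side of 95%.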

Exercise 1. What is meant by the three achieved confidences, and how are they calculated?

Exercise 2. What does NLI stand for?

5.3 Wtest

Command Syntax

WTEST Wilcoxon one-sample rank test [of median = K] on C...C
   ALTERNATIVE = K.

Performs a one-sample Wilcoxon signed-rank test of the median. If you do not specify a hypothesized median, WTEST compares the sample median to 0.

Example: First let’s test
Ho: median = 77
H1: median ≠ 77

With C1: 77 88 85 74 75 62 80 70 83

After typing in the data, type

wtest 77 c1.

Statistical Rationale

Minitab first eliminates any observations equal to the hypothesized median. The number of observations remaining is printed on the output as N FOR TEST. Then the pairwise (Walsh) averages, (Yi + Yj)/2 for i ≤ j, are formed. The Wilcoxon statistic is the number of Walsh averages exceeding the hypothesized median, plus one half the number of Walsh averages equal to the hypothesized median. This statistic is approximately normal. Under Ho, it has mean N(N + 1)/4, where N is the number of observations for the test. The attained significance level, or p-value, is calculated using a normal approximation with a continuity correction.

An algebraically equivalent form of the test is based on ranks. Subtract the hypothesized median from each observation, discard any zeros, and rank the absolute values of these differences. The number of differences is N FOR TEST. If two or more absolute differences are tied, assign the average rank to each. The Wilcoxon statistic is the sum of ranks corresponding to positive differences. The Wilcoxon point estimate of the population median is the median of the Walsh averages.

Minitab obtains the test statistic and point estimate of the population median using an algorithm based on Johnson and Mizoguchi.
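The Walsh-average form of the statistic is easy to reproduce for the nine observations in C1. The following Python sketch is illustrative only and is not the Johnson and Mizoguchi algorithm: the function names are hypothetical, and the null variance N(N + 1)(2N + 1)/24 is the standard textbook value with no adjustment for ties, which is an assumption about what the normal approximation uses.

```python
from itertools import combinations_with_replacement
from math import erfc, sqrt
from statistics import median

c1 = [77, 88, 85, 74, 75, 62, 80, 70, 83]

def walsh_averages(values):
    """All pairwise averages (Yi + Yj)/2 with i <= j."""
    return [(a + b) / 2 for a, b in combinations_with_replacement(values, 2)]

def wilcoxon_two_sided(values, median0):
    """Return (N for test, Wilcoxon statistic, two-sided p-value)."""
    kept = [v for v in values if v != median0]   # drop values equal to median0
    n = len(kept)
    walsh = walsh_averages(kept)
    w = sum(a > median0 for a in walsh) + 0.5 * sum(a == median0 for a in walsh)
    mean = n * (n + 1) / 4
    # Assumption: standard null variance, not adjusted for ties.
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = max(abs(w - mean) - 0.5, 0) / sd         # continuity correction
    p = erfc(z / sqrt(2))                        # equals 2 * (1 - Phi(z))
    return n, w, p

n_for_test, w_stat, p_value = wilcoxon_two_sided(c1, 77)
estimate = median(walsh_averages(c1))            # point estimate uses all nine values
print(n_for_test, w_stat, round(p_value, 3), estimate)   # prints: 8 19.5 0.889 77.5
```

Summing the ranks of the positive differences (7 + 6 + 2.5 + 4) gives the same statistic, 19.5, which is the algebraic equivalence described above.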

Exercise 1. Would you accept or reject the null hypothesis? Why?

Exercise 2.
Now suppose we wish to test
Ho: median = 77
H1: median > 77
What do you need to type in this case?

Exercise 3. What is the p-value in this case? And what is your conclusion?

5.4 Winterval

Command Syntax

WINTERVAL Wilcoxon CI [K% confidence] on C...C

Calculates a one-sample Wilcoxon confidence interval, separately for each column. You can specify a confidence level on the command line. For example, if you enter the command WINTERVAL .90 C1, Minitab calculates a 90% confidence interval. If you do not specify a confidence level, WINTERVAL gives a 95% confidence interval.

Example: To calculate a 1-Sample Wilcoxon confidence interval.

With C1: 77 88 85 74 75 62 80 70 83

Type wint c1.

Statistical Rationale

The confidence interval is essentially the set of values, d, for which the test of Ho: median = d is not rejected in favor of H1: median ≠ d, using α = 1 - (percent confidence)/100. In the example of the 1-Sample Wilcoxon confidence interval, you do not reject Ho, as the estimated median, 77.5, is contained in the confidence interval. Because of the discreteness of the Wilcoxon test statistic, it will seldom be possible to achieve the specified confidence. Minitab prints the closest value, which is computed using a normal approximation with continuity correction.

Exercise 1. What is the level of confidence achieved?

Exercise 2. What is the confidence interval?

Exercise 3. Given the data
C2: 23 27 20 28 20 29 32 24

Find an 80% confidence interval.

a. What is the level of confidence achieved?

b. What is the confidence interval?

5.5 Mann-Whitney

Command SyntaxMANN-WHITNEY two-sample rank test with [K% confidence] onC CALTERNATIVE = K

Does a two-sample rank test (often called the Mann-Whitney test, or thetwo-sample Wilcoxon rank sum test) for the difference between two popula-tion medians, and calculates the corresponding point estimate and confidenceinterval. You can specify a confidence level on the command line. If you do notspecify a confidence level, MANN-WHITNEY gives a 95% confidence interval.

Example: Using the data below we can calculate a Mann-Whitney confidence interval and test.

C1: 90 72 61 66 81 69 59 70
C2: 62 85 78 66 80 91 69 77 84

MTB> mann-whitney C1 C2.

Statistical Rationale

First, the two samples are ranked together, with the smallest observation given rank 1, the second smallest rank 2, etc. If two or more observations are tied, the average rank is assigned to each. Then the sum of the ranks of the first sample is calculated. This sum is the test statistic, W. A small value of W indicates that M1 is smaller than M2; a large value indicates that M2 is smaller than M1, where M1 and M2 are the population medians.

Minitab obtains the attained significance level of the test using a normal approximation with a continuity correction factor.

If there are ties in the data, the significance level adjusted for ties is also printed. The unadjusted significance level is conservative if ties are present; the adjusted significance level is usually closer to the correct value, but is not always conservative.

The point estimate is the median of all the pairwise differences between observations in the first sample and observations in the second sample.

The confidence interval is the set of values d for which the test of Ho: M1 - M2 = d versus H1: M1 - M2 not equal to d is not rejected, at α = 1 - (percent confidence)/100.
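The ranking step described above can be sketched in Python. This is an illustration of the joint ranking with average ranks for ties and of the rank sum W, not Minitab's own code; the helper names `average_ranks` and `rank_sum_w` are hypothetical, and the normal approximation for the significance level is not reproduced.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied observations their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_sum_w(first, second):
    """Sum of the joint ranks that belong to the first sample."""
    ranks = average_ranks(list(first) + list(second))
    return sum(ranks[:len(first)])


# Example data from the text.
c1 = [90, 72, 61, 66, 81, 69, 59, 70]
c2 = [62, 85, 78, 66, 80, 91, 69, 77, 84]

w = rank_sum_w(c1, c2)
print("W =", w)
```

The two tied pairs in the combined sample (66 and 69) each receive a half-integer rank, which is why the tie-adjusted significance level is relevant for this data set.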

Exercise 1. What is being tested? i.e. What is the hypothesis test?

Exercise 2. What was the conclusion?

Exercise 3. What results do you get if you use an 85% confidence level?


5.6 Kruskal Wallis

Command Syntax

KRUSKAL-WALLIS test for data in C, levels in C

The factor column may be numeric or text, and may contain any value. The levels do not need to be in any specific order.

Example: Perform a Kruskal Wallis test for the following data

C1: 15.1 13.0 16.2 14.9 13.2 13.8 13.1 13.0 12.9 11.9 17.0 12.8 14.7 12.0 15.0 16.5

C2: 1 1 3 1 1 3 2 2 2 1 3 2 3 2 3 3

This test is a generalization of the procedure used by MANN-WHITNEY, and offers a nonparametric alternative to the usual one-way analysis of variance. The test assumes that the data arise as k independent random samples from continuous distributions, all having the same shape. The null hypothesis of no differences among the k populations is tested against the alternative of at least one difference.


Statistical Rationale

First the combined samples are ranked. If two or more observations are tied, the average rank is assigned to each. The test statistic is

\[ H = \frac{12 \sum_j n_j (\bar{R}_j - \bar{R})^2}{N(N + 1)} \]

where $n_j$ is the number of observations in group j, N is the total sample size, $\bar{R}_j$ is the average of the ranks in group j, and $\bar{R}$ is the average of all the ranks. Under the null hypothesis, the distribution of H can be approximated by a chi-squared distribution with k - 1 degrees of freedom. The approximation is reasonably accurate if no group has fewer than five observations. Large values of H suggest that there are some differences in location among the k populations.

Some authors (e.g., Lehmann) suggest adjusting H when there are ties in the data. Suppose there are J distinct values among the N observations and, for the jth distinct value, there are $d_j$ tied observations ($d_j$ = 1 if there are no ties). Then

\[ H(\mathrm{adj}) = \frac{H}{1 - \left[ \sum_j (d_j^3 - d_j) / (N^3 - N) \right]} \]

When there are no ties, H(adj) = H. Under the null hypothesis, the distribution of H(adj) is also approximately chi-squared with k - 1 degrees of freedom. For small samples, we suggest the use of exact tables (e.g., Hollander and Wolfe). Minitab prints H(adj) if there are ties.

The following z-value is printed for each group. For group j,

\[ z_j = \frac{\bar{R}_j - (N + 1)/2}{\sqrt{(N + 1)(N/n_j - 1)/12}} \]

Under the null hypothesis, $z_j$ is approximately normal with mean μ = 0 and standard deviation σ = 1. The value of $z_j$ indicates how the mean rank, $\bar{R}_j$, for group j differs from the mean rank, $\bar{R}$, for all N observations.
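The three formulas above (H, H(adj) and the group z-values) can be evaluated directly. The following Python sketch applies them to the example columns C1 and C2 of this section. It follows the formulas as written and makes no claim about how Minitab implements them; the function names `average_ranks` and `kruskal_wallis` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied observations their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def kruskal_wallis(data, levels):
    """Return H, H adjusted for ties, and a dict of z-values per group."""
    n_total = len(data)
    groups = defaultdict(list)
    for rank, level in zip(average_ranks(data), levels):
        groups[level].append(rank)
    grand_mean = (n_total + 1) / 2  # average of all N ranks

    # H = 12 * sum n_j (Rbar_j - Rbar)^2 / (N (N + 1))
    spread = sum(len(r) * (sum(r) / len(r) - grand_mean) ** 2 for r in groups.values())
    h = 12 * spread / (n_total * (n_total + 1))

    # Tie adjustment: d_j is the number of observations sharing each distinct value.
    counts = defaultdict(int)
    for value in data:
        counts[value] += 1
    tie_term = sum(d ** 3 - d for d in counts.values())
    h_adj = h / (1 - tie_term / (n_total ** 3 - n_total))

    z = {
        level: (sum(r) / len(r) - grand_mean)
        / sqrt((n_total + 1) * (n_total / len(r) - 1) / 12)
        for level, r in groups.items()
    }
    return h, h_adj, z


c1 = [15.1, 13.0, 16.2, 14.9, 13.2, 13.8, 13.1, 13.0,
      12.9, 11.9, 17.0, 12.8, 14.7, 12.0, 15.0, 16.5]
c2 = [1, 1, 3, 1, 1, 3, 2, 2, 2, 1, 3, 2, 3, 2, 3, 3]

h, h_adj, z = kruskal_wallis(c1, c2)
print(h, h_adj, z)
```

The p-value is left out on purpose: it would need the chi-squared distribution with k - 1 degrees of freedom, which the Python standard library does not provide.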


Exercise 1. What is being tested? i.e. What is the hypothesis test?

Exercise 2. What was the test statistic in this case?

Exercise 3. What was the conclusion?

5.7 Friedman

Command Syntax

FRIEDMAN data in C, treatment in C, blocks in C [put residuals in C [fits in C]]

Does a nonparametric analysis of a randomized block experiment, and thus provides an alternative to the command TWOWAY.

Randomized block experiments are a generalization of paired experiments, and FRIEDMAN is a generalization of the paired sign test. FRIEDMAN tests the null hypothesis that treatment has no effect. Additivity is not required for the test, but is for the estimate of the treatment effects.

The first column listed on the command line contains the response data; the second column, treatment levels; and the third column, blocks. The treatment and blocks columns may be numeric or text, and may contain any values. The levels do not need to be in any special order. Optionally, you can store the residuals by adding a fourth column; fits, or group medians, by giving a fifth column.

This command requires exactly one observation per cell; missing data are not allowed.

Minitab prints the test statistic, which has an approximately chi-square distribution, and the associated degrees of freedom (number of treatments - 1). If there are ties within one or more blocks, the average rank is used, and a test statistic corrected for ties is also printed. If there are many ties, the uncorrected test statistic is conservative; the corrected version is usually closer, but may be either conservative or liberal. FRIEDMAN displays an estimated median for each treatment level. The estimated median is the grand median plus the treatment effect.

Example: Using the data below we can perform a Friedman test for a randomised block design.

C1: 1 1 1 1 2 2 2 2 3 3 3 3
C2: 1 2 3 4 1 2 3 4 1 2 3 4
C3: 0.15 0.26 0.23 0.99 0.55 0.26 -0.22 0.99 0.55 0.66 0.77 0.99

MTB> friedman C3 C1 C2.


Statistical Rationale

To calculate the test statistic S, first rank the data, separately within each block. Then sum the ranks for each treatment. The test statistic is a constant times $\sum_j (R_j - \bar{R})^2$, where $R_j$ is the sum of ranks for treatment j, and $\bar{R}$ is the average of the $R_j$'s. See standard nonparametric texts for details on computing S adjusted for ties.

To calculate the treatment effects (Doksum method), first find the median difference between pairs of treatments. For the data above, the pairwise differences for treatment 1 minus treatment 2 are 0.15 - 0.55 = -0.4, 0.26 - 0.26 = 0, 0.23 - (-0.22) = 0.45, and 0.99 - 0.99 = 0. The median of these is 0. Doing this for the other two pairs gives -0.4 for treatment 1 minus treatment 3, and -0.2 for treatment 2 minus treatment 3.

The effect for each treatment is the average of the median differences of that treatment with all other treatments (including itself). For the example data above, effect(2) = [median (2 - 1) + median (2 - 2) + median (2 - 3)]/3 = (0.00 + 0.00 - 0.20)/3 = -0.0667. Similarly, effect(1) = -0.1333 and effect(3) = 0.20.

Adjust each observation by subtracting the appropriate treatment effect from the observation. Adjusted block medians are simply the block medians of the data adjusted for treatment effect. The grand median is the median of these adjusted block medians. The estimated median for each treatment level is the treatment effect plus the grand median. (Note: The average of the treatment medians is the grand median.)

Residual = (observation adjusted for treatment effect) - (adjusted block median).
Fit = (treatment effect) + (adjusted block median) = (observation) - (residual).
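The ranking and Doksum steps above can be sketched in Python on the example columns. The text only says that S is "a constant times" the sum of squares; the sketch assumes the standard textbook constant 12 / (b k (k + 1)) for b blocks and k treatments, and it does not apply the adjustment for ties. The function names `average_ranks` and `friedman` are hypothetical.

```python
from statistics import median


def average_ranks(values):
    """Rank values from 1 to n, giving tied observations their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def friedman(response, treatment, block):
    """Return the unadjusted statistic S and the Doksum treatment effects."""
    treatments = sorted(set(treatment))
    blocks = sorted(set(block))
    cell = {(t, b): y for y, t, b in zip(response, treatment, block)}

    # Rank within each block, then sum the ranks for each treatment.
    rank_sum = {t: 0.0 for t in treatments}
    for b in blocks:
        ranks = average_ranks([cell[(t, b)] for t in treatments])
        for t, r in zip(treatments, ranks):
            rank_sum[t] += r

    k, n_blocks = len(treatments), len(blocks)
    mean_rank_sum = sum(rank_sum.values()) / k
    # Assumed constant: 12 / (b k (k + 1)), the usual Friedman scaling.
    s = 12 / (n_blocks * k * (k + 1)) * sum(
        (r - mean_rank_sum) ** 2 for r in rank_sum.values()
    )

    def median_diff(a, c):
        """Median over blocks of (treatment a) minus (treatment c)."""
        return median(cell[(a, b)] - cell[(c, b)] for b in blocks)

    # Effect = average of the median differences with every treatment, itself included.
    effects = {a: sum(median_diff(a, c) for c in treatments) / k for a in treatments}
    return s, effects


c1 = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]  # treatment
c2 = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]  # block
c3 = [0.15, 0.26, 0.23, 0.99, 0.55, 0.26, -0.22, 0.99, 0.55, 0.66, 0.77, 0.99]

s, effects = friedman(c3, c1, c2)
print(s, effects)
```

The effects returned for treatments 1, 2 and 3 agree with the values -0.1333, -0.0667 and 0.20 worked out in the text.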

Exercise 1. Which of C1-C3 is the response variable?

Exercise 2. How was one able to type in the data for C1 quickly?

Exercise 3. What is being tested? i.e. What is the hypothesis test?

Exercise 4. What was the test statistic?

Exercise 5. What was the conclusion?


5.8 Walsh or Pairwise Averages

Command Syntax

WALSH averages for C, put into C [put indices into C C]

Calculates the average of all possible pairs of values, including each value with itself.

Let x1, x2, ..., xn be the observations. The Walsh average, (xi + xj) / 2, has indices i and j. If you specify index columns, the value of i is put in the first column and j in the second column. If you have n observations, there will be n (n + 1) / 2 Walsh averages.

This command is useful for nonparametric tests and confidence intervals.

Example

C1: 1 2 3

MTB > WALSH C1 C2 C3 C4

This command does not produce output in the session window. To see the results, look in the data window.
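For readers who want to check the definition by hand, here is a small Python sketch of the Walsh averages together with their index pairs. The function name `walsh_with_indices` is hypothetical, and the row order shown (i running slowest) is an assumption; Minitab may store the rows in a different order.

```python
def walsh_with_indices(values):
    """Return (average, i, j) for every pair of 1-based indices with i <= j."""
    rows = []
    n = len(values)
    for i in range(n):
        for j in range(i, n):
            rows.append(((values[i] + values[j]) / 2, i + 1, j + 1))
    return rows


rows = walsh_with_indices([1, 2, 3])
for average, i, j in rows:
    print(average, i, j)
```

With n = 3 observations the sketch produces n (n + 1) / 2 = 6 rows, matching the count given above.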

Exercise 1. How would one display the data in the session window?

Exercise 2. How was C2 calculated?

Exercise 3. What are C3 and C4?

Exercise 4. How would you go about naming the columns?

5.9 Pairwise Differences

Command Syntax

WDIFF for C and C, put into C [put indices into C and C]

Computes all possible differences between pairs of elements from two columns by subtracting each value in the second column from each value in the first column.

Let x1, x2, ..., xn be the values in the first column, and y1, y2, ..., ym be the values in the second. WDIFF finds all the differences, (xi - yj). If you specify index columns, the value of i is put in the first column and j in the second column.

These differences are useful for nonparametric tests and confidence intervals. For example, the point estimate given by MANN-WHITNEY can be computed as the median of the differences.
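As an illustration of that last remark, the Python sketch below forms all n × m differences for the two columns used in the Mann-Whitney example of section 5.5 and takes their median. The function name `pairwise_differences` is hypothetical, and the row order is an assumption that need not match the order Minitab uses.

```python
from statistics import median


def pairwise_differences(first, second):
    """Return (x_i - y_j, i, j) for every 1-based index pair (i, j)."""
    return [
        (x - y, i, j)
        for i, x in enumerate(first, start=1)
        for j, y in enumerate(second, start=1)
    ]


# Columns from the Mann-Whitney example.
c1 = [90, 72, 61, 66, 81, 69, 59, 70]
c2 = [62, 85, 78, 66, 80, 91, 69, 77, 84]

rows = pairwise_differences(c1, c2)
point_estimate = median(d for d, _, _ in rows)  # median of the 8 * 9 = 72 differences
print(len(rows), point_estimate)
```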


5.10 Wslope

Command Syntax

WSLOPE y in C, x in C, put slopes into C [put row indices into C C]

This command is useful in finding robust estimates of the slope of a line through the data.

Each row of the y-x columns defines a point in the plane. WSLOPE computes the slope between every pair of points and stores the slopes in the third column given on the command line. If you specify index columns, the two row numbers used to compute the slope are put in the corresponding row of these two columns. If any observations are missing or if the slope cannot be defined (e.g., the slope of a line parallel to the y-axis), the slope is computed as missing. If there are n rows in the input columns, there will be n (n - 1) / 2 slopes in the output column.

Example

C1: 3 5 2 6
C2: 1.1 2.0 1.1 3.0

MTB > wslope c1 c2 c3 c4 c5
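The pairwise-slope rule can be sketched in Python as follows. The function name `pairwise_slopes` is hypothetical, `None` stands in for Minitab's missing value, and the row order is an assumption. Missing observations in the input are not handled in this sketch.

```python
def pairwise_slopes(y, x):
    """Return (slope, i, j) for every pair of 1-based row numbers with i < j."""
    rows = []
    n = len(y)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[j] - x[i]
            # A vertical line (equal x values) has no defined slope.
            slope = None if dx == 0 else (y[j] - y[i]) / dx
            rows.append((slope, i + 1, j + 1))
    return rows


y = [3, 5, 2, 6]          # C1 in the example
x = [1.1, 2.0, 1.1, 3.0]  # C2 in the example

rows = pairwise_slopes(y, x)
for slope, i, j in rows:
    print(slope, i, j)
```

Rows 1 and 3 share the same x value, so one of the n (n - 1) / 2 = 6 slopes comes out as missing.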

Exercise 1. What is stored in C4 and C5?

Exercise 2. How are the elements of C3 calculated? Give 2 examples of their calculations.

5.11 Rregress

Command Syntax

RREGRESS y in C on K predictors in C...C
  NORMAL scores
  WINSORIZED Wilcoxon scores with fraction K sign scores
  WILCOXON scores
  LEHMANN scale estimate [with t = K]
  WINDOW scale estimate [with shape = K]
  COEFFICIENTS in C
  FITS in C
  PSEUDO observations in C
  RESIDUALS in C
  NOEQUATION
  HYPOTHESIS matrices M...M
  QFORM
  ITERATIONS K
  STARTING values in C
  STEPINFO [K]
  TOLERANCE K for dispersion

Note: RREGRESS is an experimental command.


Performs rank regression. The method for estimating the regression coefficients is an extension of the Mann-Whitney-Wilcoxon procedure for the two-sample problem. RREGRESS offers a robust, asymptotically distribution-free alternative to the usual least-squares analysis. The regression coefficients are found by minimizing a measure of the dispersion of the residuals.
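The text does not say which dispersion measure is minimized. As a rough illustration only, the Python sketch below assumes a Jaeckel-type dispersion with Wilcoxon scores, sum of a(rank of e_i) × e_i with a(r) = √12 (r / (n + 1) - 1/2), and finds the slope of a one-predictor fit by a crude grid search. The data, the grid and all function names are hypothetical, and this is not a description of Minitab's algorithm.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied observations their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def wilcoxon_dispersion(slope, x, y):
    """Assumed dispersion: sum of Wilcoxon score of each residual times the residual."""
    residuals = [yi - slope * xi for xi, yi in zip(x, y)]
    n = len(residuals)
    scores = [sqrt(12) * (r / (n + 1) - 0.5) for r in average_ranks(residuals)]
    return sum(a * e for a, e in zip(scores, residuals))


def least_squares_slope(x, y):
    """Ordinary least-squares slope, for comparison."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data: points on y = 2 + 3x with one gross outlier at the end.
x = list(range(9))
y = [2 + 3 * xi for xi in x]
y[-1] = 100

grid = [k / 10 for k in range(0, 61)]  # candidate slopes 0.0, 0.1, ..., 6.0
rank_slope = min(grid, key=lambda b: wilcoxon_dispersion(b, x, y))
ls_slope = least_squares_slope(x, y)
print(rank_slope, ls_slope)
```

On this made-up data the rank-based slope stays at the slope of the eight clean points, while the least-squares slope is pulled well away by the single outlier, which is the sense in which the method is robust. The dispersion does not depend on the intercept, so the intercept would have to be estimated separately, for example from the residuals.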

Click on the help menu and select “search for help on”, select “find”, find rregress and click on “Display”. You will find the same details as given here in this booklet. You will notice that some items have been underlined.

Exercise 1. What is the general term used to describe such items?

Exercise 2. Click on “Normal”. What happened?

Exercise 3. “rregress y in C on K predictors in C...C” means what? Give an example of its use.

Exercise 4. If you wanted to suppress the printing of the regression equation and the table of coefficients and their standard errors, what would you type?
