Name that tune.


Page 1: Name that tune.

R.G. Bias | [email protected] |

Name that tune.

Song title? Performer(s)?

Page 2: Name that tune.

Descriptive Statistics

“Finding New Information” 4/5/2010

Page 3: Name that tune.

Standard Deviation

σ = SQRT(Σ(X - µ)²/N)

(Does that give you a headache?)

Page 4: Name that tune.


USA Today has come out with a new survey - apparently, three out of every four people make up 75% of the population. – David Letterman

Page 5: Name that tune.


Statistics: The only science that enables different experts using the same figures to draw different conclusions. – Evan Esar (1899 - 1995), US humorist

Page 6: Name that tune.

Scales

The data we collect can be represented on one of FOUR types of scales:
– Nominal
– Ordinal
– Interval
– Ratio

“Scale” in the sense that an individual score is placed at some point along a continuum.

Page 7: Name that tune.

Nominal Scale

Describe something by giving it a name. (Name – Nominal. Get it?)

Mutually exclusive categories. For example:
– Gender: 1 = Female, 2 = Male
– Marital status: 1 = single, 2 = married, 3 = divorced, 4 = widowed
– Make of car: 1 = Ford, 2 = Chevy . . .

The numbers are just names.

Page 8: Name that tune.

Ordinal Scale

An ordered set of objects. But no implication about the relative SIZE of the steps.

Example:
– The 50 states in order of population:
• 1 = California
• 2 = Texas
• 3 = New York
• . . . 50 = Wyoming

Page 9: Name that tune.

Interval Scale

Ordered, like an ordinal scale. Plus there are equal intervals between each pair of adjacent scores.

With interval data, we can calculate means (averages). However, the zero point is arbitrary.

Examples:
– Temperature in Fahrenheit or Centigrade.
– IQ scores

Page 10: Name that tune.

Ratio Scale

Interval scale, plus an absolute zero.

Examples:
– Distance, weight, height, time (but not calendar years – e.g., the year 2002 isn’t “twice” the year 1001).

Page 11: Name that tune.

Scales (cont’d.)

It’s possible to measure the same attribute on different scales. Say, for instance, your midterm test. I could:
– Give you a “1” if you don’t finish, and a “2” if you finish.
– Give a “1” for the highest grade in class, a “2” for the second highest grade, . . .
– Give a “1” for the first quarter of the class, a “2” for the second quarter of the class, . . .
– Report the raw test score (100, 99, . . .).

(NOTE: A score of 100 doesn’t mean the person “knows” twice as much as a person who scores 50; he/she just gets twice the score.)
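As a rough illustration of the four midterm codings, the sketch below derives each one from a single list of raw scores. The scores, the use of `None` for an unfinished test, and the helper name `quarter` are all made up for this example; tied scores sharing a rank is one possible choice, not the only one.

```python
# Hypothetical midterm scores; None marks a student who did not finish.
raw_scores = [100, 99, 87, 87, 72, 50, None, 64]

# Coding 1: "1" if you don't finish, "2" if you finish.
finished_code = [1 if s is None else 2 for s in raw_scores]

# Keep only the finished tests, highest first, for the remaining codings.
finished = sorted((s for s in raw_scores if s is not None), reverse=True)

# Coding 2: "1" for the highest grade, "2" for the second highest, ...
# (tied scores share a rank in this sketch).
distinct = sorted(set(finished), reverse=True)
rank = {score: i + 1 for i, score in enumerate(distinct)}
ranks = [rank[s] for s in finished]

# Coding 3: "1" for the first quarter of the class, "2" for the second, ...
def quarter(position, n):
    """Map a 0-based position in the sorted list to a quarter 1..4."""
    return position * 4 // n + 1

quarters = [quarter(i, len(finished)) for i in range(len(finished))]

# Coding 4: the raw test scores themselves.
print(finished_code)
print(ranks)
print(quarters)
print(finished)
```

Each step down the list keeps more of the information in the original scores; the first coding throws away everything except whether the test was finished.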

Page 12: Name that tune.

Scales (cont’d.)

Nominal              Ordinal                   Interval             Ratio
Name                 =                         =                    =
Mutually-exclusive   =                         =                    =
                     Ordered                   =                    =
                                               Equal interval       =
                                                                    + abs. 0
Gender, Yes/No       Class rank, Survey ans.   Days of wk., Temp.   Inches, Dollars

(“=” means the property carries over from the scale to its left.)

Page 13: Name that tune.

Earlier . . .

We learned about frequency distributions. I asserted that a frequency distribution, and/or a histogram (a graphical representation of a frequency distribution), was a good way to summarize a collection of data.

There’s another, even shorter-hand way.

Page 14: Name that tune.

Measures of Central Tendency

Mode
– Most frequent score (or scores – a distribution can have multiple modes)

Median
– “Middle score”
– 50th percentile

Mean - µ (“mu”)
– “Arithmetic average”
– ΣX/N
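A minimal sketch of the three measures, using Python's standard `statistics` module. The scores are hypothetical, picked so that two values tie for most frequent and the distribution has two modes.

```python
import statistics

# Hypothetical scores, chosen so that two values tie for most frequent.
scores = [2, 3, 3, 5, 7, 7, 8]

modes = statistics.multimode(scores)   # every most-frequent score
median = statistics.median(scores)     # middle score (50th percentile)
mean = sum(scores) / len(scores)       # arithmetic average, ΣX/N

print(modes, median, mean)
```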

Page 15: Name that tune.

More quiz questions about measures of central tendency

4 – True or false: In a normal distribution (bell curve), the mode, median, and mean are all the same?
__True __False

5 – (This one is tricky.) If the mode=mean=median, then the distribution is necessarily a bell curve?
__True __False

6 – I have a distribution of 10 scores. There was an error, and really the highest score is 5 points HIGHER than previously thought.
a) What does this do to the mode?
__Increases it __Decreases it __Nothing __Can’t tell
b) What does this do to the median?
__Increases it __Decreases it __Nothing __Can’t tell
c) What does this do to the mean?
__Increases it __Decreases it __Nothing __Can’t tell

7 – Which of the following must be an actual score from the distribution?
a) Mean
b) Median
c) Mode
d) None of the above

Page 16: Name that tune.

OK, so which do we use?

Means allow further arithmetic/statistical manipulation. But . . .

It depends on:
– The type of scale of your data
• Can’t use means with nominal or ordinal scale data
• With nominal data, must use mode
– The distribution of your data
• Tend to use medians with distributions bounded at one end but not the other (e.g., salary).
– The question you want to answer
• “Most popular score” vs. “middle score” vs. “middle of the see-saw”
• “Statistics can tell us which measures are technically correct. It cannot tell us which are ‘meaningful’” (Tal, 2001, p. 52).
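To make the salary point concrete, here is a small sketch with invented numbers: salaries cannot go below zero but have no upper limit, so a single very large value drags the mean far from where most people sit, while the median barely notices.

```python
import statistics

# Hypothetical annual salaries (thousands of dollars) with one very high earner.
salaries = [30, 32, 35, 38, 40, 45, 400]

mean_salary = statistics.mean(salaries)      # pulled upward by the 400
median_salary = statistics.median(salaries)  # the middle salary

print(mean_salary, median_salary)
```

With these made-up figures the mean lands above every salary but one, which is why the median is usually the better answer to “what does a typical person earn?”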

Page 17: Name that tune.


Mean – “see saw” (from Tal, 2001)

Page 18: Name that tune.

Have sidled up to SHAPES of distributions
– Symmetrical
– Skewed – positive and negative
– Flat

Page 19: Name that tune.


“Pulling up the mean”

Page 20: Name that tune.

Why . . .

. . . isn’t a “measure of central tendency” all we need to characterize a distribution of scores/numbers/data/stuff?

“The price for using measures of central tendency is loss of information” (Tal, 2001, p. 49).

Page 21: Name that tune.

Didja hear the one about . . .

. . . the Aggies who were on a march and came to a river? The Aggie captain asked the farmer how deep the river was.

“Oh, it averages two feet deep.”

All the Aggies drowned.

Page 22: Name that tune.

Note . . .

We started with a bunch of specific scores. We put them in order. We drew their distribution. Now we can report their central tendency.

So, we’ve moved AWAY from specifics, to a summary. But with Central Tendency, alone, we’ve ignored the specifics altogether.
– Note MANY distributions could have a particular central tendency!

If we went back to ALL the specifics, we’d be back at square one.
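A quick sketch of “many distributions, one central tendency,” borrowing the river from the Aggie joke. Both sets of depths are invented for illustration; they share a mean of two feet and differ in everything that matters to someone wading across.

```python
import statistics

# Two hypothetical rivers, depth in feet measured at five points across.
even_river = [2, 2, 2, 2, 2]
uneven_river = [0.5, 0.5, 0.5, 0.5, 8]

for name, depths in [("even", even_river), ("uneven", uneven_river)]:
    # Same mean for both rivers; very different deepest point.
    print(name, statistics.mean(depths), max(depths))
```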

Page 23: Name that tune.

Measures of Dispersion

Range
Semi-interquartile range
Standard deviation
– σ (sigma)

Page 24: Name that tune.

Range

Highest score minus the lowest score.

Like the mode . . .
– Easy to calculate
– Potentially misleading
– Doesn’t take EVERY score into account.

What we need to do is calculate one number that will capture HOW spread out our numbers are from that measure of Central Tendency.
– ‘Cause MANY different distributions of scores can have the same central tendency!
– “Standard Deviation” -- σ = SQRT(Σ(X - µ)²/N)

Page 25: Name that tune.

Let’s do a short example

What if I asked four undergraduates how many cars they’ve owned in their lives and I got the following answers: 1 1 1 1

There would be NO variance. σ = 0.

But what if the answers were 0 0 1 3

What’s the mode? Median? Mean? Go with mean.

So, how much do the actual scores deviate from the mean?

Page 26: Name that tune.

So . . .

Add up all the deviations and we should have a feel for how dispersed, how spread, how deviant, our distribution is.

Let’s calculate the Standard Deviation.

As always, start inside the parentheses.

Σ(X - µ)

Page 27: Name that tune.

Standard Deviation

Score (X)   Mean (µ)   X - µ
0           1          -1
0           1          -1
1           1           0
3           1           2
Total                   0 (damn)

Page 28: Name that tune.

Damn!

OK, let’s try it on another set of numbers.

X
2
3
5
6

Page 29: Name that tune.

Damn! (cont’d.)

OK, let’s try it on a smaller set of numbers.

X         X - µ
2         -2
3         -1
5          1
6          2
Σ = 16    Σ = 0
µ = 4

Hmm.
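The zero is no accident. A short derivation, using only the definition of the mean from earlier (µ = ΣX/N), suggests why the raw deviations must cancel for any set of scores:

```latex
\[
\sum (X - \mu) \;=\; \sum X \;-\; N\mu
             \;=\; \sum X \;-\; N \cdot \frac{\sum X}{N}
             \;=\; 0
\]
```

So the sum of deviations can never tell two distributions apart; the signs have to be dealt with before adding.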

Page 30: Name that tune.

OK . . .

. . . so mathematicians at this point do one of two things. Take the absolute value or square ‘em.

We square ‘em. Σ(X - µ)²
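For comparison, a small sketch of both options on the scores 2, 3, 5, 6. The variable names are just for illustration.

```python
# The four scores from the example and their mean.
scores = [2, 3, 5, 6]
mu = sum(scores) / len(scores)

deviations = [x - mu for x in scores]

sum_raw = sum(deviations)                   # positives and negatives cancel
sum_abs = sum(abs(d) for d in deviations)   # option 1: absolute values
sum_sq = sum(d ** 2 for d in deviations)    # option 2: squares

print(sum_raw, sum_abs, sum_sq)
```

Either option removes the signs; squaring also weights big deviations more heavily than small ones.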

Page 31: Name that tune.

X         X - µ     (X - µ)²
2         -2        4
3         -1        1
5          1        1
6          2        4
Σ = 16    Σ = 0     Σ = 10
µ = 4

Page 32: Name that tune.

Standard Deviation (cont’d.)

Then take the average of the squared deviations. Σ(X - µ)²/N
– Remember, dividing by N was the way we took the average of the original scores.
– 10/4 = 2.5. But this number is so BIG!

Page 33: Name that tune.

OK . . .

. . . take the square root (to make up for squaring the deviations earlier).

σ = SQRT(Σ(X - µ)²/N)

SQRT(2.5) = 1.58

Now this doesn’t give you a headache, right?

I said “right”?
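The whole calculation can be sketched in a few lines of Python, following the same steps as the slides (mean, squared deviations, average, square root). The final line cross-checks the result against the standard library's population formula.

```python
import math
import statistics

scores = [2, 3, 5, 6]
n = len(scores)

mu = sum(scores) / n                             # mean: 16 / 4
squared_devs = [(x - mu) ** 2 for x in scores]   # (X - µ)² for each score
variance = sum(squared_devs) / n                 # average squared deviation: 10 / 4
sigma = math.sqrt(variance)                      # standard deviation

print(mu, variance, round(sigma, 2))

# Cross-check against the standard library's population standard deviation.
assert math.isclose(sigma, statistics.pstdev(scores))
```

Note that `statistics.stdev` (without the `p`) divides by N − 1 rather than N, so it returns a different number from the σ defined here.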

Page 34: Name that tune.

Hmmm . . .

Mode      Range
Median    ?????
Mean      Standard Deviation

Page 35: Name that tune.

We need . . .

A measure of spread that is NOT sensitive to every little score, just as median is not.

SIQR: Semi-interquartile range.

(Q3 – Q1)/2
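A hedged sketch of the SIQR on the same four scores. Textbooks and software disagree on exactly how to locate Q1 and Q3 in a small data set; this example simply accepts the default rule of Python's `statistics.quantiles`, so a hand calculation using a different convention may give a slightly different value.

```python
import statistics

scores = [2, 3, 5, 6]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median), Q3.
# Its default interpolation rule is only one of several quartile conventions.
q1, q2, q3 = statistics.quantiles(scores, n=4)

siqr = (q3 - q1) / 2   # semi-interquartile range: half the spread of the middle 50%
print(q1, q3, siqr)
```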

Page 36: Name that tune.

To summarize

Mode        Range     - Easy to calculate.
                      - May be misleading.
Median      SIQR      - Capture the center.
                      - Not influenced by extreme scores.
Mean (µ)    SD (σ)    - Take every score into account.
                      - Allow later manipulations.

Page 37: Name that tune.

Practice Problems

I’ll send you some, tonight.
