Lesson 15 - 6
Inferences Between Two Variables
Objectives
• Perform Spearman’s rank-correlation test
Vocabulary
• Rank-correlation test -- nonparametric procedure used to test claims regarding association between two variables.
• Spearman’s rank-correlation coefficient -- test statistic, rs
$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
Association
● Parametric test for correlation:
– Assumption of bivariate normal is difficult to verify
– Used regression instead to test whether the slope is significantly different from 0
● Nonparametric case for association:
– Compare the relationship between two variables without assuming that they are bivariate normal
– Perform a nonparametric test of whether the association is 0
Tale of Two Associations
Similar to our previous hypothesis tests, we can have a two-tailed, a left-tailed, or a right-tailed alternative hypothesis
– A two-tailed alternative hypothesis corresponds to a test of association
– A left-tailed alternative hypothesis corresponds to a test of negative association
– A right-tailed alternative hypothesis corresponds to a test of positive association
Test Statistic for Spearman’s Rank-Correlation Test
The test statistic depends on the size of the sample, n, and on the sum of the squared rank differences, Σdi².
Small Sample Case: (n ≤ 100)
$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
where di = the difference in the ranks of the two observations in the ith ordered pair. Because the differences are squared, either order of subtraction (Xi – Yi or Yi – Xi) gives the same rs.
Spearman’s rank-correlation coefficient, rs, is our test statistic.
Large Sample Case: (n > 100)
$$z_0 = r_s \sqrt{n - 1}$$
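The two forms of the test statistic can be sketched in a few lines of Python. This is only an illustration: the function names `spearman_rs` and `large_sample_z` are made up for this sketch, and the n = 101 input is a hypothetical value chosen to show the large-sample formula.

```python
import math

def spearman_rs(sum_d2, n):
    """Small-sample statistic: r_s = 1 - 6*sum(d_i^2) / (n*(n^2 - 1))."""
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

def large_sample_z(rs, n):
    """Large-sample statistic: z_0 = r_s * sqrt(n - 1)."""
    return rs * math.sqrt(n - 1)

# Sum of squared rank differences 6 with n = 8 pairs
print(round(spearman_rs(6, 8), 3))

# Hypothetical large sample: r_s = 0.5 with n = 101 pairs
print(large_sample_z(0.5, 101))
```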
Critical Value for Spearman’s Rank-Correlation Test
Small Sample Case: (n ≤ 100)
Using α as the level of significance, the critical value(s) is (are) obtained from Table XIII in Appendix A. For a two-tailed test, be sure to divide the level of significance, α, by 2.
Large Sample Case: (n > 100)
The test statistic z0 is compared with critical value(s) from the standard normal distribution.

                 Left-Tailed            Two-Tailed                       Right-Tailed
Significance     α                      α/2                              α
Decision Rule    Reject if rs < -CV     Reject if rs < -CV or rs > CV    Reject if rs > CV
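The decision rule can be written as a small function. This is a sketch under assumptions: the names `reject_h0` and `normal_critical_value` are illustrative, the small-sample critical value still has to be read from Table XIII by hand, and the large-sample helper assumes the critical value is taken from the standard normal distribution.

```python
from statistics import NormalDist

def normal_critical_value(alpha, tail):
    """Positive standard normal critical value (assumed large-sample case).

    A two-tailed test puts alpha/2 in each tail; a one-tailed test puts alpha in one tail.
    """
    tail_area = alpha / 2 if tail == "two" else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def reject_h0(stat, cv, tail):
    """Apply the decision rule; cv is the positive critical value."""
    if tail == "left":
        return stat < -cv
    if tail == "right":
        return stat > cv
    if tail == "two":
        return stat < -cv or stat > cv
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Small-sample use: rs = 0.929 against a table value of 0.738, two-tailed
print(reject_h0(0.929, 0.738, "two"))

# Large-sample use: hypothetical z0 = 2.4 at alpha = 0.05, two-tailed
print(reject_h0(2.4, normal_critical_value(0.05, "two"), "two"))
```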
Hypothesis Tests Using Spearman’s Rank-Correlation Test
Step 0 Requirements:
1. The data are a random sample of n ordered pairs.
2. Each pair of observations is two measurements taken on the same individual.
Step 1 Hypotheses: (claim is made regarding relationship between two variables, X and Y) H0 and H1: see the table in Step 5
Step 2 Ranks: Rank the X-values, and rank the Y-values. Compute the differences between ranks and then square these differences. Compute the sum of the squared differences.
Step 3 Level of Significance: (level of significance determines the critical value) Table XIII in Appendix A.
Step 4 Compute Test Statistic:
$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
Step 5 Critical Value Comparison:

                 Left-Tailed              Two-Tailed                       Right-Tailed
Significance     α                        α/2                              α
H0               not associated           not associated                   not associated
H1               negatively associated    associated                       positively associated
Decision Rule    Reject if rs < -CV       Reject if rs < -CV or rs > CV    Reject if rs > CV
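Steps 2 and 4 can be sketched in Python as follows. The helper names `average_ranks` and `spearman_from_data` are illustrative, and tied values are assumed to share the average of the ranks they occupy, which is how the tied club-head speeds in Example 1 of this lesson are ranked. The data are the club-head speed and distance values from that example.

```python
def average_ranks(values):
    """Rank from smallest (1) to largest (n); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_from_data(x, y):
    """Step 2 (ranks, squared differences) and Step 4 (test statistic)."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return rank_x, rank_y, sum_d2, rs

speed = [100, 102, 103, 101, 105, 100, 99, 105]
distance = [257, 264, 274, 266, 277, 263, 258, 275]

rank_x, rank_y, sum_d2, rs = spearman_from_data(speed, distance)
print(rank_x)
print(rank_y)
print(sum_d2, round(rs, 3))
```

The resulting rs would then be compared with the table critical value in Step 5.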
Expectations
● If X and Y were positively associated, then
– Small ranks of X would tend to correspond to small ranks of Y
– Large ranks of X would tend to correspond to large ranks of Y
– The differences would tend to be small positive and small negative values
– The squared differences would tend to be small numbers
● If X and Y were negatively associated, then
– Small ranks of X would tend to correspond to large ranks of Y
– Large ranks of X would tend to correspond to small ranks of Y
– The differences would tend to be large positive and large negative values
– The squared differences would tend to be large numbers
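A small sketch of the two extremes may make these expectations concrete. The five ranks below are hypothetical, and the function names are illustrative: ranks in identical order give the smallest possible sum of squared differences, and ranks in exactly reversed order give the largest.

```python
def sum_squared_differences(rank_x, rank_y):
    """Sum of squared rank differences for paired ranks."""
    return sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))

def spearman_rs(sum_d2, n):
    """r_s = 1 - 6*sum(d_i^2) / (n*(n^2 - 1))."""
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

rank_x = [1, 2, 3, 4, 5]
same_order = [1, 2, 3, 4, 5]      # perfect positive association
reverse_order = [5, 4, 3, 2, 1]   # perfect negative association

small = sum_squared_differences(rank_x, same_order)
large = sum_squared_differences(rank_x, reverse_order)

print(small, spearman_rs(small, 5))   # 0 1.0
print(large, spearman_rs(large, 5))   # 40 -1.0
```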
Example 1 from 15.6
S (speed)   D (distance)   S-Rank   D-Rank   d = X - Y (ranks)   d²
100         257            2.5      1        1.5                 2.25
102         264            5        4        1                   1
103         274            6        6        0                   0
101         266            4        5        -1                  1
105         277            7.5      8        -0.5                0.25
100         263            2.5      3        -0.5                0.25
99          258            1        2        -1                  1
105         275            7.5      7        0.5                 0.25
Ave: S = 102, D = 267                                    Sum of d² = 6
Calculations:
Example 1 Continued
• Hypothesis:
H0: X and Y are not associated
Ha: X and Y are associated
• Test Statistic:
$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6(6)}{8(64 - 1)} = 1 - \frac{36}{8(63)} = 0.929$$
• Critical Value: 0.738 (from Table XIII)
• Conclusion: Since rs = 0.929 > CV = 0.738, we reject H0; there is evidence of an association between club-head speed and distance.
Example done in Excel
Club-Head Speed   Distance   Rank - X   Rank - Y   difference   d²
100               257        2.5        1          1.5          2.25
102               264        5          4          1            1
103               274        6          6          0            0
101               266        4          5          -1           1
105               277        7.5        8          -0.5         0.25
100               263        2.5        3          -0.5         0.25
99                258        1          2          -1           1
105               275        7.5        7          0.5          0.25
n = 8, sum of d² = 6, rs = 0.928571, rc (critical value) = 0.738

Correlation of the data:
                  Club-Head Speed   Distance
Club-Head Speed   1
Distance          0.938695838       1

Correlation of the ranks:
           Rank - X      Rank - Y
Rank - X   1
Rank - Y   0.927778183   1
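The spreadsheet reports two slightly different values: rs = 0.928571 from the formula and 0.927778 from the correlation of the ranks. The sketch below reproduces both from the rank columns. The likely reason for the gap is that the 6Σdi² shortcut matches the ordinary correlation of the ranks exactly only when no ranks are tied, and the club-head speeds contain two ties. The `pearson` helper is an illustrative stand-in for the spreadsheet's correlation function.

```python
import math

def pearson(a, b):
    """Ordinary (Pearson) correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)

rank_x = [2.5, 5, 6, 4, 7.5, 2.5, 1, 7.5]   # club-head speed ranks (two ties)
rank_y = [1, 4, 6, 5, 8, 3, 2, 7]           # distance ranks

n = len(rank_x)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
rank_correlation = pearson(rank_x, rank_y)

print(round(shortcut, 6))          # 0.928571
print(round(rank_correlation, 6))  # 0.927778
```

Both values lead to the same decision here, since each exceeds the critical value 0.738.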
Summary and Homework
• Summary
– The Spearman rank-correlation test is a nonparametric test for testing the association of two variables
– Test is a comparison of the ranks of the paired data values
– Critical values for small samples are given in tables
– Critical values for large samples can be approximated by a calculation with the normal distribution
• Homework
– problems 3, 6, 7, 10 from the CD
Homework Problem 3
X    Y     Rank - X   Rank - Y   difference   d²
2    1.4   1          1          0            0
4    1.8   2          2          0            0
8    2.1   3.5        3          0.5          0.25
8    2.3   3.5        4          -0.5         0.25
9    2.6   5          5          0            0
n = 5, sum of d² = 0.5, rs = 0.975, rc = 1
Decision: fail to reject H0 (FTR)

Correlation of the ranks:
           Rank - X      Rank - Y
Rank - X   1
Rank - Y   0.974679434   1
Homework Problem 6
X     Y     Rank - X   Rank - Y   difference   d²
0     0.8   1          1          0            0
0.5   2.3   2          3          -1           1
1.4   1.9   3.5        2          1.5          2.25
1.4   2.5   3.5        4          -0.5         0.25
3.9   5     5          5          0            0
4.6   6.8   6          6          0            0
n = 6, sum of d² = 3.5, rs = 0.9, rc = 0.886
Decision: reject H0

Correlation of the ranks:
           Rank - X      Rank - Y
Rank - X   1
Rank - Y   0.898645105   1
Homework Problem 7
BS %   Income   Rank - X   Rank - Y   difference   d²
17.4   24289    1          1          0            0
29.8   33749    9          9          0            0
24.6   29043    4          4          0            0
22.3   26100    2          2          0            0
23.7   28831    3          3          0            0
26.8   30758    8          8          0            0
25     29944    6          7          -1           1
26.4   29340    7          5          2            4
24.7   29372    5          6          -1           1
n = 9, sum of d² = 6, rs = 0.95, rc = 0.6
Decision: reject H0

Correlation of the ranks:
           Rank - X   Rank - Y
Rank - X   1
Rank - Y   0.95       1
Homework Problem 10
Standings   Yards   Rank - X   Rank - Y   difference   d²
12          4803    5          3          2            4
30          5215    7          6          1            1
7           4459    3          2          1            1
11          5134    4          5          -1           1
2           4972    2          4          -2           4
29          5936    6          7          -1           1
1           4134    1          1          0            0
n = 7, sum of d² = 12, rs = 0.785714, rc = 0.714
Decision: reject H0

Correlation of the ranks:
           Rank - X      Rank - Y
Rank - X   1
Rank - Y   0.785714286   1
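The four homework results can be re-checked from their rank columns with a short loop. This is a sketch, not the worksheet itself: the `problems` dictionary simply copies the ranks and rc values shown for Problems 3, 6, 7, and 10, and the decision is assumed to be the comparison rs > rc, which matches the Reject and FTR labels on the slides.

```python
# problem number -> (ranks of X, ranks of Y, critical value rc from the slides)
problems = {
    3: ([1, 2, 3.5, 3.5, 5], [1, 2, 3, 4, 5], 1.0),
    6: ([1, 2, 3.5, 3.5, 5, 6], [1, 3, 2, 4, 5, 6], 0.886),
    7: ([1, 9, 4, 2, 3, 8, 6, 7, 5], [1, 9, 4, 2, 3, 8, 7, 5, 6], 0.6),
    10: ([5, 7, 3, 4, 2, 6, 1], [3, 6, 2, 5, 4, 7, 1], 0.714),
}

results = {}
for number, (rank_x, rank_y, rc) in problems.items():
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    reject = rs > rc  # assumed decision rule: compare rs with rc
    results[number] = (sum_d2, rs, reject)
    decision = "Reject" if reject else "FTR"
    print(f"Problem {number}: n = {n}, sum d^2 = {sum_d2}, rs = {rs:.6f}, {decision}")
```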