Ellipsoidal Representations Regarding Correlations


Description

Presented at ims-APRM 2012 held in Tsukuba, Japan, from 2012-07-01 to 2012-07-04.

Transcript of Ellipsoidal Representations Regarding Correlations

Page 1: Ellipsoidal Representations Regarding Correlations

Quite often, |ρ[ X:Y ] − ρ[ f(X) : g(Y) ] | < 0.05 if X, Y, f(X), g(Y) have no outliers and f, g are increasing functions. This robustness seems comparable to the sampling error with N > 500. Note: people often judge relationships from only N = 1, 2, or 10 observations.
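This robustness can be checked numerically. The sketch below is illustrative and not from the poster: it assumes ρ = 0.5, N = 100,000, and takes the rank transform as the increasing functions f and g (for a bivariate Gaussian this makes the second value Spearman's rank ρ).

```python
import numpy as np

# Illustrative assumptions: rho = 0.5, N = 100_000, and the rank
# transform as the increasing functions f, g.
rng = np.random.default_rng(0)
rho, n = 0.5, 100_000
cov = [[1.0, rho], [rho, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

def ranks(a):
    """Rank transform: an increasing function of the observed values."""
    r = np.empty(len(a))
    r[np.argsort(a)] = np.arange(len(a))
    return r

r_raw = np.corrcoef(x, y)[0, 1]
r_mono = np.corrcoef(ranks(x), ranks(y))[0, 1]  # Spearman's rank rho
print(abs(r_raw - r_mono))  # well below 0.05
```

With these values the gap is about 0.02, consistent with the claimed bound.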

 

Where does the champion come from? The observed winner is, on average, only about ρ times as strong as their winning result suggests (regression toward the mean: E[ability | result] = ρ · result).
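One way to read this claim: if ability and result correlate by ρ and the champion is selected by result, then E[ability | result] = ρ·result, so the winner's expected ability is ρ times the winner's result. A minimal simulation sketch (the parameter values ρ = 0.6, 100 contestants, 20,000 contests are illustrative assumptions, not from the poster):

```python
import numpy as np

# "ability" is latent; "result" correlates with ability by rho;
# the champion is the contestant with the best result.
rng = np.random.default_rng(1)
rho, m, trials = 0.6, 100, 20_000

ability = rng.standard_normal((trials, m))
noise = rng.standard_normal((trials, m))
result = rho * ability + np.sqrt(1 - rho**2) * noise

win = result.argmax(axis=1)          # index of the observed champion
rows = np.arange(trials)
mean_result = result[rows, win].mean()
mean_ability = ability[rows, win].mean()
ratio = mean_ability / mean_result
print(ratio)  # close to rho
```

Since selection acts only on the result, E[ability of winner] = ρ · E[result of winner] exactly, and the simulated ratio lands near ρ.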

The two variables are assumed to follow a zero-centered two-dimensional Gaussian distribution. Note: quite many distributions cannot be distinguished from a Gaussian by the Kolmogorov–Smirnov test unless N is above roughly 30 to 100.

The (hyper-)ellipsoid touches the unit (hyper-)cube at the 2×k points ±( ρi1, ρi2, .., ρik ) for i = 1, 2, .., k (where ρii = 1).
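These tangency points can be verified under one natural definition of the correlation ellipsoid, { x : x^T R^{-1} x = 1 } for a ρ-matrix R, which is consistent with the points stated here: for x = R e_j, x^T R^{-1} x = e_j^T R e_j = 1, and the normal there is proportional to e_j, so the tangent plane is the cube facet x_j = ±1. A quick numeric check (the 3×3 ρ-matrix values are an illustrative assumption):

```python
import numpy as np

# An illustrative positive-definite rho-matrix.
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
Rinv = np.linalg.inv(R)

for j in range(3):
    x = R[:, j]                                  # j-th column of the rho-matrix
    assert np.isclose(x @ Rinv @ x, 1.0)         # x lies on the ellipsoid
    assert np.allclose(Rinv @ x, np.eye(3)[j])   # normal points along axis j
print("the ellipsoid touches the unit cube at +/- each column of R")
```

The check confirms that the ellipsoid meets the cube exactly at ± the columns of the ρ-matrix, as the poster states.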

Spherical Model

Ellipsoidal Representations Regarding Correlations

Toshiyuki SHIMONO

PhD (Info. Sci. and Tech.), freelance

[email protected]

The author would like to present certain basic theories that may touch upon the very roots of statistics. One is the geometric representation of ρ, and the other is the mysterious robustness of ρ. Both are simple, but their implications are very deep; they may even affect epistemology, i.e., how humans sense and judge this ambiguous real world.

Background

Despite being fundamental, multiple regression analysis yields outcomes that are hard to interpret!!

How does this happen? -> multicollinearity -> unintuitive sign inversion -> perturbation by finite sampling

Please read ‘ρ’ below as the “correlation coefficient”, developed since the 1880s.

Main Findings

0. A spherical triangle gives a view: the cosines of its inner angles give the partial ρ.
1. The ‘correlation ellipsoid’ directly gives the multiple ρ, the partial ρ, and the regression coefficients.
2. The mysterious robustness of ρ (but not yet fully developed).
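The spherical-triangle view above can be verified directly: place the three standardized variables as unit vectors whose pairwise arcs have length arccos(ρij); the spherical law of cosines then says the cosine of the inner angle at a vertex equals the standard partial-correlation formula. A sketch (the ρ values are illustrative assumptions):

```python
import numpy as np

# Illustrative pairwise correlations among X1, X2, X3.
r12, r13, r23 = 0.6, 0.5, 0.3

# Standard formula: partial correlation of X1 and X2 given X3.
partial = (r12 - r13 * r23) / np.sqrt((1 - r13**2) * (1 - r23**2))

# Spherical-triangle view: the arc between variables i, j has length
# arccos(rho_ij); the spherical law of cosines gives the cosine of the
# inner angle at the vertex X3.
t12, t13, t23 = np.arccos([r12, r13, r23])
cos_inner = (np.cos(t12) - np.cos(t13) * np.cos(t23)) / (np.sin(t13) * np.sin(t23))

print(round(partial, 6), round(cos_inner, 6))  # the two agree
```

Since cos(arccos ρ) = ρ, the two expressions are algebraically identical; the code just makes the geometric reading concrete.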

Prospects

• Ellipsoidal representation of correlations and regression in computer software would be useful.

• Influences on statistics and data analysis from these geometric views.

§2. The Correlation Ellipse & Correlation Ellipsoid

▲ The “correlation ellipsoid” for a given ρ-matrix

§3. How are the multiple/partial ρ and the coefficients ai drawn?

§1. An important empirical fact : the mysterious robustness regarding ρ

Assume the correlations ρ among the variables Y, X1, X2 are given.

The multiple ρ is the similarity ratio.

Chance, please!

When ρ is not so strong, deformations of the observations may be a very minor issue; ‘N’ (the sample size) is important. These facts deeply affect how a human recognizes relationships between/among multiple phenomena.

F. Galton K. Pearson

Now you can grasp how Y depends on *multiple* variables Xi at a bird’s-eye view. -> how multicollinearity occurs / how unintuitive sign inversion occurs / etc. -> a theoretical framework telling whether an action causes a positive or negative effect in daily/social reality.

2012-07 ims-APRM@Tsukuba

The partial ρ is read by a graduated ruler.

The standardized partial regression coefficients ai are read by a linear scalar field.

The k-th partial ρ is read by the ruler of a straight line parallel to the k-th axis of the space.

The k-th regression coefficient is read by the linear scalar field inside the hyper-ellipsoid, valued ±1 at the tangential points to the facets xk = ±1 and valued 0 at the tangential points to all the other facets.

  

The sign (plus/minus) can be determined by either of these geometric methods. You may consider how the coefficient ai changes its value as the number of variables Xi increases.

For SEM, how |ai*| > 1 happens is explained.

 

The 6 teams of the Central League played 130 games in each of the past 31 years. Each dot below corresponds to one team in one year (N = 186 = 6 × 31).

This is when X2 is useless for explaining Y, because X1 conceals the effect of X2. It happens when r2 = r1 · r12, and then R = r1.
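The concealment condition can be checked from the standardized normal equations b = Rxx⁻¹ r, a standard identity of regression on standardized variables (the numeric values r1 = 0.7, r12 = 0.5 below are illustrative assumptions):

```python
import numpy as np

# Illustrative values satisfying the concealment condition r2 = r1 * r12.
r1, r12 = 0.7, 0.5
r2 = r1 * r12

Rxx = np.array([[1.0, r12], [r12, 1.0]])   # correlations among X1, X2
r = np.array([r1, r2])                     # correlations of Y with X1, X2
b = np.linalg.solve(Rxx, r)                # standardized regression coefficients
R_mult = np.sqrt(r @ b)                    # multiple correlation: R^2 = r^T b
print(b, R_mult)                           # b[1] = 0 and R = r1
```

The second coefficient vanishes and the multiple correlation collapses to r1, so X2 adds nothing once X1 is in the model.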

The ratio of the red arrow to the whole red line section is equal to the multiple correlation coefficient.

[Figure values: ρ(Y, X1) = 0.5, ρ(Y, X2) = 0.0, ρ(X1, X2) = -0.8; the multiple correlation of Y on (X1, X2) is 0.833.]

Although X2 and Y (shown by color) are not correlated, the coefficient of determination of Y from X1 and X2 is much larger than that of Y from X1 alone. A case when zero correlation is not useless.

[Figure: the correlation ellipse for ρ(X1, X2) = -0.8, touching the unit square at (1, -0.8) and (0.8, -1), with the point (0.5, 0) marked; R = 0.833.]

When the correlations among the explanatory variables are zero, the multiple correlation coefficient becomes √(r1² + r2²), or in general √(r1² + .. + rp²): additivity of the coefficients of determination.
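Both this additivity and the R = 0.833 example above follow from the standard identity R² = rᵀ Rxx⁻¹ r, where r holds the correlations of Y with the explanatory variables and Rxx their mutual correlations. A short sketch (the second pair of r values is an illustrative assumption):

```python
import numpy as np

def multiple_R(r_yx, R_xx):
    """Multiple correlation from pairwise correlations: R^2 = r^T R_xx^{-1} r."""
    r = np.asarray(r_yx, dtype=float)
    return float(np.sqrt(r @ np.linalg.solve(np.asarray(R_xx, dtype=float), r)))

# The example from the poster: rho(Y,X1)=0.5, rho(Y,X2)=0, rho(X1,X2)=-0.8.
print(round(multiple_R([0.5, 0.0], [[1, -0.8], [-0.8, 1]]), 3))  # 0.833

# Uncorrelated explanatory variables: R = sqrt(r1^2 + r2^2).
R2 = multiple_R([0.5, 0.6], [[1, 0], [0, 1]])
print(round(R2, 6))  # sqrt(0.25 + 0.36)
```

With Rxx the identity, the quadratic form reduces to r1² + .. + rp², which is exactly the additivity stated above.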

Those above are basically my original works, except for the definitions of correlations and regression analysis. The author sincerely welcomes any related literature information, and patrons! ANY REFERENCE?

▲ The “correlation ellipse” for a given ρ

 

This ellipse touches the unit square at (±1, ±ρ) and (±ρ, ±1). (The figure’s axes are labeled “ability” and “result”.)

 

 

This ellipsoid touches the unit cube at (±1, ±ρ12, ±ρ13), (±ρ12, ±1, ±ρ23), and (±ρ13, ±ρ23, ±1).

A trivial example of the correspondence between the shape of a distribution with N ≤ 5 and the correlation ellipse. Coincidentally, Spearman’s rank ρ is an integer multiple of 0.1 whenever all the values are distinct (with N ≤ 5).
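For N = 5 this coincidence can be confirmed by brute force over all 5! rank permutations, using the definition ρS = 1 − 6Σdᵢ²/(n(n² − 1)):

```python
from itertools import permutations
from fractions import Fraction

# Enumerate every attainable Spearman rank rho for n = 5 distinct values.
n = 5
denom = n * (n * n - 1)          # = 120
vals = {Fraction(denom - 6 * sum((p[i] - i) ** 2 for i in range(n)), denom)
        for p in permutations(range(n))}
print(sorted(float(v) for v in vals))
# every attainable value is an integer multiple of 0.1
```

The reason: for n = 5 the denominator is 120, and Σdᵢ² is always even, so ρS = 1 − Σdᵢ²/20 lands on multiples of 1/10.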

 

 

Nara Great Buddha

How can one distinguish things from multiple features? What does ‘distinguishing’ mean in the world? There seem to be several theoretical principles yet to be developed.

 

Recall Fisher’s z transformation

 

Much more theoretical development is required, especially for the higher-dimensional cases.

 

[Figure residue: a diagram linking observation (観察) and encoding (符号化), with axes/labels X, Y, ±1, r1, r2, r3, and a garbled formula involving arcsin( · /2), presumably the Pearson–Spearman relation under a Gaussian.]

The relation between ρ and Spearman’s rank ρ, for (infinitely many pairs of) X, Y forming a two-dimensional Gaussian distribution, is shown in red along 0 ≤ ρ ≤ 1. The difference is as small as less than 0.05.
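The red curve presumably follows the classical identity ρS = (6/π)·arcsin(ρ/2) for bivariate Gaussians; its maximum gap from ρ on [0, 1] is about 0.018, consistent with the “less than 0.05” statement:

```python
import numpy as np

# Pearson rho vs Spearman's rank rho under a bivariate Gaussian:
# rho_S = (6 / pi) * arcsin(rho / 2).
rho = np.linspace(0.0, 1.0, 100_001)
spearman = (6.0 / np.pi) * np.arcsin(rho / 2.0)
max_gap = float(np.max(np.abs(rho - spearman)))
print(round(max_gap, 3))  # about 0.018, well below 0.05
```

The gap peaks near ρ ≈ 0.6 and vanishes at both ρ = 0 and ρ = 1.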

 

Even when the deformations are strong, they often cause very small effects on ρ.

 

By choosing the explanatory variables intentionally among many candidate variables, you may freely invert the sign of the regression coefficient of any variable you want to invert.