Design of the model for the analysis of experimental data. Lecture 2: fitting a model to the data and estimating its parameters

  • Date posted: 01-May-2015

  • Slide 1
  • Design of the model for the analysis of experimental data. Lecture 2: fitting a model to the data and estimating its parameters
  • Slide 2
  • Data: (-2,16) (-1,7) (0,4) (1,6) (2,10). Model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$, where $x_1 = x$ and $x_2 = x_1^2$
  • Slide 3
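  • Data: (-2,16) (-1,7) (0,4) (1,6) (2,10). Model: $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$, where $x_1 = x$ and $x_2 = x_1^2$; $\varepsilon_i$ is the residual for the i-th observation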
  • Slide 4
  • The best-fitting model is the one that minimizes the sum of squared deviations between the observed values and the values predicted by the model, i.e. it minimizes $\sum_i (y_i - \hat{y}_i)^2$
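The least-squares criterion can be written out explicitly for the quadratic model used in this lecture. The following is a sketch in standard regression notation; the symbols $S$, $X$ and $b$ are assumptions here, since the slide's own formula did not survive in the transcript.

```latex
S(\beta_0, \beta_1, \beta_2)
  = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_{1i} - \beta_2 x_{2i} \right)^2
  = (y - X\beta)'(y - X\beta)

% Setting the partial derivatives of S to zero gives the normal equations,
% where X has one row (1, x_{1i}, x_{2i}) per observation:
X'X\,b = X'y
\qquad\Longrightarrow\qquad
b = (X'X)^{-1} X'y
```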
  • Slide 5
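  • How to do the calculations, where $x_1 = x$ and $x_2 = x_1^2$:
      (x,y) = (-2,16) => $y = \beta_0(1) + \beta_1(-2) + \beta_2(4) + \varepsilon_1 = 16$
      (x,y) = (-1,7) => $y = \beta_0(1) + \beta_1(-1) + \beta_2(1) + \varepsilon_2 = 7$
      (x,y) = (0,4) => $y = \beta_0(1) + \beta_1(0) + \beta_2(0) + \varepsilon_3 = 4$
      (x,y) = (1,6) => $y = \beta_0(1) + \beta_1(1) + \beta_2(1) + \varepsilon_4 = 6$
      (x,y) = (2,10) => $y = \beta_0(1) + \beta_1(2) + \beta_2(4) + \varepsilon_5 = 10$
      Column labels: $x_0$, $x_1$, $x_2$, $y$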
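The five equations can be collected into a single matrix equation, with the columns of the data matrix ordered $x_0$, $x_1$, $x_2$. This is a sketch in the usual notation $y = X\beta + \varepsilon$; the names $X$, $\beta$ and $\varepsilon$ are assumed rather than taken from the slide.

```latex
\underbrace{\begin{pmatrix} 16 \\ 7 \\ 4 \\ 6 \\ 10 \end{pmatrix}}_{y}
=
\underbrace{\begin{pmatrix}
 1 & -2 & 4 \\
 1 & -1 & 1 \\
 1 &  0 & 0 \\
 1 &  1 & 1 \\
 1 &  2 & 4
\end{pmatrix}}_{X}
\underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}}_{\beta}
+
\underbrace{\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{pmatrix}}_{\varepsilon}
```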
  • Slide 6
  • Transpose of the matrix X
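To make the transpose and the products built from it concrete, here is a minimal sketch in plain Python (standard library only, not the SAS used later in the lecture). The helper names `transpose` and `matmul` are illustrative and not part of the original slides.

```python
# Data from the slides: five (x, y) observations.
data = [(-2, 16), (-1, 7), (0, 4), (1, 6), (2, 10)]

# Design matrix X with columns x0 = 1, x1 = x, x2 = x^2.
X = [[1, x, x * x] for x, _ in data]
y = [obs for _, obs in data]

def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

Xt = transpose(X)                                          # 3 x 5 matrix X'
XtX = matmul(Xt, X)                                        # 3 x 3 matrix X'X
Xty = [sum(p * q for p, q in zip(row, y)) for row in Xt]   # vector X'y

print(Xt)    # [[1, 1, 1, 1, 1], [-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]]
print(XtX)   # [[5, 0, 10], [0, 10, 0], [10, 0, 34]]
print(Xty)   # [43, -13, 117]
```

The entries of X'X are simply the sums of powers of x (n, Σx, Σx², Σx³, Σx⁴), which is why the odd-power entries vanish for this symmetric design.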
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Inverse of the matrix X'X
  • Slide 15
  • $(X'X)^{-1}$ is called the inverse matrix of $X'X$. It is defined by the property $(X'X)^{-1}(X'X) = I$, where $I$ is the identity matrix
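One way to obtain the inverse by hand-style arithmetic is Gauss-Jordan elimination. The sketch below uses exact fractions so that no rounding occurs; the function name `invert` is hypothetical, and the values of X'X and X'y are those that follow from the five data points of this lecture.

```python
from fractions import Fraction

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination in exact arithmetic."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    # The right-hand half now holds the inverse.
    return [row[n:] for row in aug]

XtX = [[5, 0, 10], [0, 10, 0], [10, 0, 34]]   # X'X for x = -2, -1, 0, 1, 2
Xty = [43, -13, 117]                          # X'y for y = 16, 7, 4, 6, 10

XtX_inv = invert(XtX)
# Parameter estimates b = (X'X)^-1 X'y.
b = [sum(c * v for c, v in zip(row, Xty)) for row in XtX_inv]

print(XtX_inv)
print([float(v) for v in b])
```

The resulting estimates, 146/35, -13/10 and 31/14, agree with the values 4.1714, -1.3 and 2.2143 reported in the SAS output later in the lecture.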
  • Slide 16
  • Slide 17
  • Variance-Covariance Matrix
  • Slide 18
  • Estimate of the residual variance: $s^2$ = (sum of squared deviations) / (degrees of freedom for $s^2$)
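As a worked instance of this ratio, the sketch below computes the residual sum of squares and $s^2$ for the five data points. The coefficients are the least-squares estimates for these data, entered as fractions; the variable names are illustrative.

```python
data = [(-2, 16), (-1, 7), (0, 4), (1, 6), (2, 10)]
b0, b1, b2 = 146 / 35, -13 / 10, 31 / 14   # least-squares estimates for these data

# Residual = observed y minus the value predicted by the quadratic model.
residuals = [y - (b0 + b1 * x + b2 * x * x) for x, y in data]
sse = sum(e * e for e in residuals)        # sum of squared deviations
df = len(data) - 3                         # n observations minus 3 estimated parameters
s2 = sse / df                              # residual variance

print(round(sse, 4), df, round(s2, 4))     # 1.6571 2 0.8286
```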
  • Slide 19
  • Variance of the estimated parameters. Variance-covariance matrix:
  • Slide 20
  • Covariance of the estimated parameters. Variance-covariance matrix
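The variances and covariances of the estimates can be read off the matrix $s^2 (X'X)^{-1}$. The sketch below assumes that standard form (the slide's own formula is not preserved) and uses the inverse and $s^2$ that follow from the five data points.

```python
from math import sqrt

s2 = 29 / 35                       # residual variance, about 0.8286
XtX_inv = [[17 / 35, 0.0, -1 / 7],
           [0.0, 1 / 10, 0.0],
           [-1 / 7, 0.0, 1 / 14]]  # (X'X)^-1 for x = -2, -1, 0, 1, 2

# Variance-covariance matrix of the estimates: s^2 (X'X)^-1.
vcov = [[s2 * v for v in row] for row in XtX_inv]

# Diagonal entries are variances; their square roots are the standard errors.
se = [sqrt(vcov[i][i]) for i in range(3)]
# Off-diagonal entries are covariances, here between b0 and b2.
cov_b0_b2 = vcov[0][2]

print([round(v, 4) for v in se])   # [0.6344, 0.2878, 0.2433]
print(round(cov_b0_b2, 4))         # -0.1184
```

The three standard errors match the "Std Error of Estimate" column of the SAS output shown later.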
  • Slide 21
  • Confidence limits for $\beta_i$
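A sketch of the usual form of these limits, $b_i \pm t \cdot SE(b_i)$, which is assumed here because the slide's formula is missing from the transcript. The t value is the tabulated 97.5% point for 2 degrees of freedom (5 observations minus 3 parameters), and the standard errors are those reported in the SAS output of this lecture.

```python
T_975_2DF = 4.302653   # tabulated 97.5% point of Student's t with 2 degrees of freedom

estimates = {"b0": 146 / 35, "b1": -13 / 10, "b2": 31 / 14}
std_errors = {"b0": 0.63438867, "b1": 0.28784917, "b2": 0.24327695}

limits = {}
for name, b in estimates.items():
    # Half-width of the 95% interval for this parameter.
    half_width = T_975_2DF * std_errors[name]
    limits[name] = (b - half_width, b + half_width)
    print(name, round(limits[name][0], 4), round(limits[name][1], 4))
```

The interval for b0 coincides with the LOW_MEAN and UP_MEAN values that SAS reports at x = 0, because the predicted mean at x = 0 is the intercept itself.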
  • Slide 22
  • Variance of the predicted line. Suppose we want to predict y for a given value of x. The chosen value of x is called a. We can now write the equation as:
  • Slide 23
  • Example: a = -4 (note: should be -1.3)
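For a = -4 the variance of the predicted line can be worked out as a quadratic form. This is a sketch assuming the standard result $V(\hat{y}(a)) = s^2\, \mathbf{a}'(X'X)^{-1}\mathbf{a}$; the bold vector $\mathbf{a}$ is notation introduced here, not taken from the slide.

```latex
\hat{y}(a) = b_0 + b_1 a + b_2 a^2 = \mathbf{a}'b,
\qquad \mathbf{a} = (1,\; a,\; a^2)' = (1,\; -4,\; 16)'

\mathbf{a}'(X'X)^{-1}\mathbf{a}
  = \begin{pmatrix} 1 & -4 & 16 \end{pmatrix}
    \begin{pmatrix}
      \tfrac{17}{35} & 0 & -\tfrac{1}{7} \\
      0 & \tfrac{1}{10} & 0 \\
      -\tfrac{1}{7} & 0 & \tfrac{1}{14}
    \end{pmatrix}
    \begin{pmatrix} 1 \\ -4 \\ 16 \end{pmatrix}
  = \tfrac{17}{35} + \tfrac{16}{10} + \tfrac{256}{14} - \tfrac{32}{7}
  = 15.80

V(\hat{y}(-4)) = 15.80 \times 0.829 \approx 13.09,
\qquad SE(\hat{y}(-4)) \approx 3.62
```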
  • Slide 24
  • An alternative way of doing the calculation: $V(x+y) = V(x) + V(y) + 2\,Cov(x,y)$; $V(x-y) = V(x) + V(y) - 2\,Cov(x,y)$; $V(ax) = a^2 V(x)$; $Cov(ax,by) = ab\,Cov(x,y)$
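Applying these rules to the predicted value at x = a gives the same variance without matrix multiplication. The expansion below is a sketch; the variances and covariances of the estimates are taken from $s^2 (X'X)^{-1}$ for the five data points, where the covariances involving $b_1$ vanish because $\sum x = \sum x^3 = 0$.

```latex
V(\hat{y}(a)) = V(b_0 + a\,b_1 + a^2 b_2)
  = V(b_0) + a^2 V(b_1) + a^4 V(b_2)
    + 2a\,Cov(b_0, b_1) + 2a^2\,Cov(b_0, b_2) + 2a^3\,Cov(b_1, b_2)

% For these data: Cov(b_0, b_1) = Cov(b_1, b_2) = 0 and Cov(b_0, b_2) = -s^2/7
V(\hat{y}(a)) = s^2 \left( \tfrac{17}{35} + \tfrac{a^2}{10} + \tfrac{a^4}{14} - \tfrac{2a^2}{7} \right)

% At a = -4:
V(\hat{y}(-4)) = s^2 \left( \tfrac{17}{35} + \tfrac{16}{10} + \tfrac{256}{14} - \tfrac{32}{7} \right)
  = 15.80\, s^2 \approx 13.09
```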
  • Slide 25
  • The variance of a new observation of y. For a = -4: $V(y) = (1 + 15.80) \times 0.829 = 13.92$ and $SE(y) = 3.73$. The two contributions are the variance of the line and the variance of a new observation
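The numbers on this slide can be reproduced with a few lines of arithmetic. This is a sketch under the assumption that the variance of a new observation is $(1 + \mathbf{a}'(X'X)^{-1}\mathbf{a})\, s^2$, which is the form the slide's figures follow; the variable names are illustrative.

```python
from math import sqrt

s2 = 29 / 35                       # residual variance, about 0.829
XtX_inv = [[17 / 35, 0, -1 / 7],
           [0, 1 / 10, 0],
           [-1 / 7, 0, 1 / 14]]    # (X'X)^-1 for x = -2, -1, 0, 1, 2

a = -4
vec = [1, a, a * a]                # the row of the design matrix for x = a

# Quadratic form a'(X'X)^-1 a.
h = sum(vec[i] * XtX_inv[i][j] * vec[j] for i in range(3) for j in range(3))

var_line = h * s2                  # variance of the fitted line at x = a
var_new = (1 + h) * s2             # variance of a single new observation at x = a

print(round(h, 2), round(var_line, 2), round(var_new, 2), round(sqrt(var_new), 2))
# 15.8 13.09 13.92 3.73
```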
  • Slide 26
  • Confidence limits at a = -4: 95% confidence limits for the line; 95% confidence limits for single observations
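A sketch of both sets of limits at a = -4, assuming the usual form $\hat{y} \pm t \cdot SE$ with the tabulated t value for 2 degrees of freedom. The value 15.80 is the quadratic form quoted on the previous slide; the variable names are illustrative.

```python
from math import sqrt

T_975_2DF = 4.302653               # tabulated 97.5% point of Student's t, 2 d.f.
s2 = 29 / 35                       # residual variance, about 0.829
b = [146 / 35, -13 / 10, 31 / 14]  # least-squares estimates for the five data points
h = 15.8                           # a'(X'X)^-1 a at a = -4, as quoted on the slide

a = -4
y_hat = b[0] + b[1] * a + b[2] * a * a

se_line = sqrt(h * s2)             # standard error of the fitted line
se_new = sqrt((1 + h) * s2)        # standard error of a single new observation

line_limits = (y_hat - T_975_2DF * se_line, y_hat + T_975_2DF * se_line)
obs_limits = (y_hat - T_975_2DF * se_new, y_hat + T_975_2DF * se_new)

print(round(y_hat, 4))
print([round(v, 4) for v in line_limits])
print([round(v, 4) for v in obs_limits])
```

These limits agree with the first row (x = -4) of the extended SAS output shown later: 29.2321 to 60.3679 for the line and 28.7470 to 60.8530 for single observations.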
  • Slide 27
  • 95% confidence limits
  • Slide 28
  • How to do this in SAS?
  • Slide 29
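  • /* Read the five (x, y) observations */
    DATA eks21;
      INPUT x y;
      CARDS;
    -2 16
    -1 7
    0 4
    1 6
    2 10
    ;
    /* Fit y = b0 + b1*x + b2*x*x and print the parameter estimates */
    PROC GLM;
      MODEL y = x x*x / solution;
      /* Save predictions and 95% limits for the mean and for single observations */
      OUTPUT out=new p=yhat L95M=low_mean U95M=up_mean L95=low U95=upper;
    RUN;
    PROC PRINT;
    RUN;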
  • Slide 30
  • Number of observations in data set = 5

    General Linear Models Procedure
    Dependent Variable: Y

    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model               2       85.54285714    42.77142857      51.62    0.0190
    Error               2        1.65714286     0.82857143
    Corrected Total     4       87.20000000

    R-Square    C.V.        Root MSE      Y Mean
    0.980996    10.58441    0.91025899    8.60000000

    Source    DF    Type I SS      Mean Square    F Value    Pr > F
    X          1    16.90000000    16.90000000      20.40    0.0457
    X*X        1    68.64285714    68.64285714      82.84    0.0119

    Source    DF    Type III SS    Mean Square    F Value    Pr > F
    X          1    16.90000000    16.90000000      20.40    0.0457
    X*X        1    68.64285714    68.64285714      82.84    0.0119

    Parameter    Estimate        T for H0: Parameter=0    Pr > |T|    Std Error of Estimate
    INTERCEPT     4.171428571                     6.58      0.0224               0.63438867
    X            -1.300000000                    -4.52      0.0457               0.28784917
    X*X           2.214285714                     9.10      0.0119               0.24327695

    OBS     X     Y       YHAT    LOW_MEAN    UP_MEAN        LOW      UPPER
      1    -2    16    15.6286     11.9426    19.3145    10.2503    21.0068
      2    -1     7     7.6857      5.2988    10.0726     3.0991    12.2723
      3     0     4     4.1714      1.4419     6.9010    -0.6024     8.9453
      4     1     6     5.0857      2.6988     7.4726     0.4991     9.6723
      5     2    10    10.4286      6.7426    14.1145     5.0503    15.8068

    Annotations: $s^2$ is the Error Mean Square (0.82857143); $s$ is the Root MSE (0.91025899)
  • Slide 31
  • DATA eks21;
      INPUT x y;
      CARDS;
    -4 .
    -3.5 .
    -3 .
    -2.5 .
    -2 16
    -1.5 .
    -1 7
    -0.5 .
    0 4
    0.5 .
    1 6
    1.5 .
    2 10
    2.5 .
    3 .
    3.5 .
    4 .
    ;
    PROC GLM;
      MODEL y = x x*x / solution;
      OUTPUT out=new p=yhat L95M=low_mean U95M=up_mean L95=low U95=upper;
    RUN;
    PROC PRINT;
    RUN;
  • Slide 32
  • OBS       X     Y       YHAT    LOW_MEAN    UP_MEAN        LOW      UPPER
      1    -4.0     .    44.8000     29.2321    60.3679    28.7470    60.8530
      2    -3.5     .    35.8464     24.1430    47.5499    23.5050    48.1878
      3    -3.0     .    28.0000     19.6000    36.4000    18.7318    37.2682
      4    -2.5     .    21.2607     15.5647    26.9568    14.3481    28.1733
      5    -2.0    16    15.6286     11.9426    19.3145    10.2503    21.0068
      6    -1.5     .    11.1036      8.5369    13.6702     6.4210    15.7862
      7    -1.0     7     7.6857      5.2988    10.0726     3.0991    12.2723
      8    -0.5     .     5.3750      2.7660     7.9840     0.6691    10.0809
      9     0.0     4     4.1714      1.4419     6.9010    -0.6024     8.9453
     10     0.5     .     4.0750      1.4660     6.6840    -0.6309     8.7809
     11     1.0     6     5.0857      2.6988     7.4726     0.4991     9.6723
     12     1.5     .     7.2036      4.6369     9.7702     2.5210    11.8862
     13     2.0    10    10.4286      6.7426    14.1145     5.0503    15.8068
     14     2.5     .    14.7607      9.0647    20.4568     7.8481    21.6733
     15     3.0     .    20.2000     11.8000    28.6000    10.9318    29.4682
     16     3.5     .    26.7464     15.0430    38.4499    14.4050    39.0878
     17     4.0     .    34.4000     18.8321    49.9679    18.3470    50.4530
  • Slide 33
  • A more complex problem: fit a model to these data
  • Slide 34
  • DATA polynom;
      INPUT x y;
      CARDS;
    0 8.62
    10 -3.99
    20 6.80
    30 -7.70
    40 3.44
    50 12.01
    60 23.37
    70 9.25
    80 34.93
    90 70.05
    100 126.70
    ;
    DATA add;
      SET polynom;
      x2 = x**2;
      x3 = x**3;
      x4 = x**4;
    PROC REG;
      MODEL y = x x2 x3 x4;
    RUN;
  • Slide 35
  • The SAS System    08:22 Tuesday, October 29, 2002    1

    The REG Procedure
    Model: MODEL1
    Dependent Variable: y

    Analysis of Variance
    Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model      4             15449     3862.13306      56.59
    ...

    Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
    Intercept     1               8.92923           7.90689       1.13      0.3019
    x             1              -1.90184           1.21774      -1.56      0.1694
    x2            1               0.09562           0.05335       1.79      0.1232
    x3            1              -0.00165        0.00082091      -2.01      0.0917
    x4            1            0.00000999        0.00000407       2.45      0.0495

    Fourth-order polynomial
  • Slide 36
  • The SAS System    08:22 Tuesday, October 29, 2002    2

    The REG Procedure
    Model: MODEL1
    Dependent Variable: y

    Analysis of Variance
    Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model      3             15037     5012.44667      42.75
    ...

    Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
    Intercept     1               1.73490           9.62511       0.18      0.8621
    x             1               0.59619           0.87649       0.68      0.5182
    x2            1              -0.02928           0.02099      -1.39      0.2057
    x3            1            0.00035168        0.00013776       2.55      0.0379

    Third-order polynomial
  • Slide 37
  • The SAS System    08:22 Tuesday, October 29, 2002    3

    The REG Procedure
    Model: MODEL1
    Dependent Variable: y

    Analysis of Variance
    Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model      2             14273     7136.65872      36.03
    ...

    Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
    Intercept     1              14.39524          10.72255       1.34      0.2163
    x             1              -1.41540           0.49888      -2.84      0.0219
    x2            1               0.02347           0.00480       4.88      0.0012

    Second-order polynomial
  • Slide 38
  • The SAS System    08:22 Tuesday, October 29, 2002    4

    The REG Procedure
    Model: MODEL1
    Dependent Variable: y

    Analysis of Variance
    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model               1        9547.03680     9547.03680      13.61    0.0050
    Error               9        6310.97089      701.21899
    Corrected Total    10             15858

    Root MSE           26.48054    R-Square    0.6020
    Dependent Mean     25.77091    Adj R-Sq    0.5578
    Coeff Var         102.75361

    Parameter Estimates
    Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
    Intercept     1             -20.81000          14.93704      -1.39      0.1970
    x             1               0.93162           0.25248       3.69      0.0050

    First-order polynomial (a straight line)
  • Slide 39
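  • True relationship: $y = 5 + 0.1x - 0.02x^2 + 0.0003x^3 + \varepsilon$, with $\varepsilon$ normally distributed with mean 0 and $\sigma = 10$. Estimated relationship (second order): $y = 14.395 - 1.415x + 0.0235x^2$, $s = 14.07$. Estimated relationship (first order): $y = -20.81 + 0.932x$, $s = 26.48$. The second-order polynomial is a better fit than the straight line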
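The comparison between polynomial orders can be reproduced outside SAS. The following is a sketch in plain Python that fits each order by solving the normal equations in exact fractions and reports the residual standard deviation s; the function names `solve` and `polyfit` are hypothetical, and this is not how PROC REG computes its results internally.

```python
from fractions import Fraction
from math import sqrt

xs = list(range(0, 101, 10))
ys = [Fraction(v) for v in ("8.62", "-3.99", "6.80", "-7.70", "3.44", "12.01",
                            "23.37", "9.25", "34.93", "70.05", "126.70")]

def solve(a, rhs):
    """Solve the linear system a x = rhs by Gauss-Jordan elimination (exact)."""
    n = len(a)
    aug = [list(row) + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def polyfit(order):
    """Return (coefficients, residual standard deviation s) for a polynomial order."""
    k = order + 1
    X = [[Fraction(x) ** p for p in range(k)] for x in xs]   # columns 1, x, x^2, ...
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(k)]
    b = solve(XtX, Xty)                                      # normal equations
    sse = sum((y - sum(c * v for c, v in zip(b, r))) ** 2 for r, y in zip(X, ys))
    s = sqrt(sse / (len(xs) - k))                            # residual d.f. = n - k
    return [float(c) for c in b], s

for order in (1, 2, 3, 4):
    coeffs, s = polyfit(order)
    print(order, [round(c, 5) for c in coeffs], round(s, 2))
```

Exact fractions are used because powers of x up to 100⁴ make the normal equations badly scaled, which can cost accuracy in floating-point arithmetic.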
  • Slide 40
  • Matrix Notation. "Of particular interest to us is the fact that not even in regression analysis was much use made of matrix algebra. In fact one of us, as a statistics graduate student at Cambridge University in the early 1950s, had lectures on multiple regression that were couched in scalar notation! This absence of matrices and vectors is surely surprising when one thinks of A.C. Aitken. His two books, Matrices and Determinants and Statistical Mathematics, were both first published in 1939, had fourth and fifth editions, respectively, in 1947 and 1948, and are still in print. Yet, very surprisingly, the latter makes no use of matrices and vectors, which are so thoroughly dealt with in the former. There were exceptions, of course, as have already been noted, such as Kempthorne (1952) and his co-workers, e.g. Wilk and Kempthorne (1955, 1956), and others, too. Even with matrix expressions available, arithmetic was a real problem. A regression analysis in the New Zealand Department of Agriculture in the mid-1950s involved 40 regressors. Using electromechanical calculators, two calculators (people) using row echelon methods needed six weeks to invert the 40 x 40 matrix. One person could do a row, then the other checked it (to a maximum capacity of 8 to 10 digits, hoping for 4- or 5-digit accuracy in the final result). That person did the next row and passed it to the first person for checking; and so on. This was the impasse: matrix algebra was appropriate and not really difficult. But the arithmetic stemming therefrom could be a nightmare." (From "Linear Models 1945-1995" by Shayle R. Searle and Charles E. McCulloch, in Advances in Biometry (eds. Peter Armitage and Herbert A. David), John Wiley & Sons, 1996)