Scientific Computations 1_Gallopoulos

SCIENTIFIC COMPUTING I
Efstratios Gallopoulos, Professor
ΗΥ343, Department of Computer Engineering and Informatics, University of Patras
Fall 2008
© 2008, Efstratios Gallopoulos

http://scgroup.hpclab.ceid.upatras.gr/class/sc.html. ( , , , , , on-line , , , .) . . . . ( RISC SVD). , .. cos 1 . 2 : 1 . , . , . . (. : 1992), 2 (. : , 1976). matrix . , , , ( ) array table. , polyval.m MATLAB POLYVALM Matrix polynomial evaluation. If V is a vector whose elements are the coecients of a polynomial, then POLYVALM(V,X) is the value of the polynomial evaluated with matrix argument X. See POLYVAL for1 , . - Strang . 2 .

4 polynomial evaluation in the regular or array sense. , Mathematica, , . 3 . , , . 4 . , /. , , : ) 110 (2 .): , ) 240 (4 .): . 261 (3 .) ( ) 205 ( ). , , () ( ) G. Strang, / (1996) .. .. , / (1997). ( ) MATLAB ( version 7). , . , , Scilab MATLAB , ( http://www.scilab.org/)! : 1) G. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, third edition, 1996. 2) N.J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, 2002, 2nd. ed. C.W. Ueberhuber. Numerical Computation, volumes 1 and 2. Springer, Berlin, 1997. . , . . . . , . ( ). . 3 , matrix, , 2 (. xxxiv). 4 . 1xvii 2 .

5 , , , , , . . (). . . , , , , . , , , , , , , , . , , , , , , . a L TEX . . . , . , , , . , . . 2008

Contents

1
2
3
    3.2.2 [...] bit
    3.4.4 Fused Multiply and Add (FMA)
    3.4.5 Java
4
    4.2.6 BLAS
5 [...] II
    5.9.1 Cholesky
6 [...] III
    6.1 QR
    6.2.2 Gram-Schmidt
    6.2.3 [...] GS
    6.2.4 Householder
    6.3 QR: Householder
    6.3.1 QR
    6.3.2 QR, Householder
    6.3.3 QR
    6.4 Givens
    6.4.2 Givens
    6.4.4 QR, Givens
7 [...] IV
    7.2.2 Vandermonde
    7.2.3 Toeplitz
    7.2.4 Toeplitz
8
    8.2.3 Euler
    8.2.4 Taylor, Runge-Kutta, Richardson
    [...] Lipschitz

1

1.1

[...] 1980 [...] Future Directions in Computational Mathematics, Algorithms and Scientific Software:

The use of modern computers in scientific and engineering research and development over the last three decades has led to the inescapable conclusion that a third branch of scientific methodology has been created. It is now widely acknowledged that, along with the traditional theoretical and experimental methodologies, advanced work in all areas of science and technology has come to rely critically on the computational approach. ([33])

It is becoming clear that dramatic increases in computing power are necessary but insufficient to making high-performance computing a reality. Necessary is also the construction of a large body of applications capable of using that computational power effectively. ([3], Alpern and Carter (2))

It is essential to recognize the fact that computer experiments can both be a two-way bridge between Physical Experiments and Mathematical Models, as well as an independent source of physical understanding. Such experiments have a mind-bending potential for future explorations of nature's secrets, which is only vaguely recognized today. (Jackson (3) [23])

[...] (Michel Serres (4) [35]) [...] computational science and engineering [...] Computational Science and Engineering [12]. Golub and Ortega (5) [14, p. 2]:

Scientific computing is the collection of tools, techniques, and theories required to solve on a computer mathematical models of problems in science and engineering.

[...] Mathematical Modelling [4, p. 220]. [...]

2. Bowen Alpern and Larry Carter: computer scientists, IBM Yorktown Heights. Carter: University of California, San Diego (UCSD).
3. Atlee Jackson: Center for Complex Systems Research, Beckmann Center, University of Illinois at Urbana-Champaign; Santa Fe Institute.
4. Michel Serres [...]
5. Gene Golub: Stanford. James Ortega: University of Virginia.


1.1: [26] 1 2 3 4 5 6 7 8 9 10 . . . * . . . * * * . . . . * * . . . . * . * . * * * . . . * * . . . . . . . . . * . . * . . . . . * . * * . . . . . . * . * . . * * . * . * . * . . . . * . . * * * * * * * . . . / * * * . . . . * * * * * . * . . . . . . . * * * * * * * . . * . . * . . * . . . . . .1. 3. 5. 7. 9. (FFT, ) (=multigrid) Monte Carlo 2. . / . . . 4. . 6. 8. 10.

1.2 : 1) , 2) (, ), 3) , 4) . 1.1, [26], . . [26] ( ) 6 . , . 1) (restructuring compiler) 2) .. . 1. , 6

(= legacy) (= dusty-deck) .

8

1. c 2008, . 2. .

1.1 . , (= computational kernels). , Fourier . Fourier ( FFT) ( Gauss ). . ( ) . . 1.2.1. () Fourier , - - Fourier . 7. , . ( 7 ) . [24, 34]. 1.1 , . , .. , . ( ) . - (=input-output tables). 8 . (= derivatives) [6]. , Grand Challenges of Computational Science [22]. Wassily Leontief ( ) [28] () .8 7


- - [11] .

www.cs.sandia.gov/tech_reports/ripryor/Aspen.html.

1.3 : 1. 2. 3. . , . John Rice (Purdue University)9 : What is an Answer? : , , . , 10 . , . , . , . 1) , 2) , 3) (..) , . 4) , .. , , . , . 5) . , . , . . () ... HERMIS, 1996. (particle methods).10 9


[...] Gauss [...] Hotelling [19] [...] n [...] 4n [...] John von Neumann (11) [...] [5]:

In the elimination method a series of n compound operations is performed each of which depends on the preceding. An error at any stage affects all succeeding results and may become greatly magnified; this explains roughly why instability should be expected. It should be noticed that at each step a division is performed by a number whose size cannot be estimated in advance and which might be so small that any error in it would be greatly magnified by division...

[...] John Wilkinson: almost every statement in it is either wrong or misleading. [...] von Neumann, Herman Goldstine [...] von Neumann (Turing) [...] Gauss [13] [...] [17]. (Oscar Wilde) [...]

As soon as an Analytical Engine exists, it will necessarily guide the future course of science. Whenever any result is sought by its aid, the question will then arise - By what course of calculation can these results be arrived at by the machine in the shortest time? .... Charles Babbage [Passages from the Life of a Philosopher, 1864]

[...]

.

1.3. c 2008, . 13 (. + . + .) (vectorizing compilers) BLAS1 BLAS3 FFT

11

O(1) pipe (1) (. .) (n/ log n)

1.2:

) . 12 1.2. . . , , . , (.. , ) 14 . : .. RAM PRAM . (benchmarks) . Linpack benchmark , .

. , . , , . . , .. ACM, LAPACK, .. .12 The Federal High Performance Computing Program 1989. 13 Alpern Carter performance programming [3]. 14 Beresford Parlett [30].

12 :

1. c 2008, .

, , , . . , , , : 1. . 2. . 3. (.. RAM) . 1.3.1. Fourier ( (n2 ) (n log n) ) : n, , . 1.3.2. Strassen, (nlog 7 ) O(n3 ), . : , Strassen ! .
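Example 1.3.1 contrasts an $n^2$-type and an $n \log n$-type operation count for the Fourier transform. The following is a minimal Python sketch of that contrast (the notes themselves use MATLAB); the function names `dft_naive` and `fft_radix2` are illustrative and not from the notes, and the recursive version assumes that n is a power of two.

```python
import cmath


def dft_naive(x):
    """Direct evaluation of y(j) = sum_k x(k) * exp(-2*pi*i*k*j/n).

    Every output needs n multiply-adds, so the total work grows like n^2.
    """
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * k * j / n) for k in range(n))
        for j in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 FFT (assumes len(x) is a power of two).

    Each level does O(n) work and there are log2(n) levels.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # transform of the even-indexed samples
    odd = fft_radix2(x[1::2])    # transform of the odd-indexed samples
    half = n // 2
    y = [0j] * n
    for j in range(half):
        t = cmath.exp(-2j * cmath.pi * j / n) * odd[j]
        y[j] = even[j] + t
        y[j + half] = even[j] - t
    return y
```

Both functions compute the same transform; only the amount of arithmetic differs, which is the point of the example.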

1.4 . .

1.4.1

. . , . :


1. RISC: [...] (register files), [...] LOAD-STORE, pipelining [...] [8] RISC [...] URL http://www.ee.siue.edu/ mvinant/g_info/cpu_hist.htm#RISC.

2. [...] chip (single-chip processor), pins [...] chip [...] single-chip (15) [...] chip [...] on-chip [...] off-chip. [...] on-chip: (register files), (instruction cache), (data cache). [...]

... the operation count is not necessarily an adequate figure of merit in comparing theoretically the value of algorithms in numerical analysis [ . . . ] Other factors, such as [ . . . ] the pattern in which memory banks of the computer are referenced, may be as important as the operation count in determining the speed of a program... [18]

15. [...] single-chip [...] pin-bandwidth limited.

3. [...] (superscalar)

14

1. c 2008, . , (= clusters), (networks of workstations = NOW) (Grid) .

. [16]. . ( ) , , , (.. RISC, ). , , . , SIMD = Single Instruction Multiple Data) streaming Intel, (GPU = graphics processing units).

1.4.2 , , . , , . , , ( ) . . , , , . , - (= semantic gap) . , , . ( .. [7, 21]) () (= Problem Solving Environments) ( . [21]). ELLPACK [20] , , . ELLPACK . , . , , Mathematica [36], Maple [2], Matlab [1], Scilab [15]). -


scripting (16) [29] (Python [27]) [25]. [...] Fortran. [...] C ([...] C++) [...] Fortran [...] Fortran-90 [...] John Backus (17):

I don't know what the technical characteristics of the standard language for scientific and engineering computation in the year 2000 will be ... but I know it will be called Fortran.

[...] [10] [...] [31] The Influence of the Compiler on the Cost of Mathematical Software - in Particular on the Cost of Triangular Factorization. [...] (18) [...]

17. [...] (1924-2007) [...] Fortran [...] BNF.
18. [...] [32, 9].


1.5 1.1 (- ) ,

$A \leftarrow A + x y^T$    (1.1)

A n x, y n. : 1. Fortran ( ), 2. 3. . ) (1.1) Unix dtime user system ) () ... (Mop/s) n = 30 n = 800.
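The rank-1 update (1.1) and its Mflop/s rate can be sketched as follows. This is a minimal Python illustration, not the Fortran program with `dtime` that the exercise asks for; the names `rank1_update` and `mflops` are hypothetical, the matrix is a plain list of lists, and the flop count of $2n^2$ (one multiplication and one addition per entry) is the assumption behind the rate.

```python
import time


def rank1_update(A, x, y):
    """In-place update A <- A + x*y^T for an n-by-n list-of-lists matrix.

    Each entry costs one multiply and one add, so the total is 2*n*n flops.
    """
    for i, row in enumerate(A):
        xi = x[i]
        for j in range(len(row)):
            row[j] += xi * y[j]
    return A


def mflops(n, repeats=3):
    """Time `repeats` rank-1 updates of order n and return the Mflop/s rate."""
    A = [[0.0] * n for _ in range(n)]
    x = [1.0] * n
    y = [2.0] * n
    start = time.perf_counter()
    for _ in range(repeats):
        rank1_update(A, x, y)
    elapsed = max(time.perf_counter() - start, 1e-12)  # guard against a zero reading
    return 2.0 * n * n * repeats / (elapsed * 1e6)
```

Calling `mflops(n)` over a range such as n = 30, ..., 800 gives the kind of rate-versus-size curve the exercise asks for; the absolute numbers from interpreted Python will be far below those of compiled Fortran.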

1.6 1.6.1. ; . [, . 1.1] , , . 1.6.2. ; 3 . . [, . 1.1] ( - - ) . , (.. , ..) : ) , ) Fourier, ) . 1.6.3. ; . [, . 1.3] ) , ) , ) . 1.6.4. .

[Figure 1.1: (a) time in sec versus n and (b) Mflop/s versus n, for n up to 800, on an SGI Indigo-2 R4400 @ 250 MHz, 2 MB cache.]


. [, . 1.3] 1) , 2) , 3) (..) , . 4) , .. , , . , . 5) . 1.6.5. . . [, . 1.3] LAPACK n. n 1000. 1.6.6. / . . [, . 1.4] ) RISC, ) , ) , .

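1.7
1.7.1. $p(x) = \sum_{j=0}^{n-1} \alpha_j x^j$, $\alpha_{n-1} \neq 0$.
a) The values of $p$ at $m$ points $\zeta_1, \ldots, \zeta_m$ are the entries of a matrix-vector product $V a$. [...]
b) Horner [...] MATLAB [...]: $V a$ is computed without forming $V$. [...]
c) $n = m$ [...] $V a$ in $O(n \log n)$ [...]

a) Vandermonde:

$$
V = \begin{pmatrix}
1 & \zeta_1 & \zeta_1^2 & \cdots & \zeta_1^{n-1} \\
1 & \zeta_2 & \zeta_2^2 & \cdots & \zeta_2^{n-1} \\
1 & \zeta_3 & \zeta_3^2 & \cdots & \zeta_3^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \zeta_m & \zeta_m^2 & \cdots & \zeta_m^{n-1}
\end{pmatrix},
\qquad
a = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_{n-1} \end{pmatrix}
$$

[...] $m$ [...] $\zeta_1, \zeta_2, \ldots, \zeta_m$ [...] Vandermonde [...]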


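b) MATLAB [...], where a(i) holds $\alpha_{i-1}$ and z holds the points:

    g = a(n)*ones(m, 1);
    for i = n-1:-1:1
        g = g.*z + a(i)*ones(m, 1);   % one Horner step, applied to all m points at once
    end

c) Fourier [...] $x$ [...] $n$:

$$y(j) = \sum_{k=0}^{n-1} x(k)\, e^{-i 2\pi k j / n}, \qquad j = 0, \ldots, n-1.$$

[...] Vandermonde [...]:

$$z = (\zeta_1, \zeta_2, \zeta_3, \ldots, \zeta_m) = \left(1,\; e^{-i 2\pi (1)/n},\; e^{-i 2\pi (2)/n},\; \ldots,\; e^{-i 2\pi (n-1)/n}\right)$$

With these points, $V a$ is the Fourier transform of $a$, computable in $O(n \log n)$.

1.7.2. [...] $p(z) = \sum_{j=0}^{n-1} \alpha_j x^j$, $\alpha_{n-1} \neq 0$ [...] $V a$ [...] $8 n \log n$ [...] lyes-mach [...] lno-mach [...] lno-mach [...] $V a$ [...] $2n^2$ [...] lno-mach.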
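The equivalence used in Exercise 1.7.1 (Horner's rule at m points gives the same result as the product $V a$, without ever forming $V$) can be checked with a small Python sketch. The notes use MATLAB; the names `horner_at_points` and `vandermonde_times` are illustrative only, and coefficients are stored in ascending order, so `a[i]` multiplies `x**i`.

```python
def horner_at_points(a, z):
    """Evaluate p(x) = a[0] + a[1]*x + ... + a[n-1]*x**(n-1) at every point in z.

    Mirrors the vectorised MATLAB loop: one multiply and one add per
    coefficient and point, with no Vandermonde matrix stored.
    """
    g = [a[-1]] * len(z)
    for coeff in reversed(a[:-1]):
        g = [gi * zi + coeff for gi, zi in zip(g, z)]
    return g


def vandermonde_times(a, z):
    """Form the m-by-n Vandermonde matrix V explicitly and return V*a."""
    n = len(a)
    V = [[zi ** j for j in range(n)] for zi in z]
    return [sum(vij * aj for vij, aj in zip(row, a)) for row in V]
```

The explicit product needs the $m \times n$ matrix in memory and roughly $2mn$ flops on top of the cost of forming the powers; the Horner form needs only the two vectors.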

1.8
1.8.1. [...] Fourier [...] $x$ [...] $n$:

$$y(j) = \sum_{k=0}^{n-1} x(k)\, e^{-i 2\pi k j / n}, \qquad j = 0, \ldots, n-1.$$

[...] Vandermonde [...] $x$. a) [...] Vandermonde [...] Fourier [...] Fourier


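[...] b) fft [...] MATLAB [...]: flops [...] n = 128 : 4 : 512. [...] tic, toc, etime, cputime. [...] $O(n^2)$ [...] $O(n \log n)$. [...] MATLAB [...] Vandermonde:

    function V = vand_fft(n)
    % VAND_FFT  n-by-n Vandermonde matrix of the discrete Fourier transform.
    v = exp(-2*pi*sqrt(-1)*(0:n-1)'/n);   % nodes exp(-2*pi*i*(j-1)/n), j = 1, ..., n
    V = zeros(n);
    V(:, 1) = ones(n, 1);
    for i = 2:n
        V(:, i) = V(:, i-1).*v;           % column i holds v.^(i-1)
    end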

1.8.2. 1.3 ( ). . MATLAB for-loops . - . . prole nd. n = 500 : 100 : 20000 m = 50 : 50 : 200, n = 50 : 50 : 300 . m, n. . . . for-loops ( prole). 1.4. - . 1.2 ( , 100 m, n). MATLAB .


    tic; n = 20000; rand('state', 0); figure;
    for k = 1:1:n, A(k) = k; end
    for k = 1:1:n, B(k) = round(rand(1)*n); end
    for k = 1:1:n, C(k) = A(k) + B(k); end
    for k = 1:1:n, plot(B(k), C(k), '.r'); hold on; end
    hold off; toc

tic;m=100;n=200;rand(state,0); figure; for j=1:1:n for i=1:1:m A(i,j)=rand(1); end end for i=1:1:m for j=1:1:n B(i,j)=rand(1); end end for i=1:1:m for j=1:1:n C(i,j) = A(i,j) + B(i,j); end end for i=1:1:m for j=1:1:n if C(i,j)>0.5, C(i,j)=1; elseif C(i,j)0.5))=1; C(find(C==0))=-10; C(find(C0))=0; [i, j]=find(C==-10); plot(i,j,.k); hold on; [i, j]=find(C==0); plot(i,j,.y); hold on; [i, j]=find(C==1); plot(i,j,.m); hold on; title(Given);hold off;toc

1.4: [...] 1.8.2

[Figure 1.2: execution time (sec) for code 1 and for optimized code 1 versus n, and for code 2 and for optimized code 2 versus m and n (1.8.2).]

[1] MATLAB: The Language of Technical Computing. http://www.mathworks.com/products/matlab/.
[2] http://www.maplesoft.com/, 2007.
[3] B. Alpern and L. Carter. Performance programming: A science waiting to happen. In U. Vishkin, editor, Developing a Computer Science Agenda for High-Performance Computing. ACM Press, New York, 1994.
[4] R. Aris. Mathematical Modelling Techniques. Dover, Mineola, NY, 1994 (originally published in 1974).
[5] V. Bargmann, D. Montgomery, and J. von Neumann. Solution of linear systems of high order. In A.H. Taub, editor, John von Neumann Collected Works, volume V. Pergamon, Oxford, UK, 1963.
[6] E. Barucci, L. Landi, and U. Cherubini. Computational methods in finance: Option pricing. IEEE Computational Science & Engineering Mag., pages 66-80, Spring 1996.
[7] R.F. Boisvert and E.N. Houstis, editors. Computational Science, Mathematics and Software. Purdue University Press, 1999.
[8] L. Carter. RISC from a performance programmer's perspective. Invited talk at RISC in 1995 Symposium, 1995. Available from URL http://www-cse.ucsd.edu/users/carter/ppbib.html.
[9] L. DeRose, K. Gallivan, E. Gallopoulos, B. Marsolf, and D. Padua. FALCON: A MATLAB Interactive Restructuring Compiler. In C.-H. Huang, et al., editor, Lecture Notes in Computer Science: Languages and Compilers for Parallel Computing, pages 269-288. Springer-Verlag, New York, 1995.
[10] J. J. Dongarra, F. G. Gustavson, and A. Karp. Implementing linear algebra algorithms for dense matrices on a vector pipeline machine. SIAM Rev., 26(1):91-111, January 1984.
[11] From Quadnet. Economic modeling from the ground up. IEEE Parallel and Distributed Technology Mag., page 80, Summer 1996.
[12] E. Gallopoulos and A.H. Sameh. CSE: Content and product. IEEE Computational Science & Engineering Mag., 4(2):39-43, 1997.
[13] H.H. Goldstine. The Computer from Pascal to von Neumann. Princeton Univ. Press, Princeton, 5th edition, 1993.
[14] G. Golub and J.M. Ortega. Scientific Computing: An Introduction with Parallel Computing. Academic Press, Inc., San Diego, CA, 1993.
[15] Scilab Group. Scilab home page, 2007. Online at http://www.scilab.org/index.php.
[16] J.L. Hennessy and D.A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San Mateo, CA, first edition, 1990.
[17] N.J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, 2nd edition, 2002.
[18] R. Hockney. Computers, compilers, and Poisson solvers. In U. Schumann, editor, Computers, Fast Elliptic Solvers, and Applications: Proc. GAMM Workshop, 1977.
[19] H. Hotelling. Some new methods in matrix calculation. Ann. Math. Statist., 14(1):1-34, 1943.
[20] E.N. Houstis, T.S. Papatheodorou, and J.R. Rice. Parallel ELLPACK: An expert system for the parallel processing of partial differential equations. In Intelligent Mathematical Software Systems, pages 63-73. North-Holland, Amsterdam, 1990.
[21] E.N. Houstis, J.R. Rice, E. Gallopoulos, and R. Bramley, editors. Enabling Technologies For Computational Science: Frameworks, Middleware, and Environments. Kluwer, 2000.
[22] Grand Challenges: High Performance Computing and Communications. A report by the committee on Physical, Mathematical, and Engineering Sciences. Office of Science and Technology Policy, 1991.
[23] E. A. Jackson. A first look at the second metamorphosis of science. Technical Report CCSR-95-1, Santa Fe Institute, 1995.
[24] W.J. Kaufmann III and L.L. Smarr. Supercomputing and the Transformation of Science. Scientific American Library, New York, 1993.
[25] G. Kollias and E. Gallopoulos. Jylab: A system for portable scientific computing over distributed platforms. In E-SCIENCE '06: Proceedings of the Second IEEE International Conference on e-Science and Grid Computing, page 97, Washington, DC, USA, 2006. IEEE Computer Society.
[26] D.J. Kuck, E. S. Davidson, D. L. Lawrie, and A.H. Sameh. Parallel supercomputing today and the Cedar approach. Science, 231:967-974, February 1986.
[27] H. P. Langtangen. Python Scripting for Computational Science. Springer, 2006.
[28] W. Leontief. The Structure of the American Economy. 1945.
[29] J. K. Ousterhout. Scripting: Higher-level programming for the 21st century. Computer, 31(3):23-30, 1998.
[30] B. N. Parlett. Progress in numerical analysis. SIAM Rev., 20(3):443-455, July 1978.
[31] B. N. Parlett and Y. Wang. The influence of the compiler on the cost of mathematical software - in particular on the cost of triangular factorization. ACM TOMS, 1(1):35-46, March 1975.
[32] C. Polychronopoulos, M. Girkar, M. Haghighat, C-L. Lee, B. Leung, and D. Schouten. Parafrase-2: An environment for parallelizing, synchronizing, and scheduling programs on multiprocessors. International J. High Speed Computing, 1(1), May 1989.
[33] W. C. Rheinboldt. Computational Modeling and Mathematics Applied to the Physical Sciences. Washington DC, 1984.
[34] J.R. Rice. Computational science and the future of computing research. IEEE Computational Science and Engineering Magazine, pages 35-41, Winter 1995.
[35] M. Serres. [...] Le Monde Diplomatique, (197), Nov. 2001.
[36] S. Wolfram. Mathematica: A System for Doing Mathematics by Computer. Addison-Wesley, Boston, second edition, 1991.

2

model: [...] 2. [...] 6. [...] 7. [...]

model: I. Representation of structure. 2e. A simplified description of a system, process, etc., put forward as a basis for theoretical or empirical understanding; f. (Math) A set of entities that satisfies all the formulae of a given formal or axiomatic system. .... [The New Shorter Oxford English Dictionary]

What is a model? the term mathematical model ... will be used for any complete and consistent set of mathematical equations which is thought to correspond to some other entity, its prototype. The prototype may be a physical, biological, social, psychological or conceptual entity... Being derived from modus (a measure) the word model implies a change of scale in its representation ... In so far as the prototype is a physical or natural object, the mathematical model represents a change on the scale of abstraction. Certain particularities will have been removed and simplifications made in obtaining the model. - Rutherford Aris [4, Chapter 1].

Models. ... It is customary nowadays, for example, to refer to a computer model of the atmosphere, even though this consists of nothing more than a programme for manipulating observed measurements of temperature, pressure, humidity, etc., according to the dynamical equations of meteorology. The notion of a model thus extends into a purely symbolic domain, where there is only an abstract similarity between the original system and its model. - John Ziman [17, Chapter 2.12].

[...] - Claude Levi-Strauss [11, p. 38].

2. c 2008, . , , , , . , - . . [15, . 263].

, , . . , .

2.1

[...] [Pierre Simon de Laplace, Theorie analytique des probabilites.]

One shouldn't always include all the effects in a mathematical model; a huge simulation of the exact equations (even if one knows them) may be no more enlightening than the experiments that led to these equations. There are virtues in simplicity, even in caricature (...) Solving is not the same as simulating. [...] Our Models are Our Metaphors [...] Princeton [...] American Academy of Arts and Sciences, Philip Holmes (SIAM News, June 2002.)

[...] modus [...] modello [...] Unesco ([...], 1972). [...]

2.1. c 2008, . 29 .) , , . (.. , , , ) (.. ). - (.. ) , , . , , . , , , , .1 , . , , , , . . 2 (, ) . , , .. , , , . , 3 . /. , 4 . , . , , Unesco ( , , 1972). . 3 : , , [ , . .] 4 Simulation as a source of new knowledge The Sciences of the Articial [12] - - Carnegie Mellon ..., Herbert Simon ( ).2 1


2.1: .

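[...] 2.1.1. Evariste Galois [...] 1830 [...] 5 [...]

2.1.2. [...] Newton:

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}.$$

[...] $f'$ [...]

$$x_{i+1} = x_i - f(x_i)\,\frac{x_i - x_{i-1}}{f(x_i) - f(x_{i-1})}.$$

[...] Taylor:

$$f(x_{i-1}) = f(x_i) + f'(x_i)(x_{i-1} - x_i) + \frac{f''(\xi)}{2}(x_{i-1} - x_i)^2,$$

$$f'(x_i) \approx \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}} \qquad (2.1)$$

[...] $f'(x_i)$. [...] (2.1) [...] Newton [...]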
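Example 2.1.2 replaces the derivative in Newton's iteration by the difference quotient (2.1). The two iterations can be compared with a small Python sketch; the function names, tolerances and stopping rule below are illustrative assumptions, not part of the notes.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton iteration x <- x - f(x)/f'(x); needs the derivative df."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x


def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Secant iteration: Newton with f'(x_i) replaced by the difference
    quotient (f(x_i) - f(x_{i-1})) / (x_i - x_{i-1})."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:          # difference quotient undefined; stop
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, x1 = x1, x2
        if abs(x1 - x0) < tol:
            break
    return x1
```

For example, with `f = lambda x: x*x - 2` both `newton(f, lambda x: 2*x, 1.0)` and `secant(f, 1.0, 2.0)` approach the square root of 2; the secant version needs two starting values but no derivative.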

2.1. c 2008, . 31 , (2.1) . 2.1.3. . . . / ( ) , , . . /. 2.1.1. , .. , . . 2.1.2. , . , , MIT, Paul Krugman, . , Krugman . ( . [9, . 47]): . - , - . , , :- , , , . , ( ). , , , , , .


2.1.1

The study of computational complexity requires that one agrees on a model of computation, normally called a machine model, for effectuating algorithms. Unfortunately many different machine models have been proposed in the past, ranging from theoretical devices like the Turing machine to more or less realistic models of the random access machines and parallel computers... [16]

[...] (artifacts) [...]

2.1.3. [...] David Deutsch (5) [...] The Fabric of Reality - The Science of Parallel Universes and its Implications [5]:

... What makes the general theory of relativity so important is not that it can predict planetary motions a shade more accurately than Newton's theory can, but that it reveals and explains previously unsuspected aspects of reality, such as the curvature of space and time. This is typical of scientific explanation.... But the ability of a theory to explain what we experience is not its most valuable attribute. Its most valuable attribute is that it explains the fabric of reality itself. ... Yet some philosophers - and even some scientists - disparage the role of explanation in science. To them, the basic purpose of a scientific theory is not to explain anything, but to predict the outcomes of experiments: its entire content lies in its predictive formulae. ... This view is called instrumentalism because it says that a theory is no more than an instrument for making predictions...

5. David Deutsch [...] Dirac [...]


2.2

[...] RAM [...] RAM (= Random Access Machine). ([...] [3] [...]) [...]

2.2.1. [...] Horner [...] $p(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n$. [...]:

    s = a_n
    for i = n-1 : -1 : 0
        s = s*x + a_i
    end

[...] $T(n) = 2n$.

2.2.1. [...] RASP (= Random Access Stored Program) [...] uniform cost criterion [...] (= straight-line) [...] branch [...] (6) [...]

6. [...] RAM [...]


/, : , 7 . , 1.1 (- ) ( RAM ) ! 8 . , , .

2.2.1

I gradually and slowly found out that there were two things to talk about; the fact that knowledge is acquired, so to speak, by memory; but that when you know anything, memory doesn't come in. At any moment that you are conscious of knowing anything, memory plays no part.... You have a sense of the immediate... - Gertrude Stein [14, p. 152].

Finally, we can combine LOAD and STORE into the arithmetic operations by replacing sequences such as { LOAD a; ADD b; STORE c } by c ← a + b... - A. Aho, J. Hopcroft and J. Ullman [3].

RAM / : , , , . : load/store.7 . 8 RASP !


K M K . , , / ... . load , . load (0)

.

[...] RAM [...] RAM [...]

2.2.2. [...] Horner [...]:

    load x, a_n
    s = a_n
    for i = n-1 : -1 : 0
        load a_i
        s = s*x + a_i
    end
    store s

2.2.2. . , K , k = 0, ..., K 2m(k) f (k), m f 0 [2]. , . , .
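The load/store version of Horner's rule in Example 2.2.2 can be instrumented to count both kinds of work. The sketch below is a Python illustration under the assumptions of that example (x and every coefficient are loaded once, the result is stored once); the function name `horner_counted` and its return convention are hypothetical.

```python
def horner_counted(a, x):
    """Horner's rule with explicit operation and memory-transfer counters.

    a[i] is the coefficient of x**i and the degree is n = len(a) - 1.
    Returns (value, flops, transfers).
    """
    n = len(a) - 1
    transfers = 2            # load x, load a_n
    s = a[n]
    flops = 0
    for i in range(n - 1, -1, -1):
        transfers += 1       # load a_i
        s = s * x + a[i]     # one multiplication and one addition
        flops += 2
    transfers += 1           # store s
    return s, flops, transfers
```

For a polynomial of degree n this reports 2n floating-point operations and n + 3 memory transfers, the two counts quoted for Horner's rule in the text that follows.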


: ... (ops). , , ... ... .

: .

min : . . . 2.2.1. n , min n + 1. . , , load n. , store. , m , min = n + m. 2.2.3. Horner, 2.2.2 = 2n = n + 3. , min n + 3 = min . / Mop/s: Million Floating Point Operations Per Second . , Mops. [8], Million Floating Point OperationS. . . , . , , , min , . , T

T

= =

T + T + ,

T

=

T

1+

,


:= / . , min = min / ...

T = T

1+

T

1 + min

, . . 4. (.

= 0), T = T . ,

RAM, ... . , . . ; , RAM. . (.. , ) (.. ) : IBM RS/6000 DEC Alpha 21064 ... .


9 . , . 2.2.3. ( , ) . min . ( bandwidth) , , . Bmax Mbytes/sec. , 8 bytes (. IEEE), Mop/sec

max :=

Bmax . 8min

, max Mop/sec. , , . .. . min , . , ( ) . 2.2.4. , ( = prefetching). , , . : : . runtime system. (=explicit) : . , T , T , . : 1. P1 , P2 P2 P1 .9 / (timers, monitors), .. .


2. [...] P1 [...] P2 [...] P2 [...] P1 [...]
3. [...] [1].

[...] Todd Mowry [...] Stanford ([...] Carnegie Mellon) [...] Tolerating Latency Through Software-Controlled Data Prefetching (10) [...]:

This dissertation proposes and evaluates a new compiler algorithm for inserting prefetches into code. (...) The algorithm can prefetch both dense-matrix and sparse-matrix codes, thus covering a large fraction of scientific applications (...) The results of our detailed architectural simulations demonstrate that the speed of some applications can be improved by as much as a factor of two, both on uniprocessor and multiprocessor systems...

[...] MATLAB [...] (11) [...]

2.2.5. [...] MFLOPS [...] [13, 6]. [...] MFLOPS [...] 1970 [...] (Livermore loops), Linpack benchmark, Perfect, SPEC. [...] [7] [...] D. Kuck [10]. [...] Linpack benchmark [...] URL www.netlib.org/benchmark/ [6]. [...] on-line [...] SPEC (= System Performance Evaluation Cooperative) [...] URL www.spec.org.

2.3 2.3.1. () C C + AB C Rn1 n2 , A Rn1 n3 B Rn3 n2 . , min = www.cs.cmu.edu/ tcm/thesis/thesis_tech.html MATLAB, - (predenition) .11 10


min / ... min .1. min n1 = n2 = 1 n3 = n. 2. min n2 = n3 = 1 n1 = 1. 3. min n1 = n2 = n3 = n. 4. min min . . 1. min = 2. min = 3. min =min min min

= = =

2n+2 2n 4 2

=2 =2 n

4n2 2n3

4. 2.2.3. 2.3.2 (, , 03-makeup). A Rnn , x Rn , R , I y = (A I)x. , min , ... ( ) O(n) ( LOAD) ( STORE). . :

y = (A I)x = Ax x min = n2 + 2n + 1. O(n) , A. : LOAD , x for i = 1 : n LOAD A(i, :)

yi = A(i, :)x xiend STORE y 2n2 + n, 2 min = n +2n+1 . 2n2 2.3.3. for i = 1:n, y(i) = a*x(i)+y(i), end. b = 3. . rem r = rem(n, m) m, n , . n = pm + r , 0 r m 1, :


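    b = 3;                        % unrolling depth
    r = rem(n, b);                % leftover iterations, 0 <= r <= b-1
    for i = 1:r                   % clean-up loop for the first r entries
        y(i) = a*x(i) + y(i);
    end
    for i = r+1:b:n               % main loop, unrolled b = 3 times
        y(i)   = a*x(i)   + y(i);
        y(i+1) = a*x(i+1) + y(i+1);
        y(i+2) = a*x(i+2) + y(i+2);
    end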
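The same unrolling pattern (a short clean-up loop for the remainder, then a main loop that handles three entries per pass) can be written as a Python sketch. The name `axpy_unrolled3` is illustrative; in interpreted Python the unrolling brings no real speed benefit, so the sketch only demonstrates that the index arithmetic covers every entry exactly once.

```python
def axpy_unrolled3(alpha, x, y):
    """Compute y <- alpha*x + y in place with the loop body unrolled 3 times."""
    n = len(x)
    r = n % 3                      # entries left over after groups of three
    for i in range(r):             # clean-up loop
        y[i] += alpha * x[i]
    for i in range(r, n, 3):       # main loop: three updates per iteration
        y[i] += alpha * x[i]
        y[i + 1] += alpha * x[i + 1]
        y[i + 2] += alpha * x[i + 2]
    return y
```

Because n - r is a multiple of three, the main loop never reads past the end of the vectors.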

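2.4
2.4.1. [...] $p_n(x) = \prod_{j=1}^{n} (x - \rho_j)$. [...] MATLAB [...] $p_n(x)$. [...]:

    a(2) = 1; a(1) = -r(1); a(3:n+1) = 0;      % p_1(x) = x - r(1); a(i) multiplies x^(i-1)
    for j = 2:n
        t(2:j+1) = a(1:j); t(1) = 0;           % t holds x * p_{j-1}(x)
        a(1:j+1) = t(1:j+1) - r(j)*a(1:j+1);   % p_j(x) = (x - r(j)) * p_{j-1}(x)
    end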
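The update in the MATLAB loop above (shift the coefficient vector to multiply by x, then subtract the root times the old vector) can be mirrored in a short Python sketch. The function name `poly_from_roots` is hypothetical, and coefficients are returned in ascending order of the power of x, as in the MATLAB vector a.

```python
def poly_from_roots(roots):
    """Coefficients (ascending powers) of the monic polynomial prod_j (x - roots[j])."""
    a = [1.0]                                  # the constant polynomial 1
    for r in roots:
        shifted = [0.0] + a                    # coefficients of x * p(x)
        scaled = [r * c for c in a] + [0.0]    # coefficients of r * p(x)
        a = [s - t for s, t in zip(shifted, scaled)]
    return a
```

For the roots 1 and 2 this gives `[2.0, -3.0, 1.0]`, that is $x^2 - 3x + 2$. Each new root costs a number of operations proportional to the current degree, so n roots cost on the order of $n^2$ operations.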

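2.5
2.5.1. [...] MATLAB [...] $k$ [...]

$$A_j = \mathrm{rand}(2^{k-j+1},\, 2^{k-j}), \qquad j = 1, \ldots, k$$

[...] $A_j$ [...] eval, num2str, rand. [...] $n = 2^k$ [...] $k = 1 : 10$. [...] $B_k = A_1 A_2 A_3 \cdots A_k$.

1. [...] $\Omega_a$ [...] $B_k$ [...] $n$ [...]: $B_k = (\ldots((A_1 A_2)A_3) \ldots A_{k-1})A_k$.
2. [...] $\Omega_\delta$ [...] $B_k$ [...] $n$ [...]: $B_k = A_1(A_2(A_3 \ldots (A_{k-1}A_k)) \ldots)$.
3. [...] $\Omega_\delta/\Omega_a$ [...] $n$.
4. [...] MATLAB [...] B = A1 A2 A3 . . . Ak [...] MATLAB [...]
5. [...] (3) [...]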

. k (k = 10 ) :

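    for j = 1:k
        eval(['A' num2str(j) ' = rand(2^(k-j+1), 2^(k-j));']);
    end

1. [...] $C \in \mathbb{R}^{m \times n}$, $D \in \mathbb{R}^{n \times k}$ [...] $m(2n-1)k = 2mnk - mk$ [...]:

Step 1, $(A_1 A_2)$: $2n \cdot \frac{n}{2} \cdot \frac{n}{4} - n \cdot \frac{n}{4} = \frac{n^3}{4} - \frac{n^2}{4}$.
Step 2, $(A_1 A_2 A_3)$: $2n \cdot \frac{n}{4} \cdot \frac{n}{8} - n \cdot \frac{n}{8} = \frac{n^3}{16} - \frac{n^2}{8}$.
Step 3, $(A_1 A_2 \ldots A_4)$: $2n \cdot \frac{n}{8} \cdot \frac{n}{16} - n \cdot \frac{n}{16} = \frac{n^3}{64} - \frac{n^2}{16}$.
...
Step $k-1$, $(A_1 A_2 \ldots A_k)$: $2n \cdot \frac{n}{2^{k-1}} \cdot \frac{n}{2^{k}} - n \cdot \frac{n}{2^{k}} = \frac{n^3}{2^{2(k-1)}} - \frac{n^2}{2^{(k-1)+1}}$.

Step $j$ costs $\frac{n^3}{2^{2j}} - \frac{n^2}{2^{j+1}}$, so with $n = 2^k$:

$$\Omega_a = \sum_{j=1}^{k-1} \left( \frac{n^3}{2^{2j}} - \frac{n^2}{2^{j+1}} \right) = \frac{n^3 - 4n}{3} - \frac{n^2 - 2n}{2} = \frac{n^3}{3} - \frac{n^2}{2} - \frac{n}{3}$$

2. [...]:

Step 1, $(A_{k-1} A_k)$: $\frac{n^2}{2^{2(k-2)}} - \frac{n}{2^{k-2}}$ $(= 12)$.
...
Step $k-3$, $(A_3 \ldots A_{k-1} A_k)$: $2 \cdot \frac{n}{4} \cdot \frac{n}{8} - \frac{n}{4} = \frac{n^2}{16} - \frac{n}{4}$.
Step $k-2$, $(A_2 \ldots A_{k-1} A_k)$: $2 \cdot \frac{n}{2} \cdot \frac{n}{4} - \frac{n}{2} = \frac{n^2}{4} - \frac{n}{2}$.
Step $k-1$, $(A_1 \ldots A_{k-1} A_k)$: $2n \cdot \frac{n}{2} - n = n^2 - n$.

Step $j$ costs $\frac{n^2}{2^{2(k-j-1)}} - \frac{n}{2^{k-j-1}}$, so:

$$\Omega_\delta = \sum_{j=1}^{k-1} \left( \frac{n^2}{2^{2(k-j-1)}} - \frac{n}{2^{k-j-1}} \right) = \frac{4n^2 - 16}{3} - (2n - 4) = \frac{4n^2}{3} - 2n - \frac{4}{3}$$

3. [...] $\Omega_\delta/\Omega_a$ [...]:

    Omega_a     = @(n) n.^3/3 - n.^2/2 - n/3;     % left to right
    Omega_delta = @(n) 4*n.^2/3 - 2*n - 4/3;      % right to left
    k = 2:8;                                       % k = 1 involves no product
    n = 2.^k;
    plot(n, Omega_delta(n)./Omega_a(n), '-');
    xlabel('n'); ylabel('\Omega_\delta/\Omega_a');

[Figure 2.2: $\Omega_\delta/\Omega_a$ versus n (2.5.1).]

[...] 2.2 [...] 4. [...] MATLAB [...] 2.3. [...] MATLAB [...]!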


k_max=10;
n=2.^(2:k_max);
t_left=zeros(k_max-1, 1); t_right=zeros(k_max-1, 1); t_mat=zeros(k_max-1, 1);
for k=2:k_max,
    an=sprintf('Matrix multiplication for n=%d', n(k-1)); disp(an);
    for j=1:k, eval(['A' num2str(j) '=rand(2^(k-j+1), 2^(k-j));']); end
    % left to right
    for m=1:100,
        tic;
        left_ex=A1;
        for j=2:k, left_ex=left_ex*eval(strcat('A', num2str(j))); end
        t_left(k-1)=t_left(k-1)+toc;
    end
    t_left(k-1)=t_left(k-1)/100;
    % right to left
    for m=1:100,
        tic;
        right_ex=eval(strcat('A', num2str(k)));
        for j=k-1:-1:1, right_ex=eval(strcat('A', num2str(j)))*right_ex; end
        t_right(k-1)=t_right(k-1)+toc;
    end
    t_right(k-1)=t_right(k-1)/100;
    % the expression A1*A2*...*Ak handed to MATLAB as a whole
    for m=1:100,
        tic;
        mat_ex='A1';
        for j=2:k, mat_ex=strcat(mat_ex, '*', strcat('A', num2str(j))); end
        mat_ex=strcat(mat_ex, ';');
        eval(mat_ex);
        t_mat(k-1)=t_mat(k-1)+toc;
    end
    t_mat(k-1)=t_mat(k-1)/100;
end

5. 2.4 , .

[Figure 2.3: execution time (sec) against n for the three evaluation orders: left to right, right to left, matlab (Exercise 2.5.1).]

[Figure 2.4: ratio δ/a against n (Exercise 2.5.1).]


[1] R.C. Agarwal, F.G. Gustavson, and M. Zubair. Improving performance of linear algebra algorithms for dense matrices, using algorithmic prefetch. IBM J. Res. Develop., 38(3):265–275, 1994.
[2] A. Aggarwal, B. Alpern, A. K. Chandra, and M. Snir. A model of hierarchical memory. In Proc. Nineteenth Annual ACM Symposium on Theory of Computing, pages 305–314, May 1987.
[3] A. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.
[4] R. Aris. Mathematical Modelling Techniques. Dover, Mineola, NY, 1994 (originally published in 1974).
[5] D. Deutsch. The Fabric of Reality. Penguin, 1997.
[6] R. Giladi. Evaluating the MFLOPS measure. IEEE Micro, pages 69–75, Aug. 1996.
[7] J.L. Hennessy and D.A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San Mateo, CA, first edition, 1990.
[8] R.W. Hockney. The Science of Computer Benchmarking. SIAM, Philadelphia, 1996.
[9] P. Krugman. . , 2000.
[10] D.J. Kuck. The Structure of Computers and Computations. Wiley, 1978.
[11] C. Lévi-Strauss. La Pensée sauvage. Plon, Paris, 1962.
[12] H. A. Simon. The Sciences of the Artificial. MIT Press, Cambridge, Mass., second edition, 1981.
[13] J.E. Smith. Characterizing computer performance with a single number. Comm. ACM, pages 1202–1206, Oct. 1988.
[14] Gertrude Stein. How Writing is Written. Black Sparrow Press, Los Angeles, 1974.
[15] . . . , , 1974.
[16] P. van Emde Boas. Machine models and simulations. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science. Volume A: Algorithms and Complexity, chapter 1, pages 1–66. The MIT Press, Cambridge, MA, 1990.
[17] J. Ziman. An Introduction to Science Studies. Cambridge University Press, Cambridge, 1984.

3

(...) , (...) (...) . Edgard Morin [29]. 1.1, - - , ! , 2.1, , : . : .. , , . . , (.. ), . , , . , , 1) , 2) . , 47

48 3. c 2008, . , . . 1 . Nick Higham2 ([17]) . , .

3.1 x x.

$E_{abs}(\hat{x}) = |x - \hat{x}|, \qquad E_{rel}(\hat{x}) = \frac{|x - \hat{x}|}{|x|}.$
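As a small illustration of the two definitions, the following sketch evaluates the absolute and relative error of the approximation 22/7 to π. The choice of x and xhat is only an example; the same pair appears in the table of Exercise 3.8.11.

```matlab
% Absolute and relative error of an approximation xhat to x.
x = pi;            % value being approximated
xhat = 22/7;       % approximation (illustrative choice)
Eabs = abs(x - xhat);             % about 1.264e-03
Erel = abs(x - xhat) / abs(x);    % about 4.025e-04
fprintf('Eabs = %.4e, Erel = %.4e\n', Eabs, Erel);
```

The relative error is the more informative of the two whenever the size of x is far from 1, since it does not depend on the scaling of x.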

|x| , . . , ( ) ; , , , . , x x ( ). . , ( . ). , , , ( ). x x x x , . Eabs () = x x Erel () = x . x x , , , , .1 2

. Manchester.

3.2. c 2008, . 49 3.1.1. p1 = [1, 0, , 0] R1000 p2 = [103 , , 103 ] R1000 x x1 x2 . 1 = p1 1 = p2 1 x1 2 : 1000 x. , , . , Cn x Cn ,

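$y = |x|$ means $y_i = |x_i|$, $i = 1:n$. For $x, y \in \mathbb{R}^n$,

$x \le y$ means $x_i \le y_i$, $i = 1:n$. The same conventions are used for matrices in $\mathbb{R}^{m\times n}$ and $\mathbb{C}^{m\times n}$. For an approximation $\hat{x}$ of x, the componentwise relative errors are $|x_1 - \hat{x}_1|/|x_1|, \ldots, |x_n - \hat{x}_n|/|x_n|$, and the relative componentwise error is

$\max_i \frac{|x_i - \hat{x}_i|}{|x_i|}.$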

(normwise analysis) (normwise analysis), .

3.2 Floating point arithmetic is by nature inexact, and it is not difficult to misuse it so that the computed answers consist almost entirely of noise. One of the principal components of numerical analysis is to determine how accurate the results of certain numerical methods will be. Donald Knuth [22]
I would be afraid to fly in an airplane that was designed with floating point arithmetic. (1960) Alston Householder
bytes (...). ... / 3 3 . L.N. Trefethen, Predictions for Scientific Computing 50 years from now

50 3. c 2008, . . , /. , , ..., . ... F R

$y = \pm m \times \beta^{e-t}, \qquad (3.1)$

t . , F . F ... F . = 2 (.. = 16 IBM/360) = 10. t m . , m ( y ), m/ t < 1 , y . mantissa 4 ( 2) y . Knuth mantissa : ... but it is an abuse of terminology to call the fraction part a mantissa, since the concept has quite a dierent meaning in connection with logarithms. Furthermore the English word mantissa means a worthless addition. [22, page 199]. : ... (3.1) bytes, .. 4 8 /. bit , () m, e. F . F M = mmax emax t = mmin emin t . , , F . F , . 32 64 bits [, M ].4 . mantis(s)a , , . . , , mantissa ( ). mantissa . .

3.2. c 2008, . 51 (3.1) ... e t. , t F . , F , Wilkinson:

, t m , (emin , emax ) e, . emin e emax , , F(, t, emin , emax ). : . . . .

3.2.1 , F F . R F

:R F (x) F F x, () . ( continuum) . G F

G := {x R : |x| M } {0} R , M ... . (x). :

x F (x) = x, . x. x G x F , (x) = x. x (x) F x. (x) = x, . , .


R G

x1

x2

x4

x3

F

3.1: R F .

x G , |x| > M |x| < x = 0 . (x) F . / , F 5 3.1 . 3.2.1. = 10 t = 2 . x = 1.9, y = 0.66 z = x y = 1.254. (z) = 1.2 (z) = 1.3. 0.0431 = |1.254 1.2|/1.254 0.0367 = |1.254 1.3|/1.254. u = 12 /2 = 0.05 . ( . 3.4.4), ( ) /. Cause when you are up you are up, and when you are down you are down, but when you are only half way up you are neither up or down! [The Grand Old Duke of York, Mother Goose Nursery Rhymes.] , . x G x F . , . (x) ... y F x:

$\hat{y} = fl(x) \iff \hat{y} = \arg\min_{y \in F} |y - x|.$

(y , y+ ), y , y+ F x. (x), 5

R .

x (y , y+ ). , , . y y+ .
3.2.2. ... $F(10, 4, -9, 9)$, $x_1 = 0.10005$, $x_2 = 0.10015$: $fl(x_1) = 0.1000$, $fl(x_2) = 0.1002$.
3.2.3. MATLAB SciLab, eps 1+ 1. , 2 . 16-

expression                     value     hex
1 + eps/2                      1         3ff0000000000000
1 + eps                        1.0000    3ff0000000000001
1 + eps + eps/2                1.0000    3ff0000000000002
1 + 2 eps                      1.0000    3ff0000000000002
1 + 2 eps + eps/2 + eps/4      1.0000    3ff0000000000002

0. . , , (y , y+ ) x y , y+ F , $fl(x) = \mathrm{sign}(x)\max(|y_-|, |y_+|)$.
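The hexadecimal values in Example 3.2.3 can be reproduced with num2hex. This is a minimal sketch, assuming IEEE double precision with round-to-nearest, ties-to-even, which is what MATLAB uses on common hardware; the variable names are illustrative.

```matlab
% Each sum is evaluated left to right, with one rounding per addition.
vals = {1 + eps/2, 1 + eps, 1 + eps + eps/2, 1 + 2*eps, 1 + 2*eps + eps/2 + eps/4};
hexs = cell(size(vals));
for i = 1:numel(vals)
    hexs{i} = num2hex(vals{i});   % 16 hex digits of the stored double
    disp(hexs{i});
end
% 1 + eps/2 lies halfway between 1 and 1 + eps and goes to 1 (even last bit);
% (1 + eps) + eps/2 lies halfway between 1 + eps and 1 + 2*eps and goes to 1 + 2*eps.
```

Both halfway cases resolve to the neighbour whose last mantissa bit is 0, which is why the third and fourth rows of the table coincide.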

3.2.2 bit ... , .. y = 0.d1 0000 e y = 0.0d1 0000 e+1 . . , m ( ) , . y R F d1 = 0. = 2, d1 = 0 d1 = 1. , 2 ( ), bit m 1. , . , , ( ) bit. 3.2.1. bit ( ). , .., bit . bit / , bit .

54 3. c 2008, . y F

$y = \pm \beta^{e} \times .d_1 d_2 \cdots d_t, \qquad 0 \le d_i \le \beta - 1, \quad d_1 \ne 0.$

3.2.3 ... ... F , y = 0 F . : y F y = 0

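$\beta^{e_{min}-1} \le |y| \le \beta^{e_{max}}(1 - \beta^{-t}).$

Let $z \in G$, $z > 0$, and let $z_-, z_+ \in F$ be the neighbouring floating-point numbers with $z \in [z_-, z_+]$. If $\beta^{e-1} \le z < \beta^{e}$, then $z_- = m \times \beta^{e-t}$ and $z_+ = (m+1) \times \beta^{e-t}$, so that $z_+ - z_- = \beta^{e-t}$. With rounding to nearest,

$|z - fl(z)| \le \frac{\beta^{e-t}}{2},$

and therefore, for every $z \in G$,

$\left|\frac{z - fl(z)}{z}\right| \le \frac{\beta^{e-t}/2}{\beta^{e-1}} = \frac{\beta^{1-t}}{2} = u.$

The quantity u is the unit roundoff. For $z \in G$,

$fl(z) = z(1 + \delta), \quad |\delta| < u. \qquad (3.2)$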

3.2. c 2008, . 55 ... ( ) ... (wobbling)

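$|m \times \beta^{e-t} - (m+1) \times \beta^{e-t}| = \beta^{e-t}$ for exponent e,
$|m \times \beta^{e+1-t} - (m+1) \times \beta^{e+1-t}| = \beta^{e+1-t}$ for exponent e+1.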

3.2.4 . 1+x = 0 x6 ! , ... () ... x .. 0 x < 1+x = 0. , , y = 1+5.5511 1017 , y ... 1. 3.2.4. MATLAB 6.5, , :

>> p1 = 5.5511e-017
>> 1+p1
ans = 1
>> (1+p1==1)
ans = 1

... - ... - ..., x, ..., x+ , . ... x < x+ ... . , ... x , (x, x+ ) = x+ x, ... x+ . . , y1 y2 , , , [y1 , y2 ] . ... : 3.2.1. , M , 1 ..., .M

$\epsilon_M := d(1, 1^+). \qquad (3.3)$

6 , !

, : ... 1+ 1:

$\epsilon_M := \arg\min_{\epsilon > 0}\{fl(1 + \epsilon) > 1\} \qquad (3.4)$

, (3.3). , M ... ... t 1 , 1 1 + 21t , 3.2.1 M = 21t . 3.2.2. u = 2t . , M = 2u. 3.2.3. , (.. MATLAB) ( : , , ):

... . , . . 3.2.4. Fortran 90 M , EPSILON. 3.2.5. M . MATLAB eps. , Toshiba 320CDT Pentium II :

t=1.0;
while (1.0+t > 1.0)
    t=t/2.0;
end
t=t*2.0;

>> eps
eps = 2.2204e-16
>> 1+eps > 1
ans = 1
>> 1+eps/2 > 1
ans = 0

3.2.1. 3.2.3 MATLAB eps M . ,

3.2. c 2008, . 57 MATLAB. Pentium MATLAB

> test = (eps/2*(1+eps) +1 >1) test = 1 e1 = eps/2*(1+eps) 1+ e1 >1 e1 < eps. eps MATLAB 1+eps > 1. M ; 3.2.6. M . MATLAB:

EPS Floating point relative accuracy. EPS is a permanent variable whose value is initially the distance from 1.0 to the next largest floating point number. EPS may be reassigned any value. EPS is used as a default tolerance by PINV and RANK. 3.2.7.

EPS MATLAB . , ... . , 7 ,

>> floor(0.75/0.25)
ans = 3
>> floor(0.075/0.025)
ans = 2

, ... (.. IEEE), .

>> 0.75/0.25
ans = 3
>> 0.075/0.025
ans = 3.0000
>> v = 3-0.075/0.025
v = 4.4409e-16

Here v = 2 eps. In hexadecimal format (16 digits):

>> format hex
>> 0.25
ans = 3fd0000000000000
>> 0.75/0.25
ans = 4008000000000000
>> 0.075
ans = 3fb3333333333333
>> 0.025
ans = 3f9999999999999a
>> 0.075/0.025
ans = 4007ffffffffffff

The computed quotient 0.075/0.025 is slightly smaller than 3, so floor returns 2.

3.3 ... ( ) / . . F m et , {+, , , /}, x, y F x y F . R . 3.3.1. ... t 2t . 3.3.2. M F , , F . . ... /. 8 ; , , , ... . , (ALU) . , Pentium bug Intel , . , (.. ). , , 8 ... ! , .

3.3. c 2008, . 59 . , . x, y F ,

$x \odot y = fl(x \circ y) \in F \qquad (3.5)$

. R (. x y ) . , (=exact rounding). , . , z = x y z . , . , , , (guard digit). 3.3.3. ... = 2, t = 3 u = 213 /2 = 1/8.

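With a guard digit:

$2^{1} \times 0.100 - 2^{0} \times 0.111 = 2^{1} \times 0.100 - 2^{1} \times 0.011|1 = 2^{1} \times 0.0001 = 2^{-2} \times 0.100$

Without a guard digit:

$2^{1} \times 0.100 - 2^{0} \times 0.111 = 2^{1} \times 0.100 - 2^{1} \times 0.011 = 2^{1} \times 0.001 = 2^{-1} \times 0.100$

(. (3.5) ),

$\left|\frac{2^{-2} \times 0.1 - 2^{-1} \times 0.1}{2^{-2} \times 0.100}\right| = 1 = 8u$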

(3.5). bits , (3.5) bits ( , (guard, round digits) sticky bit) . .4 [16], [12]. [23]

60 3. c 2008, . IEEE. 3.3.1. ( Cray C-90) . 3.3.2. : (x y) = (1 + )x (1 + )y, ||, || u

, R +, . , x, y, z R , . 0 x + y R . 1 : x + y = y + x. 2 x + (y + z) = (x + y) + z . 3 0 x + 0 = x x R . 4 x R x R , . x + (x) = 0. 0 x y R . 1 : x y = y x. 2 : x (y z) = (x y) z . 3 1 x 1 = x x R . 4 x 1 1 x R x ( x ) = 1. : x (y + z) = x y + x z. R F ... (3.5). . 1, 3, 4 1, 3. 0 0 F . 3.3.4. 2:

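$\hat{t}_1 = fl(x + y) = (x + y)(1 + \delta_1)$, $\hat{t}_2 = fl(\hat{t}_1 + z) = (\hat{t}_1 + z)(1 + \delta_2) = ((x + y)(1 + \delta_1) + z)(1 + \delta_2)$, $|\delta_j| \le u$.

$\hat{s}_1 = fl(y + z)$, $\hat{s}_2 = fl(x + \hat{s}_1) = (x + (y + z)(1 + \epsilon_2))(1 + \epsilon_1)$, $|\epsilon_j| \le u$.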
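The two summation orders can differ in practice. A minimal sketch in IEEE double precision, with illustrative values for x, y, z:

```matlab
% Floating-point addition is not associative.
x = 0.1; y = 0.2; z = 0.3;
t2 = (x + y) + z;        % left-to-right grouping
s2 = x + (y + z);        % right-to-left grouping
same = (t2 == s2);       % false (0) in IEEE double precision
d = abs(t2 - s2);        % the results differ in the last bit
fprintf('%.17g\n%.17g\n', t2, s2);
```

Both results are accurate to within a few units of u, in agreement with the error expressions above, yet they are different floating-point numbers.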

3.3. c 2008, . 61 , t2 = s2 ... . 3.3.5. 4: , x F z = 1 1 1 ( x ) = x (1 + 1 ) x z = x x (1 + 1 )(1 + 2 ) . MATLAB 1 y 200 1 ( y ) y = 1:

index = [];
for i=1:200
    if ((1/i)*i ~= 1) index = [index i]; end;
end;

On an NEC Versa SX (Pentium 2) the result is x = [49, 98, 103, 107, 161], for which (1/x) x ≠ 1. F , R . , .. , . , ... x1 , x2 x3 , x4 x1 + x2 + x3 + x4 ((x1 +x2 )+x3 )+x4 (x1 +x2 )+(x3 +x4 ). , . , , , , . 3.4.4 3.2.3. ... , . ) ... ) . (), x, y F , . x y G,

$\frac{|fl(x \circ y) - (x \circ y)|}{|x \circ y|} \le u, \quad x \circ y \ne 0, \quad \circ \in \{+, -, \times, /\}. \qquad (3.6)$

(3.5), (x y) x y . :

3.3.1. If $x, y \in F$ and $x \circ y \in G$, then

$fl(x \circ y) = (x \circ y)(1 + \delta), \quad |\delta| \le u.$

x y , || u 1+ x y F = 0. F x y x, y F . (x y) = 3.3.3.

x y (x

y) = (x

y)(1 + ),

(), ... ( R ) x = x(1 + ), y = y(1 + ). , , . = = /, ... ( R ) x = x, y = y(1 + ), x = x(1 + ), y = y , x y (x y)(1 + ). 3.5. 3.3.2. .

(ij ) = ij (1 + ij ), |ij | u, |(A) A| |A|u (A) = A + E, |E| u|A|, (A + B) = (A + B) + E, |E| u|A + B|, 3.1. , (3.6) , . (3.6) /. 3.3.6. ... x, y F x2 + y 2 G 0 < x F 9 ( x) = x(1 + ) || u. x z = 2 2 . .x +y

: , ( x) ... ... x.9

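...

$\hat{t}_1 = fl(x \times x) = x^2(1 + \delta_1)$
$\hat{t}_2 = fl(y \times y) = y^2(1 + \delta_2)$
$\hat{t}_3 = fl(\hat{t}_1 + \hat{t}_2) = (\hat{t}_1 + \hat{t}_2)(1 + \delta_3)$
$\hat{t}_3 = fl(\sqrt{\hat{t}_3}) = \sqrt{\hat{t}_3}\,(1 + \delta_4)$
$\hat{t}_1 = fl(x / \hat{t}_3) = \frac{x}{\hat{t}_3}(1 + \delta_5)$

Substituting:

$\hat{t}_1 = \frac{x}{\hat{t}_3}(1 + \delta_5) = \frac{x}{\sqrt{(\hat{t}_1 + \hat{t}_2)(1 + \delta_3)}\,(1 + \delta_4)}(1 + \delta_5) = \frac{x}{\sqrt{(x^2(1 + \delta_1) + y^2(1 + \delta_2))(1 + \delta_3)}\,(1 + \delta_4)}(1 + \delta_5)$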

j u. , z t1 . 3.5 . (1+x)1 3.3.7. y = . y 1 (y) = ((1 + x)(1 + x 1 ) 1)(1 + 2 )(1 + 3 )/x

(y)

=

(1 + 1 )(1 + 2 )(1 + 3 ) +M,

1 (1 + 2 )(1 + 3 ) x

, x >

(y)

=

=

1 + (1 + 2 + 3 ) + 1 2 + 2 3 + 1 (1 + 2 )(1 + 3 ) +1 3 + 1 2 3 + x 1 + O(u)(y)y | y

|

O(u). 0 < x

>> 1/(1/0)
Warning: Divide by zero
ans = 0
>> 1/1/0
Warning: Divide by zero
ans = Inf
>> 0/0
Warning: Divide by zero
ans = NaN
>> 1/0
Warning: Divide by zero
ans = Inf
>> max(ans,4)
ans = Inf
>> min(ans,3)
ans = 3

3.4.2. / . Fortran machar.f ( Cody) [6], paranoia.f ( Kahan). , ... IEEE . 3.4 . ...

3.4.2 IEEE , . , .. emin t . , .. 0 < a b < m. 3.4.2. Matlab 5.1.0 Windows Intel Pentium.

3.4. c 2008, . 67 MATLAB realmin.

>> realmin
ans = 2.225073858507202e-308
>> format hex
>> realmin
ans = 0010000000000000

:

>> format hex
>> realmin/2^52
ans = 0000000000000001
>> ans/2
ans = 0000000000000000
>> format long e
>> realmin/2^52
ans = 4.940656458412465e-324

. , , m . , , . (=gradual underflow). , .
3.4.3 (Kahan).
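The session above can be turned into a few checks. This is a sketch assuming IEEE double precision with subnormal numbers enabled; the variable names are illustrative.

```matlab
% Gradual underflow: numbers below realmin lose precision step by step
% instead of being flushed to zero at once.
tiny = realmin / 2^52;                    % smallest positive subnormal number
is_subnormal = (tiny > 0) && (tiny < realmin);
vanishes = (tiny / 2 == 0);               % halving once more rounds to zero
% The difference of two distinct normalized numbers close to realmin
% is itself representable, so a ~= b implies a - b ~= 0.
a = 1.5 * realmin; b = realmin;
d = a - b;                                % equals realmin/2, a subnormal number
```

The last property is what protects code such as the test x > y followed by log(x-y) in Example 3.4.3.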

if (x > y), ... ... log(x-y) ... end x, y () |x y| . , 0 log(0). . 3.4.4. [8] A = [1 2; 1 5/2] ... L = [1 0; 1 1], U =


[1 2; 0 1/2]. U (2, 2) , , U ! 1. . Matlab 5.1.0 Windows Intel Pentium. realmin .

> a = realmin*[1 2; 1 5/2];
> [l,u] = lu(a);
> u(2,2)
ans = 1.1125e-308
> u/realmin
ans =
    1.0000    2.0000
         0    0.5000

. Demmel [8]. 3.4.3. ..., . ( . [8]).

3.4.3 IEEE (extended format)). 79 bits (mantissa 63, exponent 15), u 5.42 1020 [104932 , 104932 ]. , Pentium, ... 80 bits (= double rounding). , 80 bits 64 32 bits. 3.4.4. . ( . [16]) = 10 2 3. 1.9 0.66 = 1.254. round p (x) x p . round 2 (1.254) = 1.3 round 2 (round 3 (1.254)) = 1.2. , ( ). , Kahan 128 bits [17].

3.4.4

Fused Multiply and Add (FMA)

z + x y x y x + y . -

3.4. c 2008, . 69 Cray ( chaining), ( x y ) ( z + (x y)) (. !) , FMA DOT(x(1 : n), y(1 : n)) n + O(1) 2n + O(1). , (.. IBM RS/6000) z + x y . . , . ,

$fl(z + x\,y) = (z + x\,y)(1 + \delta), \quad |\delta| \le u.$

. ( )

$fl(z + fl(xy)) = (z + xy(1 + \delta_1))(1 + \delta_2)$ . . Kahan Kahan Higham:
3.4.5.

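$x = \det\begin{pmatrix} a & b \\ c & d \end{pmatrix}.$

$x = ad - bc$ , F , :

$\hat{t}_1 = fl(ad) = ad(1 + \delta_1)$
$\hat{t}_2 = fl(bc) = bc(1 + \delta_2)$
$\hat{x} = fl(\hat{t}_1 - \hat{t}_2) = (\hat{t}_1 - \hat{t}_2)(1 + \delta_3)$

$|\hat{x} - x| = |(\hat{t}_1 - \hat{t}_2)(1 + \delta_3) - (ad - bc)|$
$= |(ad(1 + \delta_1) - bc(1 + \delta_2))(1 + \delta_3) - (ad - bc)|$
$= |ad(\delta_1 + \delta_3) - bc(\delta_2 + \delta_3) + ad\,\delta_1\delta_3 - bc\,\delta_2\delta_3|$
$\le (|ad| + |bc|)2u + (|ad| + |bc|)u^2$

$\frac{|\hat{x} - x|}{|x|} \le \frac{(|ad| + |bc|)2u + (|ad| + |bc|)u^2}{|x|}.$

When $|ad|, |bc| \gg |x|$, the relative error can be large. Kahan :

$\hat{t}_1 = fl(bc) = bc(1 + \delta_1)$
$\hat{t}_2 = fl(\hat{t}_1 - b\,c) = (\hat{t}_1 - bc)(1 + \delta_2)$ (with FMA)
$\hat{t}_3 = fl(a\,d - \hat{t}_1) = (ad - \hat{t}_1)(1 + \delta_3)$ (with FMA)
$\hat{x} = fl(\hat{t}_3 + \hat{t}_2)$, with $\hat{x}(1 + \delta_4) = \hat{t}_3 + \hat{t}_2$

$|\hat{x} - x| = |(\hat{t}_3 + \hat{t}_2) - \hat{x}\delta_4 - (ad - bc)|$
$= |(ad - \hat{t}_1)(1 + \delta_3) + (\hat{t}_1 - bc)(1 + \delta_2) - \hat{x}\delta_4 - (ad - bc)|$
$= |(ad - bc(1 + \delta_1))(1 + \delta_3) + (bc(1 + \delta_1) - bc)(1 + \delta_2) - \hat{x}\delta_4 - (ad - bc)|$
$= |ad\,\delta_3 - bc(\delta_1 + \delta_3 + \delta_1\delta_3) + bc\,\delta_1(1 + \delta_2) - \hat{x}\delta_4|$
$= |x\delta_3 - \hat{x}\delta_4 + bc\,\delta_1(\delta_2 - \delta_3)|$
$\le (|x| + |\hat{x}|)u + |bc|\,2u^2$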

|x| , |x|, || x , ( |bc|u > |x|). . 3.4.1. FMA . . 3.4.2. FMA . 3.4.5. ... standard IEEE. Goldberg [12] What every computer scientist should know about oating-point arithmetic. [16, Appendix]. IEEE 754 oating-point standard [20]. . [2] ... Intel Pentium. 3.4.6. Pentium bug Intel , . Alan Edelman MIT [10] [7].

3.4.5 Java Network Computing. Java. Java . (..

3.5. c 2008, . 71interfaces, , ), . ; Java : : , Java complex types. . : Java linguistically enforced exact reproducibility of all oating point results, Java ( W. Kahan cruel delusion11 .) , Java . .. Java Linpack12 .

3.5 As every physicist knows, no equation is exact; therefore, we believe that nite precision computation can be closer to physical reality than exact computation. Thus it appears possible to transform the limitations of the computer arithmetic into an asset. [5] (...): , , . , . ; (...) , . [9] Its impossible to compute things which dont exist. Its dicult to compute things which almost dont exist. [Cleve Moler] , (. R ), f : U Rm Rn . :

x U m f f (x) n x,

x , . x F . x x, x = (x).11 12

http://www.netlib.org/benchmark/linpackjava


f (x ) f (x ), (. R . fprog f ... F . ,

$f_{prog}(x^*) - f(x)$

$\frac{f_{prog}(x^*) - f(x)}{f(x)}. \qquad (3.7)$

(.. ) f . f (x), (3.7) . , (3.7). accuracy precision. . . Mathematica. : , accuracy . (3.7) . precision . ...
3.5.1. Vel Kahan [21]: Precision concerns the tightness of specification. Accuracy concerns its correctness. An utterly inaccurate statement ... can be uttered quite precisely... 3.177777777777 is a rather precise (13 dec. digits) but rather inaccurate (2 significant decimal digits) approximation to π.... Although exp(−10) = 0.0000454 has 3 decimal digits of precision, it is accurate to almost 6. Precision is to accuracy what intent is to accomplishment. A natural disinclination to distinguish them invites first shoddy science and ultimately the kinds of cynical abuses brought to mind by "People's Democracy", "Correctional Facility" and "Free Enterprise".
, f (x ) f (x) . f x x x. (, ) x. , . x, . .

[Figure: x, x*, U, f(x), f(x*), f(U); y, y*, f(y), f(y*)]

3.2: ) ) . 3.5.1. f : R R y := f (x). x = x + x, f , y = f (x + x). |f (x + x) f (x)| = | y|. y

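$\hat{y} - y = f(x + \Delta x) - f(x) = f^{(1)}(x)\Delta x + \frac{f^{(2)}(x + \theta\Delta x)}{2!}(\Delta x)^2, \quad \theta \in (0, 1),$

provided $f, f^{(1)}, f^{(2)}$ are continuous between x and $x + \Delta x$. Hence

$\frac{\hat{y} - y}{y} = \frac{f^{(1)}(x)\,x}{f(x)}\,\frac{\Delta x}{x} + O((\Delta x)^2). \qquad (3.8)$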

|x| 1, (3.8) |x|, | f (x) | y x. , . f (x)x f (x). , . , f . x1 +x2 +x3 ( 0, 1, 2) R , ... , fprog . fprog x f

[Figure: (a) x, x*, f(x), f_prog(x), f(x*); (b) x, x*, f(x), f_prog(x), f(x*)]

3.3: ) . ) . . f . fprog : 3.5.1. x U x x fprog (x) f (x ). . , - . 3.3. , . , . , fprog (x) f (x ) x x. , : 3.5.2. , x x fprog (x) = f (x ). () . , . (3.7) ( ) (=forward error). ,

3.5. c 2008, . 75, x x ( ) ... . , f (x) . , .. x = (1 , . . . , N )

a1 = f1 (1 , . . . , N ), a2 = f2 (a1 , 1 , . . . , N ), , z = fn (an1 , , a1 , 1 , . . . , N ), fn f1 f fj (aj1 , , a1 , x)

{aj1 , , a1 , 1 , . . . , N }. ..., R , z z . fj . 3.3.6 3.4.5. ( ) (=forward error analysis). , : f . 3.3.6 3.4.5 , . f : Rn Rm . 3.7. . , :

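instead of | use
$a^2 - b^2$ | $(a - b)(a + b)$
$\sqrt{x + \delta} - \sqrt{x}$ | $\frac{\delta}{\sqrt{x + \delta} + \sqrt{x}}$
$\cos\delta - 1$ | $-2\sin^2\frac{\delta}{2}$
$-b \pm \sqrt{b^2 - 4c}$ | $|b| + \sqrt{b^2 - 4c}$ + Vieta
$f(x + \delta) - f(x)$ | $\delta f^{(1)}(x) + \frac{\delta^2}{2} f^{(2)}(x) + \cdots$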
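The effect of such rewritings is easy to observe. The sketch below takes the square-root pair and assumes illustrative values x = 1 and δ = 1e-17 (written d in the code), chosen so that x + δ rounds to x in double precision.

```matlab
% Catastrophic cancellation versus an algebraically equivalent form.
x = 1; d = 1e-17;                        % d is far below eps, so x + d rounds to x
naive  = sqrt(x + d) - sqrt(x);          % every significant digit cancels
stable = d / (sqrt(x + d) + sqrt(x));    % same quantity without subtraction
fprintf('naive = %g, stable = %g\n', naive, stable);
```

The direct difference returns exactly 0, while the rewritten form returns about 5e-18, which agrees with the first-order value δ/(2√x).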

3.5.2. , . 1962 Ramon Moore [28].

76 3. c 2008, . x ( interval) (xL , xU ) , ( interval arithmetic). (.. (, +).) [17] [1]. (.. 3.3.6) n pn = i=1 (1 + i ) |i | u.

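$(1 - u)^n \le p_n \le (1 + u)^n$, so that to first order $|p_n - 1| \le nu + O(u^2)$. More precisely:
3.5.1. If $|\delta_i| \le u$ and $\rho_i = \pm 1$ for $i = 1:n$, and $nu < 1$, then

$\prod_{i=1}^{n}(1 + \delta_i)^{\rho_i} = 1 + \theta_n, \qquad |\theta_n| \le \frac{nu}{1 - nu} =: \gamma_n.$

Proof. By induction on n; the case n = 1 is immediate. If $\rho_n = 1$,

$\prod_{i=1}^{n}(1 + \delta_i)^{\rho_i} = (1 + \theta_{n-1})(1 + \delta_n) = 1 + \theta_{n-1} + \delta_n + \theta_{n-1}\delta_n = 1 + \theta_n,$

$|\theta_n| = |\theta_{n-1} + \delta_n + \theta_{n-1}\delta_n| \le \frac{(n-1)u}{1 - (n-1)u} + u + \frac{(n-1)u^2}{1 - (n-1)u} = \frac{(n-1)u + (n-1)u^2 + u - (n-1)u^2}{1 - (n-1)u} = \frac{nu}{1 - (n-1)u} \le \gamma_n.$

If $\rho_n = -1$,

$\prod_{i=1}^{n}(1 + \delta_i)^{\rho_i} = \frac{1 + \theta_{n-1}}{1 + \delta_n} = 1 + \theta_n, \qquad \theta_n = \frac{\theta_{n-1} - \delta_n}{1 + \delta_n},$

$|\theta_n| \le \frac{|\theta_{n-1}| + u}{1 - u} \le \frac{nu - (n-1)u^2}{1 - nu + (n-1)u^2} \le \gamma_n.$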

[Figure 3.4: $\gamma_n$ as a function of n, on logarithmic axes (n from 10^0 to 10^10, gamma from 10^-15 to 10^-5), for u = 2e-16.]

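$\prod_{i=1}^{n}(1 + \delta_i)^{\rho_i} =: \langle n \rangle$

$\langle n \rangle \langle k \rangle = \langle n + k \rangle$, $\langle n \rangle / \langle k \rangle = \langle n + k \rangle$ (for $nu < 1$):

$\gamma_n = \frac{nu}{1 - nu} = nu(1 + nu + (nu)^2 + \cdots) = nu + O(u^2)$

Also, $\prod_{i=1}^{n}(1 + \delta_i) \le (1 + u)^n < e^{nu}$.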

u = 2 1016 .

3.4 n
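The growth of γ_n is simple to tabulate. The following sketch assumes u = eps/2, the unit roundoff of IEEE double precision, rather than the rounded value 2e-16 used for the figure; the sample values of n are illustrative.

```matlab
% gamma_n = n*u/(1 - n*u) and its first-order approximation n*u.
u = eps/2;                      % unit roundoff in IEEE double (assumption)
n = 10.^(0:2:10);               % sample values of n, all with n*u < 1
gam = n*u ./ (1 - n*u);
for i = 1:numel(n)
    fprintf('n = %8.0e   gamma_n = %.4e   n*u = %.4e\n', n(i), gam(i), n(i)*u);
end
```

For every n in this range γ_n is indistinguishable from nu at the printed precision, which is the content of the expansion γ_n = nu + O(u²).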

3.5.2. , 3.3.6 : ... x, y F x2 + y 2 G 0 < x F ( x) = x(1 + ) || u. ,

78 3. c 2008, . z = x x2 +y 2

Erel (t1 )

t1

= = = =

x (1 + 5 ) t3 x (1 + 5 ) t3 (1 + 4 ) x (t1 + t2 )(1 + 3 )(1 + 4 ) x

(1 + 5 ) (1 + 5 )

(x2 (1 + 1 ) + y 2 (1 + 2 ))(1 + 3 )(1 + 4 )

|j | u j = 1 : 5. . 3.5.1 .

(x2 (1 + 1 ) + y 2 (1 + 2 ))(1 + 3 )

= =

(x2 (1 + 2 ) + y 2 (1 + 2 )) (x2 + y 2 )(1 + 2 ), 2 |2 | 2 ,

(x2 + y 2 )(1 + 2 ) =

x2 + y 2 (1 + 2 ) 2 |2 | 2 .

3.5.1

t1

= = = =

x (x2 (1 (1 + 2 ) (x2 + y 2 )(1 + 2 ) x (1 + 2 ) 2 + y 2 (1 + ) x 2 x (1 + 4 ) 2 + y2 ) (x + 1 ) + x y 2 (1 + 2 ))(1 + 3 )(1 + 4 )

(1 + 5 )

$\left|\frac{\hat{t}_1 - z}{z}\right| = |\theta_4| \le \gamma_4. \qquad (3.9)$

(3.9) z t1 . , 3.5.2. x x ... z = fprog (x) z = f (x ).

z z

=

fprog (x) f (x) = f (x ) f (x) .

3.5. c 2008, . 79 , f (x ) f (x) . x x , ( ) f (= perturbations) . , , . ! 13 (= backward error analysis) 14 James Hardy Wilkinson (1919-86), . Rounding Errors in Algebraic Processes 1963, ( [36]). ( ) . 3.5.3. f (x1 , x2 , x3 ) = (x1 + x2 ) + x3 F . fprog (x1 , x2 , x3 ) = ((x1 + x2 )(1 + 1 ) + x3 )(1 + 2 )

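$f_{prog}(x_1, x_2, x_3) = x_1(1 + \delta_1)(1 + \delta_2) + x_2(1 + \delta_1)(1 + \delta_2) + x_3(1 + \delta_2)$, with $|\delta_j| \le u$, $j = 1, 2$. Hence

$f_{prog}(x_1, x_2, x_3) = f(\hat{x}_1, \hat{x}_2, \hat{x}_3) \qquad (3.10)$

where $\hat{x}_1 = x_1(1 + \delta_1)(1 + \delta_2)$, $\hat{x}_2 = x_2(1 + \delta_1)(1 + \delta_2)$, $\hat{x}_3 = x_3(1 + \delta_2)$. Then $|\hat{x}_j - x_j| = |x_j(\delta_1 + \delta_2 + \delta_1\delta_2)|$ for $j = 1, 2$ and $|\hat{x}_3 - x_3| = |x_3\delta_2|$, so $|\hat{x}_j - x_j| \le 3u|x_j|$, $j = 1, 2$, and $|\hat{x}_3 - x_3| \le u|x_3|$. In summary,

$|f_{prog}(x_1, x_2, x_3) - f(x_1, x_2, x_3)| = |f(\hat{x}_1, \hat{x}_2, \hat{x}_3) - f(x_1, x_2, x_3)|$, with $\frac{|\hat{x}_j - x_j|}{|x_j|} \le \mu_j u$, $j = 1:3$, where $\mu_j = 3$ ($j = 1, 2$) and $\mu_3 = 1$.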

. . ( ) ; . , f : R R . ,

fprog (x ) f (x)

= f (x ) f (x) prog

x = x+x x . 3.5.1,

f (x + x) f (x)13 14

= f (1) (x)x +

f (2) (x + x) (x)2 , (0, 1) 2!

. Wilkinson, (1954) Wallace Givens von Neumann Goldstine(1947) Turing (1948).

80 3. c 2008, . f, f (1) , f (2) x, x + x. (3.8) , .. | f (x) | y x. 3.5.4. f (x1 , x2 , x3 ) = (x1 + x2 ) + x3 . () f (x1 , x2 ) = x1 + x2 . , fprog (x1 , x2 ) = f (x1 (1 + 1 ), x2 (1 + 1 )). f (x + h) f (x) + [1, 1]h h = [x1 1 , x2 1 ] , f (x)x

f (x + h) f (x) f (x)

|x1 1 + x2 1 | |x1 + x2 | h |x1 + x2 |

(x1 , x2 ) x1 + x2 h . , x1 , x2 , .. , h = |x1 1 + x2 1 | u|x1 + x2 |

f (x + h) f (x) u. f (x) 3.5.5. Horner 2.2.1 x. , Horner . 2.2.1 .

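s_n = α_n
for k = n−1 : −1 : 0
    s_k = x s_{k+1} + α_k
end

... 3.5.1 :

$\hat{s}_{n-1} = (x\hat{s}_n\langle 1 \rangle + \alpha_{n-1})\langle 1 \rangle = x\alpha_n\langle 2 \rangle + \alpha_{n-1}\langle 1 \rangle$
$\hat{s}_{n-2} = (x\hat{s}_{n-1}\langle 1 \rangle + \alpha_{n-2})\langle 1 \rangle$
...
$\hat{s}_0 = \alpha_0\langle 1 \rangle + \alpha_1 x\langle 3 \rangle + \cdots + \alpha_{n-1}x^{n-1}\langle 2n-1 \rangle + \alpha_n x^n\langle 2n \rangle = (1 + \theta_1)\alpha_0 + \cdots + (1 + \theta_{2n})\alpha_n x^n$

$\hat{s}_0 = f_{prog}(\alpha_0, \ldots, \alpha_n, x) = f(\alpha_0(1 + \theta_1), \ldots, \alpha_n(1 + \theta_{2n}), x)$

The computed value is the exact value of a polynomial with coefficients $\hat{\alpha}_j$ satisfying $|\hat{\alpha}_j - \alpha_j| \le \gamma_{2n}|\alpha_j|$. For the forward error,

$\frac{|p(x) - \hat{s}_0|}{|p(x)|} \le \gamma_{2n}\,\frac{\sum_{k=0}^{n}|\alpha_k||x|^k}{|p(x)|}.$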

. 3.6 .
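The Horner recurrence and its a priori bound can be evaluated side by side. This is a sketch under stated assumptions: the coefficients are those of (x − 1)³ in ascending powers, chosen only as an example, and u = eps/2 is the unit roundoff of IEEE double precision.

```matlab
% Horner evaluation of p(x) = alpha(1) + alpha(2)*x + ... + alpha(n+1)*x^n
% together with the bound gamma_{2n} * sum |alpha_k| |x|^k on |p(x) - s0|.
alpha = [-1 3 -3 1];       % (x - 1)^3, ascending powers (illustrative)
x = 2;
n = length(alpha) - 1;
s = alpha(n+1);
for k = n-1:-1:0
    s = x*s + alpha(k+1);  % s_k = x*s_{k+1} + alpha_k
end
u = eps/2;
gamma2n = 2*n*u / (1 - 2*n*u);
bound = gamma2n * sum(abs(alpha) .* abs(x).^(0:n));
fprintf('p(%g) = %g, error bound = %.2e\n', x, s, bound);
```

Here every intermediate result is a small integer, so the computed value is exact and the bound is pessimistic. Close to the triple root x = 1 the ratio Σ|α_k||x|^k / |p(x)| becomes large and the bound loses its usefulness as a relative estimate.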

. . , (3.7) : 1. . . 2. () . . , , (3.7). , f x . x = x x x x x / x . . , . . 3.5.3. (= condition number) Alan Turing 1948 Rounding-o errors in matrix processes [34]. John Rice [31]. 3.5.4. ,

82 3. c 2008, . ! ( . [36] [31]). , (.2) . , x = (1 , . . . , m ), y = (1 , . . . , n ) y = f (x). x y = f (x ). fi x, mn ij = j |x f x. , ij f x f . , mn . , Kj |j j | Kj j j |. n x. , K y y 2 K x x 2 |j j |. K x. . . Kj |j j | x x Kj j x 2 2

.

. , j . Kj |j j | x x Kj . y 2 x 2

K

y y y 2

2

K

x x . x 2

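, : 3.5.3 (Rice [31]). Let X, Y be normed spaces, $f : X \to Y$, and $y^* := f(x^*)$ with $x^*, y^*$ nonzero elements of X, Y. The condition number of f at $x^*$ is

$\mathrm{cond}(f; x^*) := \lim_{\delta \to 0}\ \sup_{\|h\| = \delta}\ \frac{\|f(x^* + h) - f(x^*)\| / \|f(x^*)\|}{\|h\| / \|x^*\|}$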

3.5. c 2008, . 83 . f (x ) x . , cond(f ; x ) . : 15 x cond(f ; x ) =h x f (x +h)f (x ) . f (x )

x y

f |x . x

X = Rn Y = Rm , Frechet f . f ,

suph =

f (x + h) f (x ) f (h) = sup h h h =

x . , f ( ). : cond(f ; x ) =

x y

f

(3.7). , : 3.5.1. f fprog x, xprog

fprog (x) = f (xprog ) 3.5.4. (3.5.1). / () cond(fprog )

x xprog cond(fprog )u. x :

fprog (x ) f (x) f (x)15

f (x ) f (x ) f (x ) f (x) prog + . f (x) f (x)

Frechet.

84 3. c 2008, . :

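$\frac{\|f(x^*_{prog}) - f(x^*)\|}{\|f(x)\|} \approx \frac{\|f(x^*_{prog}) - f(x^*)\|}{\|f(x^*)\|} \le \mathrm{cond}(f; x^*)\,\frac{\|x^*_{prog} - x^*\|}{\|x^*\|} \le \mathrm{cond}(f; x^*)\,\mathrm{cond}(f_{prog})\,u.$

$\frac{\|f(x^*) - f(x)\|}{\|f(x)\|} \le \mathrm{cond}(f; x)\,\frac{\|x^* - x\|}{\|x\|} \le \mathrm{cond}(f; x)\,E.$

:

$\frac{\|f_{prog}(x^*) - f(x)\|}{\|f(x)\|} \le \mathrm{cond}(f; x^*)\,\mathrm{cond}(f_{prog})\,u + \mathrm{cond}(f; x)\,E. \qquad (3.11)$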

. , . : ) , . cond(fprog ) . . ) , . cond(f ; x) ) E , . , E , : < 3.5.5. , .. f := f2 f1 . cond(f ; x ) cond(f2 ; y )cond(f1 ; x ). f , .. f1 f2 ( ). 3.5.6. , John Rice [31]. 3.5.7. 3.5.1, zprog z . 1 3.5.6. f (x) := log x, cond(f ; x) = | log x| x 1.
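For a differentiable scalar function the condition number reduces to |x f'(x)/f(x)|, as in (3.8). The sketch below evaluates it for f(x) = log x, where it equals 1/|log x|. The helper name relcond and the sample points are illustrative.

```matlab
% Relative condition number |x f'(x) / f(x)| of a scalar function.
f  = @(x) log(x);
df = @(x) 1 ./ x;
relcond = @(x) abs(x .* df(x) ./ f(x));   % equals 1/|log x| for f = log
c_far  = relcond(exp(1));                 % about 1: well conditioned
c_near = relcond(1 + 1e-8);               % about 1e8: ill conditioned near x = 1
fprintf('cond at e: %g, cond at 1+1e-8: %g\n', c_far, c_near);
```

Near x = 1 a relative perturbation of size u in x can be amplified by a factor of order 1/|log x| in log x, no matter how the logarithm itself is computed.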

3.5. c 2008, . 85 3.5.8. f ([A; x]) = Ax A . x.

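$\sup_{h \ne 0}\frac{\|f(x + h) - f(x)\|}{\|h\|} = \sup_{h \ne 0}\frac{\|Ah\|}{\|h\|} = \|A\|,$

$\mathrm{cond}(f; x) = \frac{\|x\|}{\|Ax\|}\,\|A\|.$

3.5.9. $f([A; y]) := A^{-1}y$, for invertible A . y .

$\sup_{h \ne 0}\frac{\|f(y + h) - f(y)\|}{\|h\|} = \sup_{h \ne 0}\frac{\|A^{-1}h\|}{\|h\|} = \|A^{-1}\|,$

$\mathrm{cond}(f; y) = \frac{\|y\|}{\|A^{-1}y\|}\,\|A^{-1}\| = \frac{\|Ax\|}{\|x\|}\,\|A^{-1}\| \le \|A\|\,\|A^{-1}\| := \kappa(A)$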

A, y cond(f ; y) = (A), (A) A. 3.5.10. , . , , , . . , . , , 2- max /min , . 3.5.11. , (A) n. . 3.5.12. , .. Hilbert, Vandermonde .. Test Matrix Toolbox MATLAB [18] W. Gautschi Vandermonde.
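The 2-norm condition number can be computed either with cond or from the singular values. This is a minimal sketch using the built-in functions cond, svd and hilb; the 2×2 matrix and the Hilbert matrix orders are illustrative choices.

```matlab
% kappa_2(A) = sigma_max / sigma_min, and its growth for Hilbert matrices.
A = [1 2; 3 4];
s = svd(A);
k2 = cond(A);                 % 2-norm condition number
ratio = s(1) / s(end);        % sigma_max / sigma_min
kappa_hilb = zeros(1, 3);
orders = [2 4 8];
for i = 1:numel(orders)
    kappa_hilb(i) = cond(hilb(orders(i)));
    fprintf('n = %d   kappa(hilb(n)) = %.3e\n', orders(i), kappa_hilb(i));
end
```

The condition number of the Hilbert matrix grows very quickly with its order, which is why it is a standard test case for ill-conditioned systems.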

86 3. c 2008, . 3.5.13. (A) . , (A) . 3.5.7.

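$f([x; y]) := x^\top y$, $x, y \in \mathbb{R}^n$. With $X := [x; y] \in \mathbb{R}^{2n}$,

$\mathrm{cond}(f; X) = \frac{\|X\|}{|x^\top y|}\,\left\|\frac{\partial f}{\partial X}\Big|_{[x;y]}\right\|.$

Since $\frac{\partial f}{\partial X}\Big|_{[x;y]} = [y; x]^\top \in \mathbb{R}^{1 \times 2n}$,

$\mathrm{cond}(f; X) = \frac{\|[x; y]\|\,\|[y; x]\|}{|x^\top y|}.$

In the 2-norm $\|[x; y]\|_2 = \|[y; x]\|_2$, so

$\mathrm{cond}(f; X) = \frac{\|[x; y]\|_2^2}{|x^\top y|}.$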

|x y|, . cos(x, y). x , y > 0 cos(x, y) 0.

3.6 , , . . J. Wilkinson The perdious polynomial16 [35]. , , , Horner . . . n

p(x) :=k=116

k k (x),

{k }k=1:n

Pn1

.

3.6. c 2008, . 87 , . p(x) k ; p(x);

3.6.1 . , . , . ( ). , x p(a; x) = 0, a (.. ) : a p(a + a, x) = 0, x

= inf{| a a , p(a + a; ) = 0}

p(a + a; ) = 0 p(a; ) + p(a; ) = 0 0 = p(a; x) + [0 , . . . , n ][1, , ..., n ] a a ,

=

|p(a; )| [1, , ..., n ] D a

D ( . .1 ).

3.6.2 p(a; x) = n k k=0 k x . , i + i , i = i , j :[ ,0,i ,0, ,]

0 = p(a +

a

; j + j ) = 0

p(a; xj + j ) + p(; j + j ) dp i p(a; j ) +j (a; j ) + i j (3.12) dx0


|

j | j

i |i j | dp |j dx (a; j )| i |i j | dp |j dx (a; j )|

|i | |i |

(3.13)

i + i , i = 1 : n

0 = p(a + a; j + j ) 0

= p(a; j + j ) + p(; j + j ) p(a; j ) +j0

dp (a; j ) + dx

n i k j k=0

|

n a [1, j , ..., j ] j | dp j |j dx (a; j )|

D

a a

(3.14)

(3.13) (3.14) . , , . : 3.6.1. z = z pn Pn pn (z; ) := pn (z) + g(z) g Pn . pn (z; ) z( )

|z( ) z +

gn (z ) pn (z )(1)

| = O( 2 ).

z m, m pn (z; )

|z( ) z [

m!gn (z )(m) pn (z )

]m

1/m

| = O(

2/m

).

z :=

, pn (z; ) z( ) z , O( ). pn (z; ). pn (z; ) pn (z; 0) . m , m!gn (z ) | (m) ]m | 1/m . , . , 17 . . 17

gn (z ) (1) pn (z )

pn

(z )

(= bifurcation).


. J. Wilkinson The perdious polynomial [35]. . , . , .. Lagrangre Newton. Walter Gautschi, .. [11]. 3.6.1 (Wilkinson . [35]). ( ) J.H. Wilkinson. n1

pn (x) := x +j=0

n

j xj ,

j pn j = j, (j = 1 : n). 1 = 1 , ,

min condj = cond1 = O(n2 )j

max condj (5.83)n .j

. condj =

(j + n)! j n j! , (j!)2 (n j)!

j = 1 : n.

3.6.1. Wilkinson 1 = 1 n = 2, 10 .

3.7 [17] ( James Demmel.) z = f (a) f : Rn Rm p :

z = f (a), x1 = a Rn x2 = g1 (x1 ) = [x1 ; 1 ] 1 x1 .

x3 = g2 (x2 ) = [x2 ; 2 ]


xp+1 z = Ixp+1

=

gp (xp ) = [xp ; p ]

I Rm(n+p+1) () . Rn g1 gp gp+1 I Rm x1 := a xk+1 = gk (xk ), k = 1 : p

xk k :

,

k R

xk+1 = gk (k ) + xk+1 x xk+1 k .

x2 x3

x4

. = . = = = . =

g1 (a) + x2 g2 (2 ) + x3 x g2 (g1 (a) + x2 ) + x3 g2 (g1 (a)) + Jg2 x2 + x3 g3 (g2 (g1 (a))) + Jg3 Jg2 x2 + Jg3 x3 + x4

z =

I[gp ( ((g1 (a) ) + Jgp Jg2 x2 + +Jgp Jg3 x3 + + Jgp xp + xp+1 ]

I m 0 1 ( ). . z = f (a) + I[(Jgp Jg2 , . . . , Jgp , I] . . . = f (a) + Jh x2 xp+1

z f (a) = Jh.

z = f (a + a) = f (a) + Jf a = f (a) + Jh a q = pn + p(p + 1)/2,

Jf a = Jh, Jf Rmn , a Rn , J Rmq , h Rq . m < n, . a .


, . [4, 24, 25, 27, 26]. ADIFOR [14, 13, 3].

3.8 3.8.1. bit . . 3.2.2. 3.8.2 (, , 03). . ; . . 3.3. 3.8.3. ; . [, . 3.2] 1 ..., 1+ . IEEE 1.0 01 1.0 0 . t 1 ( ) (t1) = 1t . 3.8.4. ; . [, . 3.2] , , ... , , ( ) ... 1 ... . , M = 2u. 3.8.5 (Burden and Faires [32]). ( ). p p 104 p ) , ) e, ) 71/3 . . ) | | 104 (1 104 ). pi MATLAB ,

[3.14127849432443, 3.14190681285515].) , exp(1) (1 104 )

[2.71801000027620, 2.71855365664189].) , 71/3 (1 104 )

[1.91273988965411, 1.91312247589067].


3.8.6 (, , 02-makeup). MATLAB . 7.0. (.: ). . . : , release 14 MATLAB ( 7.0)18 . 3.8.7 (, , 02-makeup). : 1 x Rn n x 1 x x 1 . . : x = maxi=1:n {|i |} x 1 = i |i | n x . x 1 = i |i | x x ( ) x 1 . 1 3.8.8. x Rn n x 1

x

2 x 1. . :

x

1

=j

|j | (j

|j |2 )1/2 (j

1)1/2

Cauchy-Schwartz . ,

x

2 1

=(j

|j |)2 j

|j |2

. 3.8.9 (, , 03). Ax = b, x, r := b A x r := A x + b = 1.5 1013 ( ). 2 (A) = 106 . x . . ( Rigal-Gaches), . , :x x

xx x

2(A) 1 (A) 3 107 /(1 1.5 107 ) 3 107 .

3.8.10 (, , 04). Ax = b 101 106 x ||b A|| = 1012 ||||2 = 10, ||A||2 = 10 ||b||2 = 1. x x , . ||x x||2 / ||x||. 18

http://www.mathworks.com/products/new_products/latest_features.html#ML.


3.8.11 (Burden and Faires [32]). , p p . , , . p p

0.300 101 0.300 103 0.300 104 e e10.

0.310 101 0.310 103 0.310 104 22/7 3.1416 2.718 22000 p 0.310 101 0.310 103 0.310 104 22/7 3.1416 2.718 22000 0.1

p 0.300 101 0.300 103 0.300 104 e e10

0.1 10 0.1 103 0.001264 7.346 106 2.818 104 2.647 101

4

0.3333 101 0.3333 101 0.3333 101 4.025 104 2.338 106 1.037 104 1.202 103

, . 3.8.12. ... IEEE , ( precision) . ... IEEE . . IEEE , , 24 = 23(+1) ( ) 53 = 52(+1). , m 2E 1 m < 2 1.bs . ... , 24 . d , 10d 224 . , d = 7 24 log10 2 = 7.2247198. 10d 253 . , d = 16 53 log10 2 = 15.954589. 3.8.13. : (. 2 ) . . . : Rn R , , . J : Rn Rn

J := [

,..., ]. 1 n

j

= =

j j

n 2 i i=1 n 2 i=1 i

94 3. c 2008, . () J :=

x

2

=

x x

1 : , n x 1 = j=1 |j | 0. , .

2

.

3.8.14. A = [100, 1; 0, 10] B = [1, 0; 0, 0] MATLAB IEEE. ( ) C = A./B . . C = [100, Inf; NaN, Inf]. 3.8.15 (Heath [15]). IEEE . . . 1t /2. = 2, IEEE 2t . > 2 1t /2 = (/2) t , , . 3.8.16. ... 1960 ( M. Overton [30]) 0, , .. . . . fmax ... , a/0b/0 = fmax fmax = 0 a, b = 0, . , , program interrupt, IEEE, . 3.8.17. IEEE realmin . 1 0 boole.

temp2=realmin/2; temp=2*temp2; boole=(temp==realmin);
. MATLAB , realmin

realmin = 0010000000000000.

temp2 = realmin/2 = 0008000000000000 realmin, boole = 1. 0 .


3.8.18. f : Rm Rn . . . , f ( , , ...), () () . . 3.8.19. : G x f (x) = y ( x, y ), G. G x G(y) y y y ( 0) |G() y x|/|| . G x . . , - . x = G() |G() y y x|/|| = 0. x 3.8.20. , . . G x G(x) = y . x x , G(x) = G(). , x G(x) G(x)

G(x) G(x) = G() G(x) . x x x / x , . G G1 x = G1 (y), G() G(x) / G(x) = G( x) / G(x) x x

G( x) x G(x)

=

G( x) x xx xx G x xx G x G G1

xx G(x) x G(x) G1 (y) x xx . x

, , xx , x G

G1 ( G).

96 3. c 2008, . 3.8.21. x, y R2 , (. i i 0) xT y . . x, y ... x1 (1 + 1 ), x2 (1 + 2 ), y1 (1 + 3 ), y2 (1 + 4 ) |j | u. (xT y) = (1 1 < 3 > +2 2 < 3 >) < 1 > 1 1 < 4 > +2 2 < 4 >.

|x|T |y| |(xT y) xT y| 4 T = 4 |xT y| |x y| |xT y| = |1 1 + 2 2 | = 1 y1 + 2 2 = |x|T |y|. 3.8.22. ... 32 b0 b1 b8 b9 b31 , , 8 23 . b9 b31 :

(1)b0 2E127 1.b9 b31 E = b1 b8 127 . bj = 0 j = 1 : 31 0 E 1 0. ) , , ; ) , , ; . ) 0, 2126 . fN min1 = 10126 log10 2 1.1754943e 38. ) , , fmin1 2126 223 = 2148 2.8026e 045. : , 52 11 1024. fN min2 21022 2.2250738e 308 realmin fmin2 = 2102252 4.9406564e 324. 3.8.23. , ... IEEE , x+y = y +x. . ... IEEE, x+y = (x + y) = (y + x) = y +x. x+y = = (y + x)(1 + 2 ), 1 = 2 . (x + y)(1 + 1 ) y +x 3.8.24 (Heath [15]). x y log x log y m(x, y) := log(x) log(y) . , log(x) log(y) = log(x/y), M (x, y) := log(x/y) m(x, y). . ; ( : ;)


. . , . , .. y = 1 log y = 0, log(x/y) = log(x). MATLAB . , 2. x = 256+512*eps y = 256. MATLAB

log2 (x) - log2 (y) = 0

log2 (x/y) = 6.661338147750939e-016 2 1 . x = eps y = realmax. m(eps, realmax) = 1076 M (eps, realmax)

Warning:

Log of zero.

3.8.25. ) fadd, fadd fmul ... (64 bits) 1 . ; ) fma fnma, , . (. fma fnma) 3 , . ; . ) , (a + b)(x + y) = (ax by) + (bx + ay), 6 . , , s = (a + b)x; t = b(x + y); p = y(a b); (s t) + (t p). 5 3 . (.. , O(n2 ) O(n3 ). 3.8.26. fma. ) fma . ) fma . ) fma . ) . . (). fma . ,

98 3. c 2008, . z + x y . (z + xy) = (z +xy)(1+), || u (z +(xy)) = (z +xy(1+1 ))(1+2 ).

3.9 3.9.1. , |x| x x R |x| , x . - - x R , |x| x . , , || . x

.

|x| x |x| .

|x x| || x

x || x x + || x x < 1 ( ) x, x > 0.

x(1 ) x x(1 + ),

|x x| |x|

=

|x x| || x || |x| x = + O( 2 ). 1

, ( ) x(1 ) x. x, x < 0. 3.9.2. ... 5 b0 b1 b2 b3 b4 , , 2 2 . b3 b4 (1)b0 2E1 1.b3 b4 E = b1 b2 1 . ) , , . ) , , ; ) , ,


; ) ( NaN, ). . -) : 21 [0.00, 0.25, 0.50, 0.75] ( , , ), 21 [1.00, 1.25, 1.50, 1.75], [1.00, 1.25, 1.50, 1.75], 2 [1.00, 1.25, 1.50, 1.75], 4 [1.00, 1.25, 1.50, 1.75]. :

{0, 0.125, 0.250, 0.375, 0.500, 0.625, 0.750, 0.875, 1.000, 1.250, 1.500, 1.750, 2, 2.50, 3.00, 3.50, 4, 5, 6, 7}. 19 0. 0.500 = 21 0.125. ) t = 2 + 1 , u =

213 /2 = 0.125 3.9.3. () ... IEEE . ) 0.5, ) 0.1, ) 0.2, ) 23/4,