BI019 Bioinformatics: Fundamentals of Information Theory
A Blejec
3 October 2012

What is information
A computer is a machine for processing information
GIGO (garbage in, garbage out)

Systems of events and outcomes
Do we go to the cinema or to a party?
We choose one of six dishes.
"Joško is our best friend", or which lottery ticket will win?

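Systems with equally likely states and uncertainty
(top row: states; bottom row: their probabilities)
$\omega = \begin{pmatrix} o_1 \\ 1 \end{pmatrix}$
$\alpha = \begin{pmatrix} a_1 & a_2 \\ 1/2 & 1/2 \end{pmatrix}$
$\beta = \begin{pmatrix} b_1 & b_2 & \cdots & b_6 \\ 1/6 & 1/6 & \cdots & 1/6 \end{pmatrix}$
$\gamma = \begin{pmatrix} c_1 & c_2 & c_3 & \cdots & c_{100\,000} \\ 0.00001 & 0.00001 & 0.00001 & \cdots & 0.00001 \end{pmatrix}$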

Measuring uncertainty
Measure of uncertainty
Let a system $\alpha_n$ with n equally likely states have uncertainty $H(\alpha_n) = H(n)$

Rules for computing uncertainty
1 A system with a single state is certain: $H(1) = 0$
2 A system with more states has greater uncertainty than a system with fewer states
$n > m \Leftrightarrow H(\alpha_n) > H(\alpha_m) \Leftrightarrow H(n) > H(m)$
$H(2) > H(1) = 0$
3 What uncertainty does a composite system have?
$\delta_{n \times m} = \alpha_n \otimes \beta_m$
$H(\alpha_n \otimes \beta_m) = H(n \times m) = H(n) + H(m)$
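Rule 3 already fixes how H grows along the powers of a number. A short worked step, using only the rules above:

```latex
H(4) = H(2 \times 2) = H(2) + H(2) = 2\,H(2), \qquad
H(8) = H(2 \times 4) = H(2) + H(4) = 3\,H(2), \qquad
H(2^k) = k\,H(2)
```

A function that turns products into sums in this way behaves like a logarithm.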

Function for computing uncertainty
Logarithm
$H(n) = C \log_a n$
Binary logarithm
$H(n) = \log_2 n$
$H(2) = 1$
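The three rules can be checked numerically for the binary logarithm. This is a minimal sketch in R; the helper name `H_equal` and the test values (n up to 100, a 2 × 6 composite system) are assumptions made for illustration, not part of the slides.

```r
# Uncertainty, in bits, of a system with n equally likely states.
# H_equal is a hypothetical helper name chosen for this sketch.
H_equal <- function(n) log2(n)

# Rule 1: a single-state system is certain.
rule1 <- H_equal(1) == 0

# Rule 2: more states mean more uncertainty (checked for n = 1..100).
rule2 <- all(diff(H_equal(1:100)) > 0)

# Rule 3: the uncertainties of combined systems add (here n = 2, m = 6).
rule3 <- isTRUE(all.equal(H_equal(2 * 6), H_equal(2) + H_equal(6)))

print(c(rule1 = rule1, rule2 = rule2, rule3 = rule3))
```

A numerical check over a few values is only a sanity test, not a proof that the rules hold for every n and m.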

bit, nit and dit
base 2: $\log_2 2 = 1$ bit
base e: $\log_e 2 = 0.6931$ nit
base 10: $\log_{10} 2 = 0.301$ dit
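The three units differ only in the base of the logarithm, so one can be converted to another with a constant factor. A small sketch in R, assuming the two-state system from the slide; the names `two_state`, `bits_to_nits` and `bits_to_dits` are hypothetical.

```r
# Uncertainty of a two-state system in three logarithm bases.
two_state <- c(bit = log(2, base = 2),
               nit = log(2),              # natural logarithm, base e
               dit = log(2, base = 10))
print(round(two_state, 4))

# Changing the unit is a change of base: 1 bit = log(2) nit = log10(2) dit.
bits_to_nits <- function(bits) bits * log(2)
bits_to_dits <- function(bits) bits * log10(2)

print(bits_to_nits(8))   # one byte expressed in nits
print(bits_to_dits(8))   # one byte expressed in dits
```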

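Equally likely states: p = 1/n
$\alpha_n = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ p & p & \cdots & p \end{pmatrix}$
$H(n) = \log_2 n = -\log_2(1/n) = -\log_2 p$
$= -n \cdot (1/n) \log_2(1/n) = -\sum (1/n) \log_2(1/n)$
$= -\sum p \cdot \log_2 p$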

Unequally likely states
$\alpha_n = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ p_1 & p_2 & \cdots & p_n \end{pmatrix}$
$H(n) = -\sum p \cdot \log_2 p$
is replaced by
Shannon-Wiener formula
$H(n) = -\sum_{i=1}^{n} p_i \cdot \log_2 p_i$
Shannon-Wiener (Weaver?) diversity index
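The formula translates directly into a few lines of R. This is a sketch under stated assumptions: the name `shannon_bits` is hypothetical, and zero probabilities are dropped on the usual convention that 0 · log2(0) counts as 0.

```r
# Shannon entropy, in bits, of a probability vector p.
# shannon_bits is a hypothetical name chosen for this sketch.
shannon_bits <- function(p) {
  # p must be a valid probability distribution
  stopifnot(all(p >= 0), isTRUE(all.equal(sum(p), 1)))
  p <- p[p > 0]              # convention: 0 * log2(0) counts as 0
  -sum(p * log2(p))
}

shannon_bits(rep(1/6, 6))          # equally likely states: same as log2(6)
shannon_bits(c(0.5, 0.25, 0.25))   # unequal states: 1.5 bits
shannon_bits(c(1, 0))              # certain outcome: 0 bits
```

For equally likely states the result reduces to log2(n). Any unequal distribution over the same n states gives a smaller value.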

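System with two states
> # probabilities of the first state; 0 and 1 are avoided because log(0) is -Inf
> p <- seq(0.0001, 0.9999, 0.01)
> # each row of x is one two-state system (p, 1 - p)
> x <- cbind(p, 1 - p)
> # uncertainty, in bits, of one probability vector
> H <- function(x) -sum(x * log(x, 2))
> par(mar = c(4, 4, 1, 0))
> plot(p, apply(x, 1, H), ylab = "H(p,1-p)")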
[Plot: H(p, 1−p) against p; both axes run from 0.0 to 1.0.]

bit ... 4 bits: $2^4 = 16$ states
[Figure: all 16 four-bit patterns, 1111 down to 0000, with their decimal values 15 down to 0; the bit positions carry the weights 8, 4, 2, 1.]
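The 16 patterns and their decimal values can be regenerated in R. A minimal sketch: `to_binary` is a hypothetical helper built on base R's `intToBits()`, which returns 32 bits with the least significant bit first.

```r
# Write a non-negative integer x as a string of `width` binary digits.
# to_binary is a hypothetical helper chosen for this sketch.
to_binary <- function(x, width) {
  bits <- as.integer(intToBits(x))[seq_len(width)]  # least significant first
  paste(rev(bits), collapse = "")                   # most significant first
}

codes <- vapply(0:15, to_binary, character(1), width = 4)
print(data.frame(decimal = 0:15, binary = codes))

# Reading a pattern back with the place weights 8, 4, 2, 1:
value_1011 <- sum(c(1, 0, 1, 1) * c(8, 4, 2, 1))    # 1011 -> 11
print(value_1011)
```

Calling the same helper with `width = 8` on `0:255` gives the 256 states of a byte.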

byte ... 8 bits: $2^8 = 256$ states
[Figure: all 256 eight-bit patterns, 11111111 down to 00000000, with their decimal values 255 down to 0; the bit positions carry the weights 128, 64, 32, 16, 8, 4, 2, 1.]

Number of bits (H) and number of states (n)
bits  states
1     2
2     4
3     8
4     16
5     32
6     64
7     128
8     256
9     512
10    1024
11    2048
12    4096
13    8192
14    16384
15    32768
16    65536
$H = \log_2 n$
$n = 2^H$
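Both directions of the relation can be computed in R. A short sketch: the helper `bits_needed` is hypothetical, and rounding up for state counts that are not powers of two is an added assumption, since the slide only lists exact powers.

```r
# Number of states that a given number of bits can distinguish: n = 2^H.
bits <- 1:16
states <- 2^bits
print(data.frame(bits = bits, states = states))

# The other direction, H = log2(n). When n is not a power of two,
# a whole number of bits has to be rounded up (assumption of this sketch).
bits_needed <- function(n) ceiling(log2(n))
print(bits_needed(c(6, 100, 100000)))
```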

ASCII code table

Nucleotide sequences
Symbols: A T C G
1 How many bits of information does one nucleotide carry?
2 Why are amino acids encoded by triplets?
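Both questions can be explored with the formulas from the earlier slides. The sketch below rests on two assumptions that the slide does not state: the four nucleotides are taken as equally likely, and the code has to distinguish the 20 standard amino acids. All variable names are hypothetical.

```r
# Assumption: the four nucleotides A, T, C, G are equally likely.
bits_per_nucleotide <- log2(4)

# Number of distinct words of length k over a 4-letter alphabet: 4^k.
k <- 1:3
words <- 4^k

# Assumption (not on the slide): 20 standard amino acids must be encoded.
n_amino_acids <- 20
min_length <- min(k[words >= n_amino_acids])   # shortest sufficient word length

print(bits_per_nucleotide)
print(data.frame(length = k, words = words, bits = k * bits_per_nucleotide))
print(min_length)
```

Under these assumptions, pairs of nucleotides give only 16 combinations, which is fewer than 20, while triplets give 64.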