Data Compression and Algorithms


  • date post

    12-Jan-2016

Data Compression and Algorithms (PowerPoint presentation). Compression (1): in data storage, in transferring data over networks, and in managing data. Compression methods are divided into:

Transcript of Data Compression and Algorithms

  • Data Compression and Algorithms

  • Compression (1): compression methods are divided into lossless and lossy.

  • Compression (2): entropy coding and source encoding.

  • Compression (3)

  • Transforms: the Fourier transform takes a function f(t) to a function g.

    Transforms in use include Fourier, Hadamard, Haar and Karhunen-Loeve.

  • 1. Differential pulse code modulation (DPCM).

    ADPCM is the adaptive variant of DPCM.


  • PPM (Prediction by Partial Matching); Burrows-Wheeler; Markov Models; Word-based Compressors

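  • PPM Model. PPMA: escape method A, the escape symbol has count 1. PPMC: the escape probability is r/(n+r), where r is the number of distinct symbols seen in the context and n is the total number of symbol occurrences. PPMD: the escape probability is r/(2n), and a symbol with count ci has probability (2ci-1)/(2n). Method X: based on the symbols that occurred exactly once (hapax legomena); with t1 such symbols, the ESCAPE probability is (t1+1)/(n+t1+1) and a symbol with count ci has probability ci/(n+t1+1). PPM*: contexts of unbounded length.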
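
The escape estimates listed above can be compared side by side with a small sketch. The function name `ppm_probabilities` and the dictionary layout are hypothetical, chosen only for illustration; the PPMA symbol probability ci/(n+1) is an assumption that follows from giving the escape symbol a count of 1.

```python
from fractions import Fraction

def ppm_probabilities(counts, method):
    """Return (escape_probability, {symbol: probability}) for one context.

    counts maps each symbol seen in the context to its count ci.
    method is one of "A", "C", "D", "X" (hypothetical labels).
    """
    n = sum(counts.values())                        # total occurrences
    r = len(counts)                                 # distinct symbols
    t1 = sum(1 for c in counts.values() if c == 1)  # hapax legomena
    if method == "A":
        escape = Fraction(1, n + 1)
        probs = {s: Fraction(c, n + 1) for s, c in counts.items()}
    elif method == "C":
        escape = Fraction(r, n + r)
        probs = {s: Fraction(c, n + r) for s, c in counts.items()}
    elif method == "D":
        escape = Fraction(r, 2 * n)
        probs = {s: Fraction(2 * c - 1, 2 * n) for s, c in counts.items()}
    elif method == "X":
        escape = Fraction(t1 + 1, n + t1 + 1)
        probs = {s: Fraction(c, n + t1 + 1) for s, c in counts.items()}
    else:
        raise ValueError("unknown method: " + method)
    return escape, probs

# Example context: 'a' seen 3 times, 'b' seen once.
example_counts = {"a": 3, "b": 1}
for m in "ACDX":
    esc, p = ppm_probabilities(example_counts, m)
    print(m, esc, p)
```

In every method the escape probability and the symbol probabilities add up to 1, which is what lets an arithmetic coder consume them directly.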

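  • Burrows Wheeler (1): the input is a string s. An end-of-string marker is appended to s, and the (n+1) x (n+1) matrix T of the cyclic shifts of s is formed: row 1 is s, row 2 is s shifted cyclically by 1 position, and so on. The rows are then sorted. Column 1 of the sorted T is bwt(s).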

  • Burrows Wheeler (2): the matrix T for s = mississippi. The output is msspipissii, with I = 3.

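  • Burrows Wheeler (3): input: bwt(s); output: s. The inverse BWT:

    It uses properties of T, among them: for i = 2, 3, ..., n+1, the character at position (i, n+1) is immediately followed in s by the character at position (i, 1).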

  • Burrows Wheeler (4): sorting bwt(s) gives column L. Using the properties above, bwt(s) and L together determine s, so s is recovered without constructing T.

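  • Burrows Wheeler (5): the cyclic shifts of s = mississippi, before and after sorting (the end-of-string marker is written here as $, and rows are compared from right to left).

    Cyclic shifts:
    mississippi$
    ississippi$m
    ssissippi$mi
    sissippi$mis
    issippi$miss
    ssippi$missi
    sippi$missis
    ippi$mississ
    ppi$mississi
    pi$mississip
    i$mississipp
    $mississippi

    Sorted:
    mississippi$
    ssissippi$mi
    $mississippi
    ssippi$missi
    ppi$mississi
    ississippi$m
    pi$mississip
    i$mississipp
    sissippi$mis
    sippi$missis
    issippi$miss
    ippi$mississ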

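  • Burrows Wheeler (6): bw(s) is the first column of the sorted matrix (first column shown separately, end-of-string marker written as $).

    m ississippi$
    s sissippi$mi
    $ mississippi
    s sippi$missi
    p pi$mississi
    i ssissippi$m
    p i$mississip
    i $mississipp
    s issippi$mis
    s ippi$missis
    i ssippi$miss
    i ppi$mississ

    bw(s) = msspipissii, I = 3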

  • Burrows Wheeler (7): let F be bw(s) and let L be bw(s) sorted. For i = 2, ..., |s|+1, the character Fi follows the character Li in s.

  • F and L for s = mississippi (end-of-string marker written as $):

    i: 1 2 3 4 5 6 7 8 9 10 11 12
    F: m s $ s p i p i s s i i
    L: $ i i i i m p p s s s s
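
The mississippi example can be reproduced with a short sketch. It assumes the variant the example output suggests: an end marker (written `$` here, and assumed smaller than every character of s) is appended, the cyclic shifts are sorted comparing rows from right to left, and the first column is output. The function names `bwt_first_column` and `inverse_bwt` are hypothetical, and the inverse uses the simple quadratic table method rather than the F/L mapping of the slides.

```python
def bwt_first_column(s, end="$"):
    """First column of the cyclic shifts of s+end, sorted right to left."""
    t = s + end
    rows = [t[i:] + t[:i] for i in range(len(t))]
    rows.sort(key=lambda row: row[::-1])   # compare rows right to left
    return "".join(row[0] for row in rows)

def inverse_bwt(first_column, end="$"):
    """Rebuild s from the first column (naive table-based inversion)."""
    n = len(first_column)
    table = [""] * n
    # Read in reverse, the sorted rows are the sorted rotations of the
    # reversed text, and first_column is their last column, so the
    # classic "prepend and sort" reconstruction applies.
    for _ in range(n):
        table = sorted(first_column[i] + table[i] for i in range(n))
    reversed_text = next(row for row in table if row.startswith(end))
    return reversed_text[::-1][:-1]        # un-reverse, drop the marker

column = bwt_first_column("mississippi")
print(column)                              # ms$spipissii
print(column.replace("$", ""), column.index("$") + 1)
print(inverse_bwt(column))
```

Dropping the marker from the column and recording its 1-based position gives the pair (msspipissii, 3) shown in the slides.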

  • Compression Algorithms :

    Huffman Coding (Static & Dynamic)

    Arithmetic Coding

    LZW Coding

  • Static Huffman Coding: Huffman coding assigns a prefix code to the symbols.

    Redundancy bound (?): Pr(s1) + log(2 log e / e) = Pr(s1) + 0.086, where s1 is the most probable symbol and logarithms are base 2.

  • Three steps: 1. count the symbol frequencies; 2. build the prefix code; 3. encode the text.

  • Step 1: counting the symbol frequencies (H-COUNT). A special END symbol is added.

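  • Step 2: building the prefix code. The prefix code is obtained from a Huffman tree: for each symbol a, create a leaf t with weight(t) = freq(a) and label(t) = a; repeatedly take the two trees t1 and t2 of smallest weight and create a tree t3 with subtrees t1, t2 and weight(t3) = weight(t1) + weight(t2). The codewords are then read off by a depth-first search of the tree.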

  • Step 3: encoding the text. First the tree is written, H-ENCODE-TREE(fout, t), then the text is encoded, H-ENCODE-TEXT(fin, fout).

  • Huffman encoding, complete algorithm:

    H-ENCODING(fin, fout)
      H-COUNT(fin)
      t ← H-BUILD-TREE
      H-BUILD-CODE(t, 0)
      H-ENCODE-TREE(fout, t)
      H-ENCODE-TEXT(fin, fout)
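
The counting, tree-building and code-building steps can be sketched compactly. This is a minimal illustration rather than the slides' H-* routines: `build_codes` and `encode_text` are hypothetical names, the END symbol and the serialization of the tree (H-ENCODE-TREE) are left out, and the output is a string of '0'/'1' characters instead of packed bits.

```python
import heapq
from collections import Counter

def build_codes(text):
    """Return {symbol: bit string} for a Huffman code built from text."""
    freq = Counter(text)                      # step 1: count frequencies
    # Heap entries are (weight, tie_breaker, tree); a tree is either a
    # symbol (leaf) or a pair (left_subtree, right_subtree).
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:                      # step 2: merge two lightest
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, tie, (t1, t2)))
        tie += 1
    codes = {}

    def walk(tree, prefix):                   # depth-first traversal
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"       # lone-symbol edge case

    walk(heap[0][2], "")
    return codes

def encode_text(text, codes):
    """Step 3: concatenate the codewords of the symbols of text."""
    return "".join(codes[ch] for ch in text)

codes = build_codes("abracadabra")
bits = encode_text("abracadabra", codes)
print(codes)
print(bits, len(bits))
```

The tie-breaker integer in each heap entry keeps `heapq` from ever comparing two trees, which would fail for mixed leaf and pair values.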

  • Decoding (static Huffman): first the tree is rebuilt, H-REBUILD-TREE(fin, t), then the text is decoded, H-DECODE-TEXT(fin, fout, root).

  • Huffman decoding, complete algorithm:

    H-DECODING(fin, fout)
      create a new node root
      H-REBUILD-TREE(fin, root)
      H-DECODE-TEXT(fin, fout, root)
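
Decoding a prefix code amounts to walking the code tree from the root, one bit at a time, and restarting at the root whenever a leaf is reached. The sketch below assumes the code table is already known as a dictionary (so it stands in for H-REBUILD-TREE only loosely) and that the input is a string of '0'/'1' characters; `rebuild_tree` and `decode_text` are hypothetical names.

```python
def rebuild_tree(codes):
    """Build a nested-dict code tree from {symbol: bit string}."""
    root = {}
    for symbol, word in codes.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = symbol               # leaf holds the symbol
    return root

def decode_text(bits, root):
    """Walk the tree bit by bit; emit a symbol at each leaf."""
    out = []
    node = root
    for bit in bits:
        node = node[bit]
        if not isinstance(node, dict):        # reached a leaf
            out.append(node)
            node = root
    return "".join(out)

example_codes = {"a": "0", "b": "10", "c": "11"}
tree = rebuild_tree(example_codes)
print(decode_text("010110", tree))            # abca
```

Because no codeword is a prefix of another, the first leaf reached is always the right one and no lookahead is needed.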

  • Dynamic Huffman Coding Huffman. .

  • Huffman: Huffman Huffman, siblings

  • Encoding. Step 1: DH-INIT. Step 2: DH-ENCODE-SYMBOL(a, fout) and DH-ADD-NODE(a).

  • Complete encoding algorithm:

    DH-ENCODING(fin, fout)
      DH-INIT
      while not end of file fin and a is the next symbol do
        DH-ENCODE-SYMBOL(a, fout)
        DH-UPDATE(a)
      DH-ENCODE-SYMBOL(END, fout)

  • Decoding. Step 1: initialization. Step 2: the bits are decoded symbol by symbol with DH-DECODE-SYMBOL(fin) until END is reached.

  • Dynamic Huffman decoding, complete algorithm:

    DH-DECODING(fin, fout)
      DH-INIT
      a ← DH-DECODE-SYMBOL(fin)
      while a ≠ END do
        write a in fout
        DH-UPDATE(a)
        a ← DH-DECODE-SYMBOL(fin)

  • DH-UPDATE(a) : n 1 siblings n m (m
  • Canonical Huffman (1): to assign a canonical Huffman code to a set of symbols, suppose that symbol i is assigned a code of li bits, maxlength is the maximum code length, and n is the number of distinct symbols.

    For l = 1 to maxlength do numl[l] = 0
    For i = 1 to n do numl[li] = numl[li] + 1
    (the number of codes of length l is now stored in numl[l])
    Set firstcode[maxlength] = 0
    For l = maxlength-1 downto 1 do
      firstcode[l] = (firstcode[l+1] + numl[l+1]) / 2
    (the integer for the first code of length l is now stored in firstcode[l])
    For l = 1 to maxlength do nextcode[l] = firstcode[l]
    For i = 1 to n do
      codeword[i] = nextcode[li]
      symbol[li, nextcode[li] - firstcode[li]] = i
      nextcode[li] = nextcode[li] + 1

    The rightmost li bits of the integer codeword[i] are the code for symbol i.
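
The assignment procedure above translates almost line for line into code. The sketch below assumes the code lengths come from a Huffman tree (so that the division by 2 is exact) and returns the codewords as bit strings; the function name `canonical_codes` is hypothetical and the `symbol` lookup table used for decoding is omitted.

```python
def canonical_codes(lengths):
    """lengths[i] is the code length li of symbol i; returns bit strings."""
    maxlength = max(lengths)
    numl = [0] * (maxlength + 2)              # numl[l]: codes of length l
    for li in lengths:
        numl[li] += 1
    firstcode = [0] * (maxlength + 2)         # firstcode[maxlength] = 0
    for l in range(maxlength - 1, 0, -1):
        firstcode[l] = (firstcode[l + 1] + numl[l + 1]) // 2
    nextcode = firstcode[:]
    codewords = []
    for li in lengths:
        # Keep the rightmost li bits of the integer as the codeword.
        codewords.append(format(nextcode[li], "0{}b".format(li)))
        nextcode[li] += 1
    return codewords

print(canonical_codes([1, 2, 3, 3]))          # ['1', '01', '000', '001']
```

Codewords of the same length are consecutive integers, which is why a decoder only needs `firstcode` and a per-length symbol table instead of an explicit tree.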

  • Huffman (2)To decode a symbol represented in a canonical Huffman CodeSet v=nextinputbit()Set l=1While v
  • Arithmetic Coding: the text is represented by a number between 0 and 1.

  • Encoding ai (1=
  • Decoding l : o ai , , ai l l :

    l=0. AR-DECODE (l,fout)while l!=0dofind ai such that write ai in file fout
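
The idea of representing a text by a number between 0 and 1 can be illustrated with exact fractions. This sketch is not the AR-DECODE routine of the slides: it assumes a fixed, known probability table and passes the text length to the decoder instead of using a termination test on l. The names `ar_encode`, `ar_decode` and `intervals` are hypothetical.

```python
from fractions import Fraction

def intervals(probs):
    """Map each symbol to (start of its sub-interval, its probability)."""
    table, start = {}, Fraction(0)
    for symbol, p in probs.items():
        table[symbol] = (start, p)
        start += p
    return table

def ar_encode(text, probs):
    """Return the lower end of the interval that represents text."""
    table = intervals(probs)
    low, width = Fraction(0), Fraction(1)
    for a in text:
        start, p = table[a]
        low += width * start                  # narrow to a's sub-interval
        width *= p
    return low

def ar_decode(x, length, probs):
    """Recover `length` symbols from the number x in [0, 1)."""
    table = intervals(probs)
    out = []
    for _ in range(length):
        for symbol, (start, p) in table.items():
            if start <= x < start + p:        # find ai whose interval holds x
                out.append(symbol)
                x = (x - start) / p           # rescale to [0, 1)
                break
    return "".join(out)

probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
code = ar_encode("abca", probs)
print(code)                                   # 11/32
print(ar_decode(code, 4, probs))              # abca
```

Practical coders work with fixed-precision integers and emit bits incrementally; exact fractions are used here only to keep the arithmetic transparent.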

  • LZW Coding: the text is encoded by segments, which are kept in a dictionary. The dictionary can be implemented with hashing or as a trie.

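  • Encoding: the dictionary initially holds all segments of length 1. At each step, a is the next symbol read and w is the current segment. If wa is in the dictionary, w becomes wa. Otherwise the code of w is written, wa is added to the dictionary, and w becomes a.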

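  • Decoding: the decoder rebuilds the same dictionary. When a code c is read, the segment w of c is written, and the segment wa is added to the dictionary, where a is the first symbol of the next segment.

    Special case: when the code read is not yet in the dictionary, the segment is wz, where w is the previous segment and z is its first symbol.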
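
The encoding and decoding rules can be checked with a minimal sketch. It assumes the initial dictionary holds the 256 single-character segments with codes 0 to 255, uses a plain Python dict (hashing) rather than a trie, and returns a list of integer codes instead of a packed bit stream; `lzw_encode` and `lzw_decode` are hypothetical names.

```python
def lzw_encode(text):
    """Return the list of dictionary codes for text (characters < 256)."""
    dictionary = {chr(i): i for i in range(256)}
    w, out = "", []
    for a in text:
        wa = w + a
        if wa in dictionary:
            w = wa                            # keep extending the segment
        else:
            out.append(dictionary[w])         # write the code of w
            dictionary[wa] = len(dictionary)  # add wa
            w = a
    if w:
        out.append(dictionary[w])
    return out

def lzw_decode(codes):
    """Rebuild the text, reconstructing the dictionary on the fly."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    w = dictionary[codes[0]]
    out = [w]
    for c in codes[1:]:
        if c in dictionary:
            segment = dictionary[c]
        elif c == len(dictionary):            # special case: code not yet known
            segment = w + w[0]
        else:
            raise ValueError("invalid LZW code: {}".format(c))
        out.append(segment)
        dictionary[len(dictionary)] = w + segment[0]
        w = segment
    return "".join(out)

codes = lzw_encode("aaaa")
print(codes)                                  # [97, 256, 97]
print(lzw_decode(codes))                      # aaaa
```

The input "aaaa" exercises the special case: the decoder meets code 256 before it has added that entry, and recovers it as the previous segment followed by its own first symbol.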