Deep Machine Learning and Knowledge Techniques for...



Copyright © 2015. All rights reserved.


Abstract

New machine learning methodologies are presented in this dissertation, which interweave computational intelligence techniques, focusing on deep convolutional neural networks and support vector machines, with pre-existing knowledge and the respective semantic representation. The presented

    architecture is the basis for developing data analysis and data classification

    systems that are able, on the one hand to encompass available ontologies or

    knowledge bases and, on the other hand, to efficiently adapt their knowledge to

    different real life conditions, such as different users and context of usage.

    In this work, the above goal is achieved by using deep convolutional neural

    networks for effective extraction of specific features from data and by

    transferring existing knowledge, through kernel methods, to support vector

    machines which are able to adaptively analyse and classify the data.

    In this framework, the presented methodologies are successfully applied for

    user emotion analysis, based on facial feature extraction and analysis, in human

    machine interaction environments.

    Key Words

    Deep Learning, Convolutional Networks, Semantics and Knowledge

    Representation, Kernel Methods, Support Vector Machines, Emotion Analysis,

    Facial Feature Analysis, Human Machine Interaction.


    1. ......................................................................................................................... 1

    2. B ........................................................................... 6

    2.1 T ..................................................................................... 6

    2.1.1 ...................................................................... 6

    2.1.2 Perceptrons ................................................................ 7

    2.1.3 Perceptrons ....................................................... 11

    2.2 ....................................................... 17

    2.2.1 ............................................. 17

    2.2.2 ................................. 22

    2.2.3 .................................. 24

    2.3 ..................................................................................... 29

    3. ................. 34

    3.1 ........................................................................ 34

    3.1.1 ............................................................................................. 34

    3.1.2 ...................................................................................... 36

    3.1.3 ................................................................................................. 39

    3.1.4 ....................................................................... 42

    3.2 (Support Vector Machines) ...................... 43

    3.2.1 ............................................................. 43

    3.2.2 ...................................................... 46

    3.2.3 ........................................................................ 47

    3.3

    ................................................................................................................. 51

    4. ............................................................ 55

    4.1 ............................................................................. 55

    4.2 ............................................ 57

    4.3 .............................................. 63

    5. ............................................... 65

    6. ......................................................................................... 69

  • vi

    6.1 ......................................................................... 69

    6.2 ................................................................................. 72

    6.3 ...................................................... 73

    8. ............................................................... 88

    9. ................................................................................................................ 90

  • vii

  • viii

    1: FAPs....................................................59

    2: () .........................64

  • ix

  • x

    2.1:

    .........................................................................................18

    2.2:

    /...........................................................................................18

    2.3: .........21

    2.4: ..................................22

    2.5: ..............................................23

    2.6: .....................................................................24

    2.7: .................................................30

    2.8: .......................................30

    2.9: ......................................................31

    2.10: ........32

    2.11: ................................33

    4.1: .....................................................................56

    4.2: FDP.....................................................................61

    5.1: ..................68

    5.2: ,

    FAPs.....................................................................68

    6.1: : ) (+,+) ) (+,-)

    ) (-,+) ) (-,-) ) ...............................................................................................72

    6.2: 3 (

    FAPs) 5 : ) (+,+)

    ) (+,-) ) (-,+) ) (-,-) ) ...................................................................................84

    6.3:

    ..................................................................................................................87

  • xi

  • 1

    1.

    ([12]) ,

    ([13]-[15]),

    ([16]-[20]).

    ,

    ,

    . , (

    ), ,

    , , ,

    .

    ,

    (layers) pooling ,

    ,

    (

    ). ,

    pooling -,

    ,

    .

    ([21], [22], [23]),

    ([24]) ([25]),

  • 2

    ([31], [32]).

    (high level representations),

    , (low level cues)

    ,

    , ,

    , .

    . ,

    .

    ([26]).

    .

    ,

    ([27]), , ,

    .

    , ,

    .

    . .

    , ERMIS,

    HUMAINE, CALLAS, CARE (www.image.ntua.gr ),

    .

    , ,

  • 3

    ,

    .

    ,

    .

    .

    ,

    .

    , .., / ,

    ,

    .

    ,

    (connectionist) ,

    ,

    .

    ([39], [40], [41]).

    ,

    , , ,

    , , ,

    , ,

  • 4

    , ,

    . ,

    () .

    ,

    , ,

    ,

    , ,

    ,

    . ,

    ,

    ,

    (

    ) (generalisation) ([37],

    [38]).

    ,

    ([42], [43]).

    ,

    ,

    -

    .

    8 .

  • 5

    2

    , ,

    .

    3

    (SVM),

    SVM.

    4 ,

    .

    5

    2,3,4 .

    6

    ,

    5.

    7

    .

    8 .

  • 6

    2. B

    2.1

    2.1.1

    ()

    - - .

    , ,

    , ,

    , .

    .

    .

    -

    .

Let x_i denote the i-th input of neuron k, w_ki the weight connecting the i-th input to neuron k, and y_k its output, given by:

    y_k = f( Σ_{i=0}^{N-1} x_i w_ki )    (2.1)

where f is the neuron's activation function.

    ,

    , -

    .

    (feedforward multilayer neural networks).

    ().

    . ,

    .

    ( ) ,

    , ()

    ( ,

    ).

2.1.2 Perceptrons

    (machine learning),

    , , , ,

    , .

    (Error Back-

    Propagation back-propagation). ,

    (optimisation theory)

    1

    0

    ( )N

    k i ki

    i

    y x w

  • 8

    perceptrons (multilayer

    perceptrons). ( )

    ,

    , , (desired

    outputs) (supervised training)

    (training data set).

    [ ,x d ], x () d

    x .

    .

For output neuron j and training example x(n), n = 1, 2, ..., N, the error signal is:

    e_j(n) = d_j(n) − y_j(n)    (2.2)

where d_j(n) and y_j(n) are the desired and actual outputs of neuron j for input x(n).

The instantaneous error energy of neuron j is:

    (1/2) e_j²(n)    (2.3)

and the total instantaneous error energy of the network at step n is obtained by summing over all output neurons:

    G(n) = (1/2) Σ_j e_j²(n)    (2.4)

  • 9

    G(n) , ,

    n, back propagation. ,

    , , n,

    () wji(n) wji(n) j

    ( m) i (m-1)

    , m=1, , N.

    , , , ,

    . )(

    )(

    nw

    nG

    ji

    ,

    ,

    n- , ,

    (steepest descent) ([37], [38]) :

    (2.5)

    Newton, ,

    , (

    Hessian ),

    , ,

    , (learning rate) .

    :

    ji j iw (n)= (n) (n)y (2.6)

    j j - m - iy

    i (m-1), i-

    , m=1.

    ji

    ( )w (n)=

    ( )ji

    G n

    w n

  • 10

For a neuron k in the output layer, the local gradient (assuming a sigmoid activation) is:

    δ_k(n) = y_k(n) [1 − y_k(n)] [d_k(n) − y_k(n)]    (2.7)

For a neuron k in a hidden layer m, the local gradient is computed recursively from the layer above:

    δ_k(n) = y_k(n) [1 − y_k(n)] Σ_j δ′_j(n) w_jk    (2.8)

where the δ′_j are the local gradients of the neurons j of layer (m+1) to which neuron k is connected.

    , ,

    ([37], [38]).

    , Back-Propagation

    ,

    , , , (forward pass)

    (reverse pass).

    ,

    , (2.1),

    ,

    () ,

    (2.7).

    , , ,

    , , () -

  • 11

    (2.8) - ,

    .
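As an illustration, the error-correction update of equations (2.6)-(2.7) can be sketched for a single sigmoid neuron. The toy OR data set, learning rate and epoch count below are illustrative assumptions, not values from the text:

```python
import math
import random

# Delta-rule sketch for one sigmoid neuron:
# delta = y(1-y)(d-y)   (eq. 2.7, output layer)
# w <- w + eta*delta*x  (eq. 2.6)

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_neuron(samples, eta=0.5, epochs=2000, seed=0):
    rnd = random.Random(seed)
    n = len(samples[0][0])
    w = [rnd.uniform(-0.5, 0.5) for _ in range(n + 1)]  # +1 for the bias weight
    for _ in range(epochs):
        for x, d in samples:
            xb = list(x) + [1.0]                         # bias input fixed at 1
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, xb)))
            delta = y * (1.0 - y) * (d - y)              # eq. (2.7)
            w = [wi + eta * delta * xi for wi, xi in zip(w, xb)]  # eq. (2.6)
    return w

def predict(w, x):
    xb = list(x) + [1.0]
    return sigmoid(sum(wi * xi for wi, xi in zip(w, xb)))

# Linearly separable toy problem: the OR function
samples = [([0, 0], 0.0), ([0, 1], 1.0), ([1, 0], 1.0), ([1, 1], 1.0)]
w = train_neuron(samples)
preds = [round(predict(w, x)) for x, _ in samples]
```

After training, rounding the sigmoid outputs recovers the OR truth table.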

2.1.3 Perceptrons

    , , .

    - (overtraining)

    (generalisation) .

    (pruning) (constructive) ([44])

    .

    , ,

    ,

    ,

    .

    (adaptation) .

    (multilayer, feedforward networks perceptrons)

    , - .

    , , , ,

    , -

    , . ,

    , , ,

    ,

  • 12

    , ()

    .

multilayer perceptrons

    ,

    , - ,

    (kernel) perceptrons support vector machines.

    (, )

    -, .

    multilayer perceptrons ([45],

    [46]) , ,

    -

    .

    , , , p

    , ix ,

    . p-

the network output for input x_i is the vector of class probabilities:

    y(x_i) = [ p_i1  p_i2  ...  p_ip ]ᵀ    (2.9)

where p_ij is the probability that the i-th input belongs to the j-th class. The original training set is:

    S_b = { (x_1, d_1), ..., (x_mb, d_mb) }

where x_i and d_i, i = 1, 2, ..., m_b, denote the i-th training input and the corresponding desired output.

  • 13

    , ( )iy x

    . i-

    ,

    , ,

    , .

    , ,

    .

    bw ,

    , aw

    .

In the new operational environment a (small) new data set S_c of m_c examples is collected:

    S_c = { (x_1, d_1), ..., (x_mc, d_mc) }    (2.10)

where x_i and d_i, i = 1, 2, ..., m_c, are the i-th new training input and its desired output.

    .

The new weights w_a are estimated by minimising an error criterion of the form:

    E_a = E_{c,a} + η · E_{f,a}    (2.11)

where:

    E_{c,a} = (1/2) Σ_{i=1}^{m_c} ( z_a(x_i) − d_i )²    (2.12)

    E_{f,a} = (1/2) Σ_{i=1}^{m_b} ( z_a(x_i) − d_i )²    (2.13)

  • 14

that is, E_{c,a} measures the error of the retrained network on the new data set S_c, while E_{f,a} measures how well it preserves its behaviour on the former set S_b.

    ( )a iz x ( )a iz x ,

    ix ix ,

    aw . ( )b iz x

    bw , ix .

    , . , , ( )b iz x

    ( )iy x .

    ()

    , 2 L2 - .

    aw .

    ([45]).

    ()

    bw

    . :

    w .

    aw ,

    , Taylor

    .

  • 15

    (2.11) (2.13)

    ,

    . (

    ,

    ),

    , :

    cS .

    , (3.64)

    c A

    .

    ,

    , ,

    bS ,

    - bS -

    .

  • 16

    ,f bE ,f aE , az

    bz .

    :

    K

    bw bS .

    , . ,

    (2.19),

    (2.16) (2.11). gradient

    projection ([45])

    .

    :

    1)

    cS .

    2) ,

    ( cS )

    ( bS ).

    3) cS

    () ,

    . , ,

    - -

    .

    4) , bS ,

    ,

  • 17

    . ,

    , ,

    cS ,

    cS bS ,

    .

    2.2

    2.2.1

    (shallow)

    (deep)

    , ,

    .

    , 2.1,

    ,

    ( Retina - LGN - V1 - V2 - V4 - PIT - AIT ....).

  • 18

    2.1

    .

    ,

    .

    ,

    .

    2.2

    /.

  • 19

    , 2.2.

    :

    E

    . ,

    ,

backpropagation ,

    .

    , ,

    ,

    .

    .

    , ,

    , .

    (.. ).

    /

    , .

    2

    -.

  • 20

    (2.20)

    ,

    ,

    ""(hardware), ( ,

    , ).

    2 ( ),

    .

    ,

    .

    , .

    .

    2.3.

  • 21

    2.3 .

    .

    (

    ). :

    .

    .

    , ,

    Vapnik-Chervonenkis (VC) :

    VC.

    :

    H

    2 .

    Microsoft Google

    .

  • 22

    Microsoft, Google, IBM, Nuance, AT & T

    .

    .

    2.2.2

    . ,

    . pooling,

    , ,

    2.4.

    2.4.

    :

  • 23

    2.5 .

    :

    -

    .

    ,

    , , ( )

    , .

    ,

    .

    ,

    Relu,

    (tanh), (winner-takes-all).

    , , :

  • 24

    (2.22)

    2.2.3 E

    2.6 .

    . feed-forward

    2.6.

  • 25

    ( Wi).

    i forward propagation :

    A :

    :

    (2.26)

    (2.27)

    :

    (2.28)

    Fi Wi .

    (2.29)

  • 26

    :

    (2.30)

    A :

    (2.31)

    :

    (2.32)

    :

    (2.33)

    Fi i-1 .

    Fi 2 .

    .

    ,

    , :

    (2.34)

    O :

  • 27

    (2.35)

    :

    (2.36)

    (2.37)

    M :

    (2.38)

    backward sweep

    back-propagation

    :

    O :

  • 28

    (2.39)

    .

    :

    i [1,n].

    ,

    W:

    (2.40)

    backward propagation :

    (2.41)

  • 29

    , :

    (2.42)

    backward propagation :

    (2.43)

    :

    (2.44)

    tanh .

    2.3

    , , ,

    (tagging) Google Baidu.

, ImageNet, ,

    , , .

    ,

    (/) . ,

    , -, , , RGB-

    , .

  • 30

    2.7.

    2.7

    , , 200x200 ,

    400.000 ,

    16.000.000.000 . 400.000

    10x10 , 40 .

    2.8

  • 31

    ,

    (weight sharing, regions of interest).

    . ,

    , .

    ( ).

using the ReLU activation function:

    f(x) = max(0, x)    (2.45)

    (feature map).

    (2.46)

    (3-) ,

    , 2.9.

    2.9 .

    ,

    .
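The convolution / activation / pooling pipeline described in this section can be sketched on a toy input. The 4x4 image and the 2x2 kernel below are illustrative assumptions:

```python
# One convolution + ReLU + 2x2 max-pooling stage, the basic
# building block of a convolutional network.

def conv2d_valid(img, ker):
    # "valid" 2-D correlation: slide the kernel over every position
    kh, kw = len(ker), len(ker[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(img[i + a][j + b] * ker[a][b]
                            for a in range(kh) for b in range(kw))
    return out

def relu(fm):
    # ReLU non-linearity, eq. (2.45)
    return [[max(0.0, v) for v in row] for row in fm]

def maxpool2(fm):
    # non-overlapping 2x2 max-pooling
    return [[max(fm[i][j], fm[i][j + 1], fm[i + 1][j], fm[i + 1][j + 1])
             for j in range(0, len(fm[0]) - 1, 2)]
            for i in range(0, len(fm) - 1, 2)]

img = [[1, 2, 0, 1],
       [0, 1, 3, 1],
       [1, 0, 1, 2],
       [2, 1, 0, 1]]
ker = [[1, 0],
       [0, -1]]          # a simple difference detector (illustrative)

fmap = maxpool2(relu(conv2d_valid(img, ker)))
```

The same three operations, stacked with many kernels per layer, produce the feature maps discussed in the text.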

  • 32

    2.10 .

    , MNIST, StreetView,

    [2011] GTSRB (IDSIA, NYU),

    INRIA (NYU, 2013),

    (IDSIA, MIT, 2009),

ImageNet (2012),

    (IBM Google, 2012), -

    (IDSIA, 2011), .

    .

  • 33

    2.11 .

  • 34

    3.

    ,

    ,

    (support vector machines).

    ,

    ,

    .

    3.1

    3.1.1

    (kernel methods)

    ([47-50]).

    .

    ( ),

    .

    ,

    .

    , (kernel

    machine) (kernel function).

  • 35

    ,

    .., support vector machines (supervised) ,

support vector clustering (unsupervised) .

    ,

    (hypothesis language),

    .

    , .

    , :

    (feature space).

    .

    ,

    ( ) ,

    .

    (efficiently)

    .

    (inductive bias)

    (language bias) .

    ,

Φ : x ∈ ℝⁿ ↦ Φ(x) ∈ ℝᴺ    (3.1)

  • 36

    ,

    .

    :

    E

    ,

k(x, x′) := ⟨Φ(x), Φ(x′)⟩ = Φ(x)ᵀΦ(x′)    (3.2)

    , ,

    .

    .

    xi

    .

    ,

    .

    3.1.2

    . ,

    k

    , ,

  • 37

    : (3.3)

    :

k(x, x′) := ⟨Φ(x), Φ(x′)⟩ = Φ(x)ᵀΦ(x′)    (3.4)

    , .

    k,

    , .

    k(x, 'x ). ,

    . ,

    (x).

    , , ,

    .

    ,

    .

    .

A function k : X × X → ℝ is a kernel if there exists a mapping Φ : X → H into a Hilbert space H such that:

    k(x, x′) = ⟨Φ(x), Φ(x′)⟩ = Φ(x)ᵀΦ(x′)    (3.5)

  • 38

    x, 'x , k

    .

    .

Definition 1: Gram Matrix
Given a kernel k and inputs x_1, ..., x_m, the m × m matrix with elements K_ij := k(x_i, x_j) is called the Gram matrix (or kernel matrix) of k with respect to x_1, ..., x_m.

Definition 2: Positive Definite Matrix
An m × m matrix K is positive definite if:

    Σ_i Σ_j c_i c_j K_ij ≥ 0    (3.6)

for all real coefficients c_i.

Definition 3: Positive Definite Kernel
A kernel k is positive definite if, for every m and all inputs x_1, ..., x_m, the corresponding Gram matrix is positive definite.
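The condition of Definition 2 can be checked empirically on a small example. The toy points, the choice of the linear kernel and the number of random trials below are illustrative assumptions:

```python
import random

# Empirical check of the positive-definiteness condition (3.6):
# sum_ij c_i c_j K_ij >= 0 for the Gram matrix of the linear kernel.

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram(points, k):
    # Gram matrix K_ij = k(x_i, x_j), Definition 1
    return [[k(p, q) for q in points] for p in points]

points = [(1.0, 2.0), (0.5, -1.0), (3.0, 0.0)]
K = gram(points, linear_kernel)

# symmetry: K_ij = K_ji
symmetric = all(K[i][j] == K[j][i] for i in range(3) for j in range(3))

# quadratic form (3.6) for random coefficient vectors c
rnd = random.Random(0)
nonneg = True
for _ in range(100):
    c = [rnd.uniform(-1, 1) for _ in range(3)]
    q = sum(c[i] * c[j] * K[i][j] for i in range(3) for j in range(3))
    nonneg = nonneg and q >= -1e-9   # tolerance for float round-off
```

For the linear kernel the quadratic form equals ‖Σ_i c_i x_i‖², which is why it can never be negative.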

  • 39

    . ,

    (closure) . ,

    , ,

    , .

    3.1.3

    ,

    .

For x, x′ ∈ ℝⁿ, the simplest choice is the linear kernel:

    k(x, x′) = ⟨x, x′⟩    (3.7)

and its normalised variant:

    k(x, x′) = ⟨x, x′⟩ / ( ‖x‖ · ‖x′‖ )    (3.8)

Beyond these and the RBF kernel, a common choice is the polynomial kernel with parameters c ≥ 0 and p ∈ ℕ⁺:

    k(x, x′) = ( ⟨x, x′⟩ + c )^p    (3.9)

    . p

  • 40

    , c

    c = 0, p

    .

Finally, the RBF (Gaussian) kernel is:

    k(x, x′) = exp( −‖x − x′‖² / (2σ²) )    (3.10)
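The three kernels of this subsection can be sketched directly. The parameter values (c, p, sigma) and the test points are illustrative assumptions:

```python
import math

# Sketch of the linear (3.7), polynomial (3.9) and RBF (3.10) kernels.

def k_linear(x, y):
    return sum(a * b for a, b in zip(x, y))

def k_poly(x, y, c=1.0, p=2):
    # (<x, y> + c)^p
    return (k_linear(x, y) + c) ** p

def k_rbf(x, y, sigma=1.0):
    # exp(-||x - y||^2 / (2 sigma^2))
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

x, y = (1.0, 0.0), (0.0, 1.0)
vals = (k_linear(x, y), k_poly(x, y), k_rbf(x, x))
```

Note that the RBF kernel always assigns similarity 1 to identical inputs, regardless of sigma.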

    , Support Vector Machines,

    () ,

    (support vectors).

    n

    : n )',( xxk = )'(),( xx

    Hilbert .

    .

The simplest example is the matching kernel k_δ : X × X → ℝ:

    k_δ(x, x′) = 1 if x = x′, and 0 otherwise    (3.11)

For this kernel a feature mapping Φ : X → H with k_δ(x, x′) = ⟨Φ(x), Φ(x′)⟩ always exists.

  • 41

    - .

    (structured),

    . o

    -, ,

    .

    ,

    (attribute-value) , (convolution kernel)

    Haussler 1999.

    R

    . x, x ' X , x, 'x

    X1 XD . R : (X1

    XD) X, R-1(x) = {x : R(x,x)}.

Given kernels k_d : X_d × X_d → ℝ for the parts, the convolution kernel is:

    k_conv(x, x′) = Σ_{x̄ ∈ R⁻¹(x)} Σ_{x̄′ ∈ R⁻¹(x′)} Π_{d=1}^{D} k_d(x̄_d, x̄′_d)    (3.12)

    .

    . ,

    ,

    R

    .

  • 42

    .

    , x X

(set intersection kernel or set kernel)

    k_set(A, B) = |A ∩ B|    (3.13)

where |·| denotes set cardinality.

The crossproduct kernel is defined as:

    k(A, B) = Σ_{x ∈ A} Σ_{x′ ∈ B} k_base(x, x′)    (3.14)

where k_base(·,·) is an arbitrary base kernel on the elements.

    3.1.4

    ,

    .

modifier : P × (X × X → ℝ) → (X × X → ℝ)    (3.15)

A kernel modifier takes a parameter set P and a kernel, and returns a new kernel. Common modifiers are:

    polynomial(p,l)(k)(x, x′) = ( k(x, x′) + l )^p    ( l ≥ 0, p ∈ ℕ⁺ )    (3.16)

    gaussian(γ)(k)(x, x′) = exp( −γ [ k(x, x) − 2 k(x, x′) + k(x′, x′) ] )    ( γ > 0 )    (3.17)

    normalised(k)(x, x′) = k(x, x′) / √( k(x, x) · k(x′, x′) )    (3.18)
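The normalised modifier (3.18) can be sketched as a higher-order function. The base polynomial kernel and the test points below are illustrative assumptions:

```python
import math

# Kernel modifier of eq. (3.18):
# k_norm(x, x') = k(x, x') / sqrt(k(x, x) * k(x', x'))

def normalised(k):
    def k_norm(x, y):
        return k(x, y) / math.sqrt(k(x, x) * k(y, y))
    return k_norm

def k_base(x, y):
    # polynomial base kernel (<x, y> + 1)^2, an illustrative choice
    return (sum(a * b for a, b in zip(x, y)) + 1.0) ** 2

kn = normalised(k_base)
x, y = (1.0, 2.0), (-3.0, 0.5)
self_sim = kn(x, x)              # normalisation forces k_norm(x, x) = 1
bounded = -1.0 <= kn(x, y) <= 1.0
```

For any positive definite base kernel the normalised values stay in [−1, 1], by Cauchy-Schwarz in the feature space.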

    3.2

    (Support Vector Machines)

    . 2.3

    2. Perceptron

    , .

    3.2.1

Consider two linearly separable classes C0 and C1. We seek a weight vector w and bias w0 such that:

    wᵀx + w0 ≥ 0 for x ∈ C0,  and  wᵀx + w0 < 0 for x ∈ C1    (3.19)

  • 44

    , (w,w0)

    , .

The margin is defined as ρ = ρ0 + ρ1, where ρ0 and ρ1 are the distances of the hyperplane from the closest points of C0 and C1 respectively:

    ρ0 = min_{x ∈ C0} |wᵀx + w0| / ‖w‖    (3.20)

    ρ1 = min_{x ∈ C1} |wᵀx + w0| / ‖w‖    (3.21)

The points x ∈ C0 and x′ ∈ C1 that attain these minima are called support vectors.

By appropriately scaling w and w0, the hyperplane can be put in canonical form:

    wᵀx + w0 ≥ +1 for x ∈ C0,  and  wᵀx + w0 ≤ −1 for x ∈ C1    (3.22)

so that ρ0 = ρ1 = 1/‖w‖ and the total margin becomes ρ = 2/‖w‖. Maximising the margin is therefore equivalent to minimising:

    J(w, w0) = (1/2) ‖w‖²    (3.23)

subject to the P constraints:

    d_i (wᵀx_i + w0) ≥ 1,  i = 1, ..., P    (3.24)

Introducing Lagrange multipliers α_i leads to the dual problem of maximising:

    L(α_1, ..., α_P) = Σ_{i=1}^P α_i − (1/2) Σ_{i=1}^P Σ_{j=1}^P α_i α_j d_i d_j x_iᵀx_j    (3.25)

The maximisation over α_1, ..., α_P is carried out subject to:

    Σ_{i=1}^P α_i d_i = 0,  α_i ≥ 0, i = 1, ..., P    (3.26)

The optimal weight vector is then obtained from the support vectors:

    w = Σ_{i ∈ SV} α_{o,i} d_i x_i    (3.27)

where SV is the set of support-vector indices and α_{o,i} are the optimal Lagrange multipliers of (3.26). The resulting discriminant function is:

  • 46

    g*(x) = Σ_{i ∈ SV} α_{o,i} d_i x_iᵀx + w0    (3.28)

with the bias computed by averaging over the support vectors:

    w0 = (1/|SV|) Σ_{i ∈ SV} ( 1/d_i − wᵀx_i )    (3.29)
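Equations (3.27)-(3.28) can be verified on a 1-D toy problem whose optimal multipliers are known in closed form. For the two points x1 = +1 (d1 = +1) and x2 = −1 (d2 = −1), the hard-margin solution α1 = α2 = 0.5, w = 1, w0 = 0 is worked out by hand here, i.e. an assumption, not a value from the text:

```python
# Toy verification of eqs (3.27)-(3.28) with hand-computed multipliers.
xs = [1.0, -1.0]
ds = [1.0, -1.0]
alphas = [0.5, 0.5]          # closed-form optimum for this toy case

# eq. (3.27): w = sum_i alpha_i d_i x_i over the support vectors
w = sum(a * d * x for a, d, x in zip(alphas, ds, xs))
w0 = 0.0

# eq. (3.28): decision function g*(x) = sum_i alpha_i d_i x_i x + w0
def g(x):
    return sum(a * d * xi * x for a, d, xi in zip(alphas, ds, xs)) + w0

# both training points lie exactly on the margin: d_i * g(x_i) = 1
margins = [d * g(x) for x, d in zip(xs, ds)]
```

The multipliers also satisfy the constraint (3.26), since α1 d1 + α2 d2 = 0.5 − 0.5 = 0.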

    3.2.2 M

    ,

    , ,

In the non-separable case, slack variables ξ_i are introduced and the objective becomes:

    J_ns(w, w0, ξ) = (1/2) ‖w‖² + C Σ_{i=1}^P ξ_i    (3.30)

subject to the P constraints:

    d_i (wᵀx_i + w0) ≥ 1 − ξ_i,  ξ_i ≥ 0    (3.31)

for i = 1, ..., P, where the parameter C controls the trade-off between margin width and the training errors ξ_i.

The dual L_d(α_1, ..., α_P) is again given by (3.25), now maximised subject to:

    Σ_{i=1}^P α_i d_i = 0,  0 ≤ α_i ≤ C, i = 1, ..., P    (3.32)

and the optimal weight vector is again given by (3.27).

    ,

    ,

    .

    3.2.3

    ,

    , . :

    (),

    .

    , .

The theoretical basis for this transformation is Cover's theorem on the separability of patterns:

A complex pattern-classification problem, cast non-linearly into a high-dimensional feature space, is more likely to be linearly separable there than in the original low-dimensional space.

Following Cover's theorem, the classification problem is therefore first mapped into a high-dimensional feature space, where a separating hyperplane is sought.

    ,

    .

Let x be mapped through the feature functions {φ_j(x)}, j = 1, ..., m1, into an m1-dimensional feature space. A hyperplane in that space is defined by:

    Σ_{j=0}^{m1} w_j φ_j(x) = 0   (with φ_0(x) = 1 for all x)    (3.33)

or, in compact form:

    wᵀφ(x) = 0    (3.34)

As in the linear case, the optimal weight vector is expressed in terms of the training data:

    w = Σ_{i=1}^P α_i d_i φ(x_i)    (3.35)

where φ(x_i) is the image of input x_i in the feature space.

  • 49

Substituting (3.35) into (3.34), the decision surface becomes:

    Σ_{i=1}^P α_i d_i φ(x_i)ᵀφ(x) = 0    (3.36)

where φ(x_i)ᵀφ(x) is the inner product of the feature images of x and x_i. This defines the inner-product kernel:

    k(x, x_i) = φ(x_i)ᵀφ(x) = Σ_{j=0}^{m1} φ_j(x) φ_j(x_i),  i = 1, 2, ..., P    (3.37)

which is symmetric:

    k(x, x_i) = k(x_i, x) for all i    (3.38)

Using the kernel, the optimal decision surface can be written as:

    Σ_{i=1}^P α_i d_i k(x, x_i) = 0    (3.39)

so that, thanks to (3.37), all computations are carried out through kernel evaluations, without ever forming the feature space explicitly.

The dual optimisation problem then becomes: given the training sample {(x_i, d_i)}, i = 1, ..., P, find the Lagrange multipliers α_i, i = 1, ..., P, that maximise:

    Q(α) = Σ_{i=1}^P α_i − (1/2) Σ_{i=1}^P Σ_{j=1}^P α_i α_j d_i d_j k(x_i, x_j)    (3.40)

subject to:

1)  Σ_{i=1}^P α_i d_i = 0

2)  0 ≤ α_i ≤ C,  i = 1, 2, ..., P

where k(x_i, x_j) is the ij-th element of the P × P Gram matrix of the kernel k:

    K = [ k(x_i, x_j) ],  i, j = 1, ..., P    (3.41)

From the optimal Lagrange multipliers α_{o,i}, the optimal weight vector in the feature space is obtained as:

    w_o = Σ_{i=1}^P α_{o,i} d_i φ(x_i)    (3.42)

which completes the solution.

  • 51

    3.3

    ([38])

    ,

    .

Let K = ⟨T, A⟩ be a knowledge base with TBox T and ABox A, and let Ind(A) denote the individuals occurring in A. Given a set of concepts F = { F_1, F_2, ..., F_m } defined over T, a family of kernel functions k_p^F : Ind(A) × Ind(A) → [0,1], inspired by the L_p norms, is defined for a, b ∈ Ind(A) as:

    k_p^F(a, b) := [ (1/m) Σ_{i=1}^m κ_i(a, b)^p ]^{1/p}    (3.43)

where p > 0 and, for each i ∈ {1, ..., m}, the (simple) matching function κ_i is defined for a, b ∈ Ind(A) as:

    κ_i(a, b) = 1,   if ( F_i(a) ∈ A ∧ F_i(b) ∈ A ) ∨ ( ¬F_i(a) ∈ A ∧ ¬F_i(b) ∈ A )
    κ_i(a, b) = 0,   if ( F_i(a) ∈ A ∧ ¬F_i(b) ∈ A ) ∨ ( ¬F_i(a) ∈ A ∧ F_i(b) ∈ A )
    κ_i(a, b) = 1/2, otherwise    (3.44)

or, in the entailment-based variant:

    κ_i(a, b) = 1,   if ( K ⊨ F_i(a) ∧ K ⊨ F_i(b) ) ∨ ( K ⊨ ¬F_i(a) ∧ K ⊨ ¬F_i(b) )
    κ_i(a, b) = 0,   if ( K ⊨ F_i(a) ∧ K ⊨ ¬F_i(b) ) ∨ ( K ⊨ ¬F_i(a) ∧ K ⊨ F_i(b) )
    κ_i(a, b) = 1/2, otherwise    (3.45)
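The feature-match kernel of eqs (3.43)-(3.44) can be sketched without a reasoner by representing each individual as a tuple of truth values, one per feature F_i, with None standing for "membership unknown". This flattening, the toy individuals and the choice p = 1 are illustrative assumptions:

```python
# Sketch of eqs (3.43)-(3.44): score 1 when a and b agree on feature
# F_i, 0 when they disagree, 1/2 when membership is unknown, then
# combine the per-feature scores.

def kappa(fa, fb):
    # fa, fb in {True, False, None}; None means "unknown"
    if fa is None or fb is None:
        return 0.5
    return 1.0 if fa == fb else 0.0

def k_match(a, b, p=1):
    # a, b: tuples of truth values for features F_1..F_m, eq. (3.43)
    m = len(a)
    s = sum(kappa(fa, fb) ** p for fa, fb in zip(a, b))
    return (s / m) ** (1.0 / p)

a = (True, False, None)
b = (True, True, None)
sym_ok = k_match(a, b) == k_match(b, a)
val = k_match(a, b)      # (1 + 0 + 0.5) / 3 = 0.5
```

By construction the kernel is symmetric, as any valid kernel must be.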

    , ,

    ,

    Aox i Fi

    1 ():

    Fi , Fi , ABox,

    TBox

    . 0 (): , ..,

    a Fi , b (

    Fi), .

    .

    - - Fi ,

    , Fi ,

    . ,

    - - Fi

    .

    ,

    -

    - i ,

    , , .

  • 53

    , (instance checking),

    ,

    i. , ,

    (

    ). ,

    ABox,

    , (3.44) i .

    . ,

    ,

    .

    ,

    (3.44) - (3.45).

    F

    ( ).

    ,

    .

    - -

    ( ).

    p > 0 F, Fpk

    ( ) .

  • 54

    Fpk .

    ,

    ,

    (closure) .

    , i (i = 1,...,n)

    (matching kernels), Fpk

    ,

    .

    SVM ,

    , ,

    ([52], [53]).

    , ,

    SVM

    :

Given a knowledge base K = ⟨T, A⟩, the individuals Ind(A) of A, and a set of (possibly overlapping) classes C = { C_1, ..., C_s } defined over K, the classification task is:

    for each a ∈ Ind(A), find the subset { C_1, ..., C_t } ⊆ C such that K ⊨ C_i(a) for all i ∈ {1, ..., t}    (3.46)

  • 55

    4.

    ,

    ([62], [69],

    [72]).

    4.1

    ,

    , 6 ,

    . ,

An alternative to discrete emotion categories is the dimensional representation of Whissel ([1]), in which every emotional state is placed on a two-dimensional wheel whose axes are activation and evaluation.

    -

    .

    , .

  • 56

, as shown in Figure 4.1:

Valence: expresses how positive or negative the emotional state is; emotions are placed on a positive/negative (pleasant/unpleasant) axis.

Activation: expresses how active or passive the emotional state is, ranging from highly activated states (e.g. excitement) to low-activation, passive states (e.g. calmness).

    4.1

    o

    - .

  • 57

    - ,

    . ,

    (neutral).

    - .

    , ,

    .

    .
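The quadrant coding used throughout the text maps a (valence, activation) pair from the Whissel wheel to one of the four quadrants (+,+), (+,-), (-,+), (-,-) or to the neutral class. A minimal sketch follows; the neutral threshold eps is an illustrative assumption:

```python
# Map a (valence, activation) pair to its quadrant label or "neutral".

def quadrant(valence, activation, eps=0.1):
    # states close to the origin of the wheel are treated as neutral
    if abs(valence) < eps and abs(activation) < eps:
        return "neutral"
    return ("+" if valence >= 0 else "-",
            "+" if activation >= 0 else "-")

labels = [quadrant(0.7, 0.5), quadrant(0.6, -0.4),
          quadrant(-0.5, 0.8), quadrant(-0.3, -0.6),
          quadrant(0.02, -0.03)]
```

This reduces continuous dimensional annotations to the five classes used by the classifiers of the following chapters.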

    4.2

A widely used system for describing facial activity is the Facial Action Coding System (FACS) ([56], [60], [61]). FACS influenced the facial definition and animation framework of the ISO MPEG-4 standard ([68], [71]).

    ISO MPEG-4,

    (synthetic natural hybrid coding, SNHC). -

    SNHC MPEG-4 :

    MPEG-4 1 : .

    (avatars) FBA (

  • 58

    ) o MPEG-4

    1998. T

    . , Facial Definition Points (FDPs)

    Facial

    Animation Parameters (FAPS) .

    MPEG-4 2 : . FDPs

    FAPs, BDPs

    avatar s .

    MPEG-4 16: . MPEG-4

    , (Bone-based Animation).

    16 MPEG-4,

    ,

    (SMS).

    MPEG-4,

    (FBA)

    . FBA

    cartoons. , MPEG-4

    .

    , MPEG-4 84 FDPs

    ,

    FAPs. , (FDP)

    (FAP) MPEG-4

  • 59

    ( FDPs)

    , ( FAPs).

    FAPs

    MPEG-4,

    , .

    FAPs ,

    FDPs, ,

    (FPs),

    .

    1: FAPs

  • 60

Table 1 relates the FAPs to the 15 features f_i (i = 1..15).

    ,

    - ,

    . , s(x, y),

    x y,

    4.2.

    .

    1 s(x,y)

    FPs, Di-NEUTRAL Di

    .

  • 61

    .

    4.2 FDP.

    FAPs.

    ..

    ([57]).

    /

    - ([51], [59], [65]). ,

  • 62

    , , ,

    ([54], [58]).

    / ,

    ([64]).

    , , -

    .

    .

    ,

    ([63])

    .

    , , ,

    .

    ,

    ( )

    , blobs

    .

    , .

    FPs.

  • 63

    4.3

    MPEG-4

    . FAPs,

    MPEG-4

    ([2]), (AUs),

    (FACS) ([3],

    [67]).

    Whissel ( 4.1),

    ( -

    - ).

    ,

    FAPs ([4]).

    FAPs , , ,

    ([5]).

    2.

    ([5]).

  • 64

    2: ()

    FAPs

    2 F3_M+F4_L+F5_L+[F53+F54]_H

    +[F19+F21]_H+[F20+F22]_H

    (+,+)

    7 F3_L+F4_L+F5_H+[F53+F54]_H+

    [F19+F21]_H+[F20+F22]_H+[F37

    +F38]_M+F59_H+F60_H

    (+,+)

    13 F3_L+F4_M+F5_H+F31_L+F32_L

    +F33_L+F34_L+F37_H+F38_H+

    F59_M+F60_M

    (-,+)

    16 F3_L+F4_M+F5_L+F31_L+F32_L

    +F33_L+F34_L+F37_H+F38_H+

    F59_M+F60_M

    (-,+)

    26 F3_M+F5_L+[F19+F21]_H+[F20+

    F22]_H+F31_H+F32_H+F33_M+

    F34_M+F35_M+F36_M+[F37+F3

    8]_H

    (-,-)

    34 F3_L+F4_L+[F53+F54]_L+F31_M

    +F32_M+F33_M+F34_M+F35_M

    +F36_M+F37_M+F38_M

    (-,-)

    41 F3_L+F4_M+F31_M+F32_M+F33

    _M+F34_M+F35_M+F36_M+F37

    _M+F38_M+ F59_M+F60_M

    Neutral

  • 65

    5.

    ,

    :

    ,

    () - .

    , ,

    ,

    . ,

    , .

    ,

    ,

    , (SVM).

    SVM

    , - SVM

    .

  • 66

    ,

    . , ,

    ,

    .

    , .

    FAPs -

    FAPs

    4 .

    .

    ,

    ,

    .

    .

    5.1. ( PAL)

    (

    250x250 ) , -

    .

  • 67

    4 .

    pooling.

    , softmax ,

    .

    ( )

    (Relu).

    (

    5.2)

    - .

    ,

    ([33]) ( ) ,

    ,

    . ,

    ,

    ,

    .

    ,

    .

    (FAPs), ,

    4,

    - . ,

    FAPs .

  • 68

    5.1 .

    SVM

    -

    . -

    -- (

    , FAPs) , SVM

    SVM .

    SVM

    . 5.2

    .

    5.2 ,

    FAPs.

  • 69

    6.

    6.1

For the experiments, naturalistic data were used, collected at Queen's University Belfast (QUB) in the framework of EC FP5 IST and EC FP6 IST projects, the Humaine Network of Excellence, and the EC FP7 IST project Semaine.

    , 5,

    ,

    .

    6.1 4

    - .

    ,

    ,

    .

  • 70

Figure 6.1: a) (+,+), b) (+,-), c) (-,+), d) (-,-), e) neutral.

    6.2

    ,

    (Dr Roddie Cowie, Professor QUB),

    , , .

    ,

    ,

    (FAPs) ,

    .

  • 73

    ,

    FAPs,

    . ,

    ( , )

    .

    ( ),

    FAPs

    ..

    ([6], [55], [66], [70]).

    6.3

The data set contains 11,152 images in total, of which 40% belong to the (+,+) quadrant of Figure 4.1, 30% to the (-,+) quadrant, 15% to (+,-), 10% to (-,-) and 5% to the neutral class. Face detection was performed ([51], OpenCV), and the detected facial regions were rescaled to 256 x 256 pixels.

  • 74

    - -

    )

    - , )

    FAP (Low, Medium, High ).

    6.2

    ( FAPs)

    5 (/,

    /, ).

  • 75

  • 76

Figure 6.2: Results of the 3 networks (including the FAP-based one) for the 5 categories: a) (+,+), b) (+,-), c) (-,+), d) (-,-), e) neutral.

    ,

The deep convolutional network architecture of Chapter 4 was trained as described in Chapter 5, using the Caffe framework. Max-pooling was used in the pooling layers. The two convolutional layers contain 20 and 50 kernels respectively, of size 5x5. The training parameters were: learning rate 0.001 (0.002 for the bias terms), momentum 0.9 (also for the bias terms), and weight decay 0.004. The pooling regions were of size 2x2.

    -

    , 1.

The convolutional layers are followed by a fully connected layer of 100 ReLU units and a 5-unit Softmax output layer (one unit per class). The network achieved a classification accuracy of 83% on the test data (using an 80-20% train-test split).

In the second configuration, the fully connected ReLU layer contains 500 units and the output layer 210 units, corresponding to the FAP-based representation. Prediction of the FAP-based outputs reached an accuracy of 95%, showing that the FAP values can be reliably extracted by the network.

    ,

    FAPs, ([5]),

    ( - )

    210 FAPs. 67 (

    32%)

    ,

    w FAPs

    .

  • 86

Following the methodology of Chapter 5 and the knowledge of ([5]), SVM classifiers with the kernels of (3.43)-(3.45) were then trained on the 67 problematic cases. An SVM trained directly on the FAP values reached an accuracy of 60%, while the knowledge-based SVM, using 50 selected FAPs as features, raised the accuracy to 85%. The performance of the SVM classifiers (per class) is shown in Figure 6.3.

  • 87

    6.3:

  • 88

    7.

    ,

    ,

    ,

    .

    . ,

    ,

    ,

    ,

    .

    ,

    ([73]).

    , :

    .

  • 89

    ( ,

    ).

    .

    .

    .

    .

    (integrating

    feed-forward and feedback transfer of information). H Boltzmann

    , .

    ,

    (),

    ""

    ,

    , .. Stephane

    Mallat, .

    ,

    .

  • 90

    8.

    [1] C. M. Whissel, The dictionary of affect in language, R. Plutchnik and H.

    Kellerman (Eds) Emotion: Theory, research and experience: vol 4, The measurement

    of emotions. Academic Press, New York, 1989.

    [2] M. Tekalp, "Face and 2-D Mesh Animation in MPEG-4", Tutorial Issue On The

    MPEG-4 Standard, Image Communication Journal, Elsevier, 1999.

    [3] P. Ekman and W. Friesen, The Facial Action Coding System, Consulting

    Psychologists Press, San Francisco, CA, 1978.

    [4] Raouzaiou, A., Tsapatsoulis, N., Karpouzis, K., & Kollias, S. (2002).

    Parameterized facial expression synthesis based on MPEG-4. EURASIP Journal on

Applied Signal Processing, 2002(10), 1021–1038. Hindawi Publishing Corporation.

    [5] S. Ioannou, A. Raouzaiou, V. Tzouvaras, T. Mailis, K. Karpouzis, S. Kollias,

    "Emotion recognition through facial expression analysis based on a neurofuzzy

    network", Special Issue on Emotion: Understanding & Recognition, Neural

    Networks, Elsevier, Volume 18, Issue 4, May 2005, Pages 423-435.

    [6] S. Asteriadis, P. Tzouveli, K. Karpouzis, S. Kollias, "Non-verbal feedback on user

    interest based on gaze direction and head pose", 2nd International Workshop on

    Semantic Media Adaptation and Personalization (SMAP 2007), London, United

    Kingdom, 17-18 December 2007.

[7] Fanizzi, N., d'Amato, C., Esposito, F.: Statistical learning for inductive query answering on OWL ontologies. In: Proceedings of the 7th International Semantic Web Conference (ISWC), pp. 195–212 (2008).

[8] Avila Garcez, A.S., Broda, K., Gabbay, D.: Symbolic knowledge extraction from trained neural networks: A sound approach. Artificial Intelligence 125, 155–207

    (2001).

  • 91

[9] Avila Garcez, A.S., Broda, K., Gabbay, D.: The connectionist inductive learning and logic programming system. Applied Intelligence, Special Issue on Neural Networks and Structured Knowledge 11, 59–77 (1999).

[10] Hitzler, P., Hölldobler, S., Seda, A.: Logic programs and connectionist networks. Journal of Applied Logic, 245–272 (2004).

[11] Pinkas, G.: Propositional non-monotonic reasoning and inconsistency in symmetric neural networks. In: Proceedings of the 12th International Joint Conference on Artificial Intelligence, pp. 525–530 (1991).

[12] Y. Bengio, Learning deep architectures for AI, Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.

[13] N. Morgan, Deep and wide: Multiple layers in automatic speech recognition, Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 7–13, 2012.

[14] A. Mohamed, G.E. Dahl, and G. Hinton, Acoustic modeling using deep belief networks, Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 14–22, 2012.

[15] G. Sivaram and H. Hermansky, Sparse multilayer perceptron for phoneme recognition, Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 23–29, 2012.

[16] H. Lee, C. Ekanadham, and A. Ng, Sparse deep belief net model for visual area V2, Advances in Neural Information Processing Systems, vol. 20, pp. 873–880, 2008.

[17] Y. Tang and C. Eliasmith, Deep networks for robust visual recognition, in International Conference on Machine Learning. Citeseer, 2010, vol. 28.

[18] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, Unsupervised learning of hierarchical representations with convolutional deep belief networks, Communications of the ACM, vol. 54, no. 10, pp. 95–103, 2011.

[19] K. Sohn, D.Y. Jung, H. Lee, and A.O. Hero, Efficient learning of sparse, distributed, convolutional feature representations for object recognition, in Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011, pp. 2643–2650.

  • 92

[20] A. Krizhevsky, I. Sutskever, and G.E. Hinton, Imagenet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems 25, 2012, pp. 1106–1114.

[21] Krizhevsky, A., Sutskever, I., Hinton, G.: Imagenet classification with deep convolutional neural networks. In: NIPS, pp. 1106–1114 (2012).

B. Schoelkopf, A. Smola, Learning with Kernels, The MIT Press, Cambridge, 2002.

    [22] Neverova, N., Wolf, C., Taylor, G.W., Nebout, F.: Moddrop: adaptive multi-

    modal gesture recognition. arXiv:1501.00102 (2014).

    [23] Kahou, S.E., Froumenty, P., Pal, C.: Facial expression analysis based on high

    dimensional binary features. In: ECCV Workshop on Computer Vision with Local

    Binary Patterns Variants. Zurich, Switzerland (2014).

    [24] Kalchbrenner, N., Grefenstette, E., Blunsom, P.: A convolutional neural network

    for modelling sentences. arXiv:1404.2188 (2014).

[25] Hinton, G., Deng, L., Yu, D., Dahl, G.E., Mohamed, A.R., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T.N., et al.: Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Sig. Proc. Magazine, 29(6), 82–97 (2012).

[26] C.N. Anagnostopoulos, T. Iliou, and I. Giannoukos, Features and classifiers for emotion recognition from speech: a survey from 2000 to 2011, Artificial Intelligence Review, pp. 1–23, 2012.

[27] M. El Ayadi, M.S. Kamel, and F. Karray, Survey on speech emotion recognition: Features, classification schemes, and databases, Pattern Recognition, vol. 44, no. 3, pp. 572–587, 2011.

[28] D. Ververidis and C. Kotropoulos, Fast and accurate sequential floating forward feature selection with the Bayes classifier applied to speech emotion recognition, Signal Processing, vol. 88, no. 12, pp. 2956–2970, 2008.

[29] C. Busso, Z. Deng, S. Yildirim, M. Bulut, C.-M. Lee, A. Kazemzadeh, S. Lee, U. Neumann, and S. Narayanan, Analysis of emotion recognition using facial expressions, speech and multimodal information, in Proceedings of the 6th International Conference on Multimodal Interfaces. ACM, 2004, pp. 205–211.

  • 93

[30] G.W. Taylor, G.E. Hinton, and S.T. Roweis, Modeling human motion using binary latent variables, Advances in Neural Information Processing Systems, vol. 19, pp. 1345–1352, 2007.

[31] S.E. Kahou, et al., Facial expression analysis based on high dimensional binary features, ECCV Workshop on Computer Vision, Zurich, Switzerland, 2014.

[32] Y. Kim, H. Lee and E.M. Provost, Deep learning for robust feature generation in audiovisual emotion recognition, ICASSP, Vancouver, British Columbia, Canada, May 2013.

N. Fanizzi, C. d'Amato. A declarative kernel for ALC concept descriptions. In: F. Esposito, Z.W. Ras, D. Malerba, G. Semeraro (eds) ISMIS 2006. LNCS (LNAI), Springer, vol. 4203, pp. 322–331, 2006.

    [33] S.E. Kahou, et al, Emonets: Multimodal deep learning approaches for emotion

    recognition in video, arXiv:1503.01800v2, 30 March 2015.

    [34] Y. Li, S. Wang, Y. Zhao and Q. Ji, Simultaneous facial feature tracking and

    facial expression recognition, IEEE Transaction on Image Processing, vol.22, pp.

    2559-2573, 2013

    [35] A. Saeed, A. Al-Hamadi, R. Niese and M. Elzobi, Frame-based facial

    expression recognition using geometric features, Advances in Human-Computer

    Interaction, vol. 2014, pp. 1-13, 2014

    [36] A. C. Cruz, B. Bhanu and N. S. Thakoor, Vision and attention theory base

    esampling for continuous facial emotion recognition, IEEE Transaction on Affective

    Computing, vol. 5, pp. 418-431, 2014

    [37] S. Haykin. Neural Networks: A Comprehensive Foundation, Prentice Hall

    International, Inc, 1999.

    [38] . . , , 2007.

    [39] Sebastian Bader, Pascal Hitzler, Steffen Hölldobler. Connectionist Model Generation: A First-Order Approach, Neurocomputing, Elsevier, 2008.


    [40] B. Hammer, P. Hitzler (eds.). Perspectives of Neural-Symbolic Integration. Studies in Computational Intelligence, Vol. 77. Springer, ISBN 978-3-540-73952-1, 2007.

    [41] A. Chortaras, G. Stamou, A. Stafylopatis, Adaptation of Weighted Fuzzy Programs, Lecture Notes in Computer Science, 4132, pp. 45-54, 2006.

    [42] W. Duch. Similarity based methods: a general framework for classification, approximation and association, Control and Cybernetics 29 (4), pp. 937-968, 2000.

    [43] M. Ehrig, P. Haase, M. Hefke, N. Stojanovic. Similarity for ontologies: a comprehensive framework. In: Proceedings of the 13th European Conference on Information Systems, ECIS 2005.

    [44] N. Fanizzi, C. d'Amato, and F. Esposito. Randomized metric induction and evolutionary conceptual clustering for semantic knowledge bases. In Proceedings of CIKM '07, 2007.

    [45] N. Simou, Th. Athanasiadis, S. Kollias, G. Stamou, and A. Stafylopatis. Semantic adaptation of neural network classifiers in image segmentation. 18th International Conference on Artificial Neural Networks (ICANN 2008), pp. 907-916, Prague, Czech Republic, September 2008.

    [46] N. Doulamis, A. Doulamis, and S. Kollias. On-line retrainable neural networks: Improving performance of neural networks in image analysis problems. IEEE Transactions on Neural Networks 11, pp. 1-20, 2000.

    [47] B. Schoelkopf, A. Smola. Learning with Kernels, The MIT Press, Cambridge, 2002.

    [48] T. Gartner, J. Lloyd, P. Flach. Kernels and distances for structured data. Machine Learning, 57, pp. 205-232, 2004.

    [49] J. Shawe-Taylor, N. Cristianini. Kernel Methods for Pattern Analysis, Cambridge University Press, 2004.


    [50] C. Bishop. Pattern Recognition and Machine Learning, Springer, 2006.

    [51] Ming-Hsuan Yang, David J. Kriegman, and Narendra Ahuja, Detecting Faces in Images: A Survey, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, January 2002.

    [52] C.J.C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 2(2), 1998.

    [53] E. Osuna, R. Freund, and F. Girosi, Training support vector machines: An application to face detection, In Proceedings of CVPR'97, Puerto Rico, 1997.

    [54] J. Ahlberg, An Optimisation Approach to facial feature extraction, Proc. of IEEE Workshop on Real-Time Analysis and Tracking of Face and Gesture in Real-Time Systems, Kerkyra, Greece, September 1999.

    [55] A.L. Yuille, D.S. Cohen and P.W. Hallinan, Feature Extraction from Faces Using Deformable Templates, International Journal of Computer Vision, vol. 8, no. 2, pp. 99-111, August 1992.

    [56] R. Chellappa, C.L. Wilson and S. Sirohey, Human and Machine Recognition of Faces: A Survey, Proc. of the IEEE, vol. 83, no. 5, pp. 705-740, May 1995.

    [57] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz and J.G. Taylor, Emotion Recognition in Human Computer Interaction, IEEE Signal Processing Magazine, vol. 18, no. 1, Jan. 2001.

    [58] P. Eisert and B. Girod, Model-Based Estimation of Facial Expression Parameters from Image Sequences, Proc. of IEEE International Conference on Image Processing, 1997.

    [59] Y. Tian, T. Kanade and J.F. Cohn, Multi-State Based Facial Feature Tracking and Detection, tech. report CMU-RI-TR-99-18, Robotics Institute, Carnegie Mellon University, Aug. 1999.


    [60] L.D. Harmon, M.K. Khan, R. Lasch and P.F. Ramig, Machine Identification of Human Faces, Pattern Recognition, vol. 13, no. 2, pp. 97-110, 1981.

    [61] X. Jia and M.S. Nixon, Extending the Feature Vector for Automatic Face Recognition, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 17, no. 12, pp. 1167-1176, December 1995.

    [62] K. Karpouzis, G. Votsis, N. Tsapatsoulis and S. Kollias, Compact 3D Model Generation based on 2D Views of Human Faces: Application to Face Recognition, Machine Graphics and Vision, vol. 7, no. 1-2, pp. 75-85, 1998.

    [63] M. Kass, A. Witkin and D. Terzopoulos, Snakes: Active Contour Models, International Journal of Computer Vision, vol. 1, no. 4, pp. 321-331, 1988.

    [64] B.K. Low and M.K. Ibrahim, A Fast and Accurate Algorithm for Facial Feature Segmentation, Proc. of IEEE International Conference on Image Processing, 1997.

    [65] K. Sobottka and I. Pitas, Looking for Faces and Facial Features in Color Images, Pattern Recognition and Image Analysis: Advances in Mathematical Theory and Applications, vol. 7, no. 1, 1997.

    [66] D. Metaxas, Deformable Model and HMM-Based Tracking, Analysis and Recognition of Gestures and Faces, Proc. of IEEE Workshop on Real-Time Analysis and Tracking of Face and Gesture in Real-Time Systems, Kerkyra, Greece, September 1999.

    [67] A. Samal and P.A. Iyengar, Automatic Recognition and Analysis of Human Faces and Facial Expressions: A Survey, Pattern Recognition, vol. 25, no. 1, pp. 65-76, 1992.

    [68] M. Preda and F. Prêteux, Advanced animation framework for virtual characters within the MPEG-4 standard, Proc. of the Int'l Conference on Image Processing, Rochester, NY, 2002.


    [69] Y. Votsis, N. Drosopoulos and S. Kollias, Facial feature segmentation: a modularly optimal approach on real sequences, Signal Processing: Image Communication, vol. 18, no. 1, pp. 67-89, 2003.

    [70] G. Tsechpenakis, N. Tsapatsoulis and S. Kollias, "Probabilistic Boundary-Based Contour Tracking with Snakes in Natural Cluttered Video Sequences", International Journal of Image and Graphics: Special Issue on Deformable Models for Image Analysis and Pattern Recognition, to appear, 2003.

    [71] N. Tsapatsoulis, A. Raouzaiou, S. Kollias, R. Cowie and E. Douglas-Cowie, Emotion Recognition and Synthesis based on MPEG-4 FAPs, in MPEG-4 Facial Animation, Igor Pandzic, R. Forchheimer (eds), John Wiley & Sons, UK, 2002.

    [72] K. Karpouzis, A. Raouzaiou, A. Drosopoulos, S. Ioannou, T. Balomenos, N. Tsapatsoulis and S. Kollias, Facial Expression and Gesture Analysis for Emotionally-rich Man-machine Interaction, N. Sarris, M. Strintzis (eds.), 3D Modeling and Animation: Synthesis and Analysis Techniques, Idea Group Publ., 2003.

    [73] D. Kollias, G. Marandianos, A. Raouzaiou and A. Stafylopatis, Interweaving Deep Learning and Semantic Techniques for Emotion Analysis in Human-Machine Interaction, SMAP 2015, Italy, November 2015.

    [74] Y. LeCun, M. Ranzato, Deep Learning Tutorial, ICML, Atlanta, 2013.