Supervised learning: Mixture Of Experts (MOE) Network.

Page 1: Supervised learning: Mixture Of Experts (MOE) Network.

Supervised learning: Mixture Of Experts (MOE) Network

Page 2: Supervised learning: Mixture Of Experts (MOE) Network.

MOE Module

[Figure: an input x feeds K local experts and a gating network; each local expert outputs P(y | x, Θj), and the gating network outputs the weights aj(x) used to combine them.]

Page 3: Supervised learning: Mixture Of Experts (MOE) Network.

For a given input x, the posterior probability of generating class y given x using K experts can be computed as

P(y | x, Φ) = Σj P(y | x, Θj) aj(x)

The objective is to estimate the model parameters so as to attain the highest probability of the training set given the estimated parameters.
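As a concrete illustration of this combination rule, here is a minimal NumPy sketch. The expert posteriors and gating weights are assumed to be given (in practice they come from the expert and gating networks); the numbers are hypothetical.

import numpy as np

def moe_posterior(expert_posteriors, gate_weights):
    # expert_posteriors: (K, C) array, row j holds P(y | x, Theta_j) over C classes
    # gate_weights:      (K,)  array, a_j(x), non-negative and summing to 1
    # Returns P(y | x, Phi) = sum_j a_j(x) * P(y | x, Theta_j)
    return gate_weights @ expert_posteriors

# Hypothetical example with K = 3 experts and C = 2 classes
experts = np.array([[0.9, 0.1],
                    [0.4, 0.6],
                    [0.2, 0.8]])
gates = np.array([0.5, 0.3, 0.2])
print(moe_posterior(experts, gates))   # -> [0.61, 0.39]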

Page 4: Supervised learning: Mixture Of Experts (MOE) Network.

Each RBF Gaussian kernel can be viewed as a local expert.

[Figure: an RBF network viewed as an MOE, with the Gaussian kernel units as local experts, a gating net weighting their class outputs, and a MAXNET stage selecting the winner.]
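A minimal sketch of this view, assuming isotropic Gaussian kernels; the centers and the width sigma are hypothetical placeholders, and each kernel's (normalized) response plays the role of a local expert's activation.

import numpy as np

def gaussian_kernel_responses(x, centers, sigma=1.0):
    # Each row of `centers` is the center mu_k of one RBF unit / local expert.
    # Response of expert k: exp(-||x - mu_k||^2 / (2 * sigma^2))
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Hypothetical 2-D example with three kernels
centers = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
x = np.array([0.9, 1.1])
resp = gaussian_kernel_responses(x, centers)
print(resp / resp.sum())   # normalized responses, usable as gating-style weights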

Page 5: Supervised learning: Mixture Of Experts (MOE) Network.

MOE Classifier

[Figure: each expert k supplies P(ωc | x, Ek) and the gating network supplies P(Ek | x); the combined posterior Σk P(Ek | x) P(ωc | x, Ek) is passed to a MAXNET, which outputs the winning class ωwinner.]
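The figure's combination rule can be sketched as follows, assuming the per-expert class posteriors P(ωc | x, Ek) and the gating probabilities P(Ek | x) are available; the MAXNET step is then just an argmax over the combined posterior. All numbers below are made up.

import numpy as np

def moe_classify(gate_probs, expert_class_posteriors):
    # gate_probs:              (K,)   P(E_k | x)
    # expert_class_posteriors: (K, C) P(omega_c | x, E_k)
    combined = gate_probs @ expert_class_posteriors   # P(omega_c | x)
    winner = int(np.argmax(combined))                 # MAXNET: pick omega_winner
    return combined, winner

gate_probs = np.array([0.6, 0.4])
posts = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
combined, winner = moe_classify(gate_probs, posts)
print(combined, winner)   # combined posterior [0.46, 0.24, 0.30] and winner index 0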

Page 6: Supervised learning: Mixture Of Experts (MOE) Network.

Given a pattern, each expert network estimates the pattern's a posteriori class probability on its (adaptively tuned or pre-assigned) feature space. Each local expert network performs multi-way classification over K classes by using either K independent binomial models, each modeling only one class, or one multinomial model for all classes.
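A rough sketch of the two output-model choices just mentioned, in plain NumPy: independent binomial (sigmoid) outputs, one per class, versus a single multinomial (softmax) output over all classes. The pre-activation values are hypothetical.

import numpy as np

def binomial_outputs(z):
    # Independent sigmoid units, one per class: each models its class on its own
    return 1.0 / (1.0 + np.exp(-z))

def multinomial_outputs(z):
    # One softmax over all classes: the outputs form a single distribution
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, -1.0, 0.5])   # hypothetical pre-activations for 3 classes
print(binomial_outputs(z))       # independent per-class probabilities
print(multinomial_outputs(z))    # probabilities summing to 1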

Mixture of Experts

The MOE [Jacobs91] exhibits an explicit relationship with statistical pattern classification methods as well as a close resemblance to fuzzy inference systems.

Page 7: Supervised learning: Mixture Of Experts (MOE) Network.

Two Components of MOE

• local experts: each produces a local recommendation, P(y | x, Θj), on its own feature space.

• gating network: computes the weights aj(x) used to combine the experts' recommendations.

Page 8: Supervised learning: Mixture Of Experts (MOE) Network.

Local Experts

•The design of modular neural networks hinges upon the choice of local experts.

•Usually, a local expert is adaptively trained to extract a certain local feature particularly relevant to its local decision.

•Sometimes, a local expert can be assigned a predetermined feature space (see the sketch below).

•Based on the local feature, a local expert gives its local recommendation.
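As a sketch of the "pre-assigned feature space" idea, the toy local expert below only sees a chosen subset of the input features and returns a local recommendation from them. The feature indices and the tiny linear model are hypothetical.

import numpy as np

class LocalExpert:
    # A toy local expert that works on a pre-assigned subset of the input features
    def __init__(self, feature_idx, weights, bias):
        self.feature_idx = feature_idx   # which input features this expert looks at
        self.weights = weights
        self.bias = bias

    def recommend(self, x):
        # Local recommendation: probability of the positive class from the local features
        z = self.weights @ x[self.feature_idx] + self.bias
        return 1.0 / (1.0 + np.exp(-z))

expert = LocalExpert(feature_idx=[0, 2], weights=np.array([1.5, -0.7]), bias=0.1)
print(expert.recommend(np.array([0.4, 9.9, 1.2])))   # only features 0 and 2 are used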

Page 9: Supervised learning: Mixture Of Experts (MOE) Network.

LBF vs. RBF Local Experts

MLP (LBF): hyperplane
RBF: kernel function
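The contrast can be made concrete with the two basis functions themselves. This is a minimal sketch; the weights, center, and width are made-up numbers.

import numpy as np

def lbf_activation(x, w, b):
    # LBF / MLP-style unit: hyperplane decision, depends on which side of w.x + b = 0 x lies
    return w @ x + b

def rbf_activation(x, center, sigma):
    # RBF-style unit: kernel decision, depends on the distance of x from the unit's center
    return np.exp(-np.sum((x - center) ** 2) / (2.0 * sigma ** 2))

x = np.array([1.0, 2.0])
print(lbf_activation(x, w=np.array([0.5, -1.0]), b=0.3))          # signed, unbounded
print(rbf_activation(x, center=np.array([1.0, 1.5]), sigma=1.0))  # in (0, 1], local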

Page 10: Supervised learning: Mixture Of Experts (MOE) Network.

Mixture of Experts

[Figure: a two-class example (Class 1 and Class 2) used to illustrate the mixture of experts.]

Page 11: Supervised learning: Mixture Of Experts (MOE) Network.

Mixture of Experts

[Figure: the two-class example partitioned between Expert #1 and Expert #2.]

Page 12: Supervised learning: Mixture Of Experts (MOE) Network.

Gating Network

•The gating network computes the proper weights to be used for the final weighted decision.

•A probabilistic rule is used to integrate the recommendations from the local experts, taking into account the experts' confidence levels (see the sketch below).
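A common way to realize such a gating network is a softmax over gating scores, which yields non-negative weights that sum to one. The sketch below assumes a simple linear gating score; that choice, and the parameter values, are assumptions for illustration, not the slides' prescription.

import numpy as np

def gating_weights(x, V, c):
    # Gating scores s_k = v_k . x + c_k, turned into weights a_k(x) by a softmax
    s = V @ x + c
    e = np.exp(s - np.max(s))
    return e / e.sum()   # non-negative, sums to 1: usable as P(E_k | x)

V = np.array([[ 0.2, -0.1],
              [-0.3,  0.4],
              [ 0.1,  0.1]])   # hypothetical gating parameters for K = 3 experts
c = np.zeros(3)
print(gating_weights(np.array([1.0, 2.0]), V, c))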

Page 13: Supervised learning: Mixture Of Experts (MOE) Network.

Both the local experts and the gating network (i.e., the experts' confidence levels) of the MOE network are trained with the expectation-maximization (EM) algorithm.
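A compressed sketch of one EM-style pass, under simplifying assumptions: it shows only the E-step responsibilities and, for an input-independent gate, the corresponding M-step re-estimation of the mixing proportions. With an input-dependent gating network, the M-step would instead refit the gate (and the experts) to these responsibilities; all numbers here are hypothetical.

import numpy as np

def e_step(gate_w, expert_lik):
    # gate_w:     (N, K) gating weights a_k(x_n) for each sample
    # expert_lik: (N, K) expert likelihoods P(y_n | x_n, Theta_k)
    joint = gate_w * expert_lik
    return joint / joint.sum(axis=1, keepdims=True)   # responsibilities h_nk

def m_step_gate_prior(h):
    # Simplest M-step, valid for an input-independent gate:
    # the new mixing proportion of expert k is its average responsibility.
    return h.mean(axis=0)

# Hypothetical numbers: N = 4 samples, K = 2 experts
gate_w = np.full((4, 2), 0.5)
expert_lik = np.array([[0.9, 0.2],
                       [0.8, 0.3],
                       [0.1, 0.7],
                       [0.2, 0.6]])
h = e_step(gate_w, expert_lik)
print(h)                      # each row sums to 1
print(m_step_gate_prior(h))   # updated mixing proportions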