Learning From Data
Lecture 15
Reflecting on Our Path - Epilogue to Part I
What We Did
The Machine Learning Zoo
Moving Forward
M. Magdon-Ismail, CSCI 4100/6100
Recap: Three Learning Principles
Occam's razor: simpler is better; a good model is falsifiable.
[Figure: three scientists' fits of resistivity ρ versus temperature T, illustrating hypotheses that are not falsifiable versus falsifiable.]
Sampling bias: ensure that the training and test distributions are the same, or else acknowledge/account for the mismatch. You cannot sample from one bin and use your estimates for another bin.

Data snooping: you are charged for every choice influenced by D. Choose the learning process (usually H) before looking at D.
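Data snooping is easy to commit through innocuous preprocessing. A minimal sketch (made-up data, NumPy only): scaling the test points with statistics computed on the full data set quietly leaks test information into the learning process, while computing the statistics on the training set alone does not.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=200)

# Split BEFORE any data-dependent choice: the test set must stay untouched.
train, test = X[:150], X[150:]

# Snooped scaling: statistics computed on ALL of X leak test information.
snooped = (test - X.mean()) / X.std()

# Clean scaling: statistics computed on the training set only.
clean = (test - train.mean()) / train.std()

# The two versions of the "same" test set differ; that difference is
# exactly the information that leaked from the test set into the pipeline.
print(np.max(np.abs(snooped - clean)))
```

The leaked version makes test-error estimates optimistically biased, which is the "charge" the principle refers to.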
We know the price of choosing g from H.
[Diagram: hypotheses h ∈ H and the final g; the data D, through your choices, leads to g.]
© AML Creator: Malik Magdon-Ismail
Zen Moment
Our Plan
1. What is Learning?
Output g ≈ f after looking at data (xn, yn).
2. Can We do it?
Ein ≈ Eout: simple H, finite dVC, large N
Ein ≈ 0: good H, algorithms
3. How to do it?
Linear models, nonlinear transforms
Algorithms: PLA, pseudoinverse, gradient descent
4. How to do it well?
Overfitting: stochastic & deterministic noise
Cures: regularization, validation.
5. General principles?
Occam's razor, sampling bias, data snooping
6. Advanced techniques.
7. Other Learning Paradigms.
(concepts, theory, practice)
Learning From Data: It’s A Jungle Out There
overfitting, stochastic noise, K-means, stochastic gradient descent, exploration, reinforcement, exploitation, augmented error, ill-posed, Gaussian processes, bootstrapping, Lloyd's algorithm, deterministic noise, distribution-free learning, data snooping, Q-learning, unlabelled data, expectation-maximization, logistic regression, Rademacher complexity, linear regression, CART, bagging, Bayesian, VC dimension, transfer learning, learning curve, GANs, sampling bias, neural networks, Markov Chain Monte Carlo (MCMC), nonlinear transformation, Mercer's theorem, support vectors, Gibbs sampling, decision trees, AdaBoost, SVM, graphical models, bioinformatics, linear models, ordinal regression, training versus testing, no free lunch, extrapolation, deep learning, cross validation, HMMs, bias-variance tradeoff, PAC-learning, biometrics, error measures, MDL, multiclass, one versus all, active learning, types of learning, random forests, unsupervised, weak learning, online learning, RBF, is learning feasible?, data contamination, perceptron learning, noisy targets, ranking, momentum, Occam's razor, conjugate gradients, Levenberg-Marquardt, RKHS, kernel methods, mixture of experts, boosting, ensemble methods, AIC, permutation complexity, multi-agent systems, classification, primal-dual, PCA, LLE, kernel-PCA, collaborative filtering, semi-supervised learning, clustering, regularization, weight decay, Big Data, Boltzmann machine
Navigating the Jungle: Theory
THEORY
VC-analysis
bias-variance
complexity
Bayesian
Rademacher
SRM...
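The VC line of analysis gives a concrete error bar. A small sketch, using one common form of the bound, Eout <= Ein + sqrt((8/N) ln(4 m_H(2N)/delta)), together with the polynomial growth bound m_H(N) <= N^dvc + 1; the numbers (dvc = 3 for the 2-d perceptron, delta = 0.05) are illustrative.

```python
import numpy as np

def vc_bound(N, dvc, delta=0.05):
    """Generalization error bar from the VC bound:
    Eout <= Ein + sqrt((8/N) * ln(4 * m_H(2N) / delta)),
    using the polynomial growth bound m_H(N) <= N**dvc + 1."""
    growth = (2.0 * N) ** dvc + 1.0
    return np.sqrt(8.0 / N * np.log(4.0 * growth / delta))

# The 2-d perceptron has dvc = 3; the bar shrinks (slowly) as N grows.
for N in (100, 1000, 10000, 100000):
    print(N, vc_bound(N, dvc=3))
```

The bound is loose in absolute terms, but its qualitative message, more data and smaller dvc mean better generalization, is what the theory column above delivers.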
Navigating the Jungle: Techniques
TECHNIQUES: Models, Methods
Navigating the Jungle: Models
Models
linear
neural networks
SVM
similarity
Gaussian processes
graphical models
bilinear/SVD...
Navigating the Jungle: Methods
Methods
regularization
validation
aggregation
preprocessing...
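As a concrete instance of the regularization method, weight decay for a linear model in a transformed space Z has the closed form w_reg = (Z'Z + lambda*I)^(-1) Z'y. A minimal sketch on made-up data (the sine target, noise level, and polynomial degree are assumptions for the demo):

```python
import numpy as np

rng = np.random.default_rng(2)

# Noisy samples of a target on [-1, 1], fit with a degree-10 polynomial.
N = 15
x = rng.uniform(-1, 1, size=N)
y = np.sin(np.pi * x) + 0.2 * rng.normal(size=N)
Z = np.vander(x, 11, increasing=True)  # nonlinear transform to polynomial features

def weight_decay_fit(Z, y, lam):
    """Regularized least squares: w = (Z'Z + lam*I)^(-1) Z'y."""
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)

w_free = weight_decay_fit(Z, y, 0.0)
w_reg = weight_decay_fit(Z, y, 1.0)

# Weight decay shrinks the weights, taming the overfit polynomial.
print(np.linalg.norm(w_free), np.linalg.norm(w_reg))
```

With lambda = 0 the fit chases the noise and the weights blow up; a modest lambda pulls them back, trading a little Ein for better Eout.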
Navigating the Jungle: Paradigms
PARADIGMS
supervised
unsupervised
reinforcement
active
online
unlabeled
transfer learning
big data...
Moving Forward
1. What is Learning?
Output g ≈ f after looking at data (xn, yn).
2. Can We do it?
Ein ≈ Eout: simple H, finite dVC, large N
Ein ≈ 0: good H, algorithms
3. How to do it?
Linear models, nonlinear transforms
Algorithms: PLA, pseudoinverse, gradient descent
4. How to do it well?
Overfitting: stochastic & deterministic noise
Cures: regularization, validation.
5. General principles?
Occam's razor, sampling bias, data snooping
6. Advanced techniques.
Similarity, neural networks, SVMs, preprocessing & aggregation
7. Other Learning Paradigms.
Unsupervised, reinforcement
(concepts, theory, practice)
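Validation, the other cure for overfitting in item 4, can be sketched as a holdout procedure for choosing the weight-decay parameter lambda: fit on a training split, estimate Eout on a held-out split, and keep the lambda with the smallest validation error. The data, lambda grid, and split sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Noisy target, polynomial model; choose weight decay lambda by validation.
x = rng.uniform(-1, 1, size=40)
y = np.sin(np.pi * x) + 0.2 * rng.normal(size=x.size)
Z = np.vander(x, 8, increasing=True)

# Holdout split: fit on the training part, estimate Eout on the rest.
Zt, yt = Z[:30], y[:30]
Zv, yv = Z[30:], y[30:]

def fit(Z, y, lam):
    """Weight-decay regularized least squares."""
    return np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)

def mse(Z, y, w):
    return np.mean((Z @ w - y) ** 2)

# Pick the lambda with the smallest validation error.
lams = [0.0, 0.01, 0.1, 1.0, 10.0]
best = min(lams, key=lambda lam: mse(Zv, yv, fit(Zt, yt, lam)))
print(best, mse(Zv, yv, fit(Zt, yt, best)))
```

Note the snooping caveat from the principles above: once the validation set has been used to pick lambda, its error estimate is slightly optimistic; a fresh test set gives the unbiased number.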