Transcript of "Learning to Segment with Diverse Data", M. Pawan Kumar, Stanford University.
- Slide 1
- Learning to Segment with Diverse Data. M. Pawan Kumar, Stanford
University.
- Slide 2
- Semantic Segmentation car road grass tree sky
- Slide 3
- Segmentation Models (classes: car, road, grass, tree, sky). A model
with parameters w defines a distribution over images x and
segmentations y, P(x, y; w) ∝ exp(−E(x, y; w)). Learn accurate
parameters w; predict y* = argmax_y P(x, y; w) = argmin_y E(x, y; w).
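The equivalence on this slide, that maximizing P(x, y; w) is the same as minimizing E(x, y; w), can be checked numerically. The candidate labelings and energy values below are made up for illustration.

```python
import math

# Toy check that argmax_y P(x, y; w) = argmin_y E(x, y; w) when
# P(x, y; w) ∝ exp(-E(x, y; w)). Labels and energies are made up.
energies = {"car": 2.0, "road": 0.5, "grass": 3.0}

scores = {y: math.exp(-e) for y, e in energies.items()}
Z = sum(scores.values())                      # normalisation constant
probs = {y: s / Z for y, s in scores.items()}

y_map = max(probs, key=probs.get)             # argmax_y P(x, y; w)
y_min = min(energies, key=energies.get)       # argmin_y E(x, y; w)
print(y_map, y_min)  # both are "road"
```

The normalisation constant Z cancels out of the argmax, which is exactly why prediction can work purely with energies.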
- Slide 4
- Fully Supervised Data
- Slide 5
- Fully Supervised Data Specific foreground classes, generic
background class PASCAL VOC Segmentation Datasets
- Slide 6
- Fully Supervised Data Specific background classes, generic
foreground class Stanford Background Datasets
- Slide 7
- Supervised Learning: generic classes, burdensome annotation. References:
J. Gonfaus et al. Harmony Potentials for Joint Classification and Segmentation. CVPR, 2010.
S. Gould et al. Multi-Class Segmentation with Relative Location Prior. IJCV, 2008.
S. Gould et al. Decomposing a Scene into Geometric and Semantically Consistent Regions. ICCV, 2009.
X. He et al. Multiscale Conditional Random Fields for Image Labeling. CVPR, 2004.
S. Konishi et al. Statistical Cues for Domain Specific Image Segmentation with Performance Analysis. CVPR, 2000.
L. Ladicky et al. Associative Hierarchical CRFs for Object Class Image Segmentation. ICCV, 2009.
F. Li et al. Object Recognition as Ranking Holistic Figure-Ground Hypotheses. CVPR, 2010.
J. Shotton et al. TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation. ECCV, 2006.
J. Verbeek et al. Scene Segmentation with Conditional Random Fields Learned from Partially Labeled Images. NIPS, 2007.
Y. Yang et al. Layered Object Detection for Multi-Class Segmentation. CVPR, 2010.
- Slide 8
- Weakly Supervised Data: bounding boxes for objects. PASCAL VOC
detection datasets, thousands of images.
- Slide 9
- Weakly Supervised Data: image-level labels ("Car"). ImageNet,
Caltech; thousands of images.
- Slide 10
- Weakly Supervised Learning: binary segmentation, limited data. References:
B. Alexe et al. ClassCut for Unsupervised Class Segmentation. ECCV, 2010.
H. Arora et al. Unsupervised Segmentation of Objects Using Efficient Learning. CVPR, 2007.
L. Cao et al. Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes. ICCV, 2007.
J. Winn et al. LOCUS: Learning Object Classes with Unsupervised Segmentation. ICCV, 2005.
- Slide 11
- Diverse Data Car
- Slide 12
- Diverse Data Learning: avoid generic classes; take advantage of the
cleanliness of supervised data and the vast availability of weakly
supervised data.
- Slide 13
- Outline Model Energy Minimization Parameter Learning Results
Future Work
- Slide 14
- Region-Based Model (Gould, Fulton and Koller, ICCV 2009). Pixels are
grouped into regions. Unary potential: θ_r(i) = w_i^T Φ_r(x), where
Φ_r(x) are features extracted from region r of image x. For example,
with Φ_r(x) = average [R G B], w_water = [0 0 -10] and
w_grass = [0 -10 0]. Pairwise potential: θ_rr'(i,j) = w_ij^T Φ_rr'(x).
For example, with Φ_rr'(x) = constant > 0, w_"car above ground" > 0.
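The slide's toy unary potentials can be sketched directly as dot products. Only w_water and w_grass come from the slide; the bluish region feature below is an illustrative value.

```python
import numpy as np

# Unary potential θ_r(i) = w_i^T Φ_r(x) with Φ_r(x) = average [R, G, B].
# w_water and w_grass are the slide's toy weights; the region colour is
# an assumed bluish average, channels in [0, 1].
w = {"water": np.array([0.0, 0.0, -10.0]),   # low energy for blue regions
     "grass": np.array([0.0, -10.0, 0.0])}   # low energy for green regions

phi_blue = np.array([0.1, 0.2, 0.9])         # Φ_r(x) for a bluish region

# Lower energy is better, so the best label minimises w_i^T Φ_r(x).
scores = {label: float(wi @ phi_blue) for label, wi in w.items()}
best = min(scores, key=scores.get)
print(best, scores[best])  # water wins with energy -9.0
```

A green region (large G channel) would flip the decision to "grass", which is the behaviour the slide's weight vectors encode.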
- Slide 15
- Region-based Model. E(x, y) ∝ −log P(x, y) = unaries + pairwise,
i.e. E(x, y) = w^T Φ(x, y). Two questions: what is the best
segmentation y of an image x, and what is an accurate w?
- Slide 16
- Outline Model Energy Minimization Parameter Learning Results
Future Work Kumar and Koller, CVPR 2010
- Slide 17
- Four families of energy-minimization methods.
Move-Making:
Besag. On the Statistical Analysis of Dirty Pictures. JRSS, 1986.
Boykov et al. Fast Approximate Energy Minimization via Graph Cuts. PAMI, 2001.
Komodakis et al. Fast, Approximately Optimal Solutions for Single and Dynamic MRFs. CVPR, 2007.
Lempitsky et al. Fusion Moves for Markov Random Field Optimization. PAMI, 2010.
Message-Passing:
T. Minka. Expectation Propagation for Approximate Bayesian Inference. UAI, 2001.
Murphy. Loopy Belief Propagation: An Empirical Study. UAI, 1999.
J. Winn et al. Variational Message Passing. JMLR, 2005.
J. Yedidia et al. Generalized Belief Propagation. NIPS, 2001.
Convex Relaxations:
Chekuri et al. Approximation Algorithms for Metric Labeling. SODA, 2001.
M. Goemans et al. Improved Approximate Algorithms for Maximum-Cut. JACM, 1995.
M. Muramatsu et al. A New SOCP Relaxation for Max-Cut. JORJ, 2003.
Ravikumar et al. QP Relaxations for Metric Labeling. ICML, 2006.
Hybrid Algorithms:
K. Alahari et al. Dynamic Hybrid Algorithms for MAP Inference. PAMI, 2010.
P. Kohli et al. On Partial Optimality in Multilabel MRFs. ICML, 2008.
C. Rother et al. Optimizing Binary MRFs via Extended Roof Duality. CVPR, 2007.
Which one is the best relaxation?
- Slide 18
- Convex Relaxations. In order of appearance: LP (1976), SOCP (2003),
QP (2006). One might expect the newer relaxations to be tighter, but
the LP relaxation is provably tighter than both the QP and SOCP
relaxations (Kumar, Kolmogorov and Torr, NIPS, 2007). Use the LP!
- Slide 19
- Energy Minimization: find regions, then find labels. For fixed
regions, the labels are found via the LP relaxation.
- Slide 20
- Energy Minimization. A good region has homogeneous appearance and
texture; a bad region is inhomogeneous. The space of regions is
super-exponential in the number of pixels, so can we prune it? Use
low-level segmentation to generate candidate regions.
- Slide 21
- Energy Minimization Spatial Bandwidth = 10 Mean-Shift
Segmentation
- Slide 22
- Energy Minimization Spatial Bandwidth = 20 Mean-Shift
Segmentation
- Slide 23
- Energy Minimization Spatial Bandwidth = 30 Mean-Shift
Segmentation
- Slide 24
- Energy Minimization Combine Multiple Segmentations Car
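The multi-bandwidth strategy on the preceding slides can be sketched with a tiny hand-rolled mean shift run at several bandwidths, pooling every resulting segment into a dictionary of candidate regions. The 1-D "image", bandwidths, and tolerances below are all illustrative, not the talk's actual pipeline.

```python
import numpy as np

def mean_shift_labels(feats, bandwidth, iters=30):
    """Tiny flat-kernel mean shift: every point climbs to the mean of
    the data points within `bandwidth` of its current mode."""
    modes = feats.astype(float).copy()
    for _ in range(iters):
        dists = np.linalg.norm(modes[:, None] - feats[None, :], axis=2)
        mask = dists < bandwidth
        modes = mask @ feats / mask.sum(axis=1, keepdims=True)
    # Points whose modes ended up close together share a segment label.
    labels = np.full(len(feats), -1)
    next_label = 0
    for i in range(len(feats)):
        if labels[i] == -1:
            close = np.linalg.norm(modes - modes[i], axis=1) < bandwidth / 2
            labels[close] = next_label
            next_label += 1
    return labels

# A 1-D "image": 8 pixels with an intensity step in the middle.
pos = np.arange(8.0)
intensity = np.array([0, 0, 0, 0, 10, 10, 10, 10.0])
feats = np.column_stack([pos, intensity])

# Pool segments from several bandwidths into one dictionary of regions.
region_dict = set()
for bandwidth in (2.0, 4.0):
    labels = mean_shift_labels(feats, bandwidth)
    for lbl in np.unique(labels):
        region_dict.add(frozenset(np.flatnonzero(labels == lbl)))

print(sorted(len(r) for r in region_dict))
```

The small bandwidth oversegments each half into two pieces while the large bandwidth keeps each half whole, so the pooled dictionary contains regions at both granularities, which is the point of combining multiple segmentations.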
- Slide 25
- Dictionary of Regions: select regions and assign classes. Variables
y_r(i) ∈ {0,1} for i = 0, 1, 2, ..., C (i = 0 means not selected).
Constraints: the selected regions cover the entire image, and no two
selected regions overlap. Objective:
min Σ θ_r(i) y_r(i) + Σ θ_rr'(i,j) y_r(i) y_r'(j),
over pixels and regions. Kumar and Koller, CVPR 2010. Efficient dual
decomposition: Komodakis and Paragios, CVPR, 2009.
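The selection problem on this slide can be illustrated by brute force on a toy dictionary; the talk solves it at scale via a relaxation, not enumeration. The regions, classes, and unary costs below are made up, and pairwise terms are omitted for brevity.

```python
from itertools import product

# Brute-force sketch of region selection: assign each candidate region
# either no class (unselected) or one class, so that the selected
# regions cover every pixel with no overlap, minimising total energy.
pixels = {0, 1, 2, 3}
regions = {"A": {0, 1}, "B": {2, 3}, "C": {0, 1, 2, 3}}
classes = ["car", "road"]
unary = {("A", "car"): 1.0, ("A", "road"): 4.0,
         ("B", "car"): 3.0, ("B", "road"): 0.5,
         ("C", "car"): 2.5, ("C", "road"): 5.0}

best_energy, best_choice = float("inf"), None
region_names = list(regions)
for choice in product([None] + classes, repeat=len(region_names)):
    selected = [(r, c) for r, c in zip(region_names, choice) if c]
    covered = [p for r, c in selected for p in regions[r]]
    # Exact cover: every pixel once, so overlaps are rejected too.
    if sorted(covered) != sorted(pixels):
        continue
    energy = sum(unary[r, c] for r, c in selected)
    if energy < best_energy:
        best_energy, best_choice = energy, selected

print(best_choice, best_energy)  # A as car + B as road, energy 1.5
```

Selecting the big region C alone costs 2.5, so the optimum splits the image into A ("car") and B ("road") at cost 1.5; the overlapping choice A+C is rejected by the exact-cover test.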
- Slide 26
- Comparison of energy and accuracy (image / Gould / ours), with
parameters learned using Gould, Fulton and Koller, ICCV 2009. The
improvement is statistically significant (paired t-test).
- Slide 27
- Outline Model Energy Minimization Parameter Learning Results
Future Work Kumar, Turki, Preston and Koller, In Submission
- Slide 28
- Supervised Learning. Given pairs (x_1, y_1), (x_2, y_2), ..., with
P(x, y) ∝ exp(−E(x, y)) = exp(−w^T Φ(x, y)), fit w so that each
ground-truth y_i is the mode of P(y | x_i). A well-studied problem
with efficient solutions.
- Slide 29
- Diverse Data Learning. Image x, annotation a, hidden segmentation h:
generic class annotation.
- Slide 30
- Diverse Data Learning. Image x, annotation a, hidden segmentation h:
bounding box annotation.
- Slide 31
- Diverse Data Learning. Image x, annotation a = "Cow", hidden
segmentation h: image-level annotation.
- Slide 32
- Learning with Missing Information.
Expectation Maximization (computationally inefficient):
A. Dempster et al. Maximum Likelihood from Incomplete Data via the EM Algorithm. JRSS, 1977.
M. Jamshadian et al. Acceleration of the EM Algorithm by Using Quasi-Newton Methods. JRSS, 1997.
R. Neal et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants. LGM, 1999.
R. Sundberg. Maximum Likelihood Theory for Incomplete Data from an Exponential Family. SJS, 1974.
Latent Support Vector Machine (hard EM; only requires an energy minimization algorithm):
P. Felzenszwalb et al. A Discriminatively Trained, Multiscale, Deformable Part Model. CVPR, 2008.
C.-N. Yu et al. Learning Structural SVMs with Latent Variables. ICML, 2009.
- Slide 33
- Latent SVM (Felzenszwalb et al., NIPS 2007; Yu et al., ICML 2008):
min_w ||w||^2 + Σ_i ξ_i
s.t. min_{h_i} w^T Φ(x_i, a_i, h_i) ≤ w^T Φ(x_i, a, h) − Δ(a_i, a, h) + ξ_i for all (a, h).
The left-hand side is the energy of the ground truth; the right-hand
side involves the energy of other labelings and the user-defined loss
Δ(a_i, a, h), the number of disagreements. The objective is a
difference of convex functions, so it can be optimized with CCCP.
- Slide 34
- CCCP (Felzenszwalb et al., NIPS 2007; Yu et al., ICML 2008). Start
with an initial estimate w_0, then repeat: (1) impute the hidden
variables by energy minimization, h_i = argmin_h w_t^T Φ(x_i, a_i, h);
(2) update w_{t+1} by solving the convex problem
min_w ||w||^2 + Σ_i ξ_i,
s.t. w^T Φ(x_i, a_i, h_i) ≤ w^T Φ(x_i, a, h) − Δ(a_i, a, h) + ξ_i for all (a, h).
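The CCCP alternation can be sketched on a toy model. The scalar energy, the finite candidate set for h, and the least-squares update below are illustrative stand-ins for the structural-SVM inner problem used in the talk.

```python
# Minimal CCCP-style alternation: (1) impute hidden variables by energy
# minimisation under the current w, (2) re-fit w treating the imputed h
# as ground truth. The toy model is E(x, a, h; w) = (w*x - (a + h))^2.
def energy(w, x, a, h):
    return (w * x - (a + h)) ** 2

data = [(1.0, 2.0), (2.0, 3.0)]      # (x_i, a_i) pairs (made up)
h_candidates = [-1.0, 0.0, 1.0]

w = 0.0                               # initial estimate w_0
for _ in range(20):
    # Step 1: impute h_i = argmin_h E(x_i, a_i, h; w_t).
    hs = [min(h_candidates, key=lambda h: energy(w, x, a, h))
          for x, a in data]
    # Step 2: convex update of w given the imputed h (least squares).
    num = sum(x * (a + h) for (x, a), h in zip(data, hs))
    den = sum(x * x for x, _ in data)
    w = num / den

print(round(w, 3))
```

On this toy data both examples impute h_i = -1 and the loop reaches a fixed point at w = 1; with a different initialisation the imputed h can differ, which is exactly the local-minimum sensitivity the next slides discuss.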
- Slide 35
- Generic Class Annotation. The hidden variables replace the generic
background with specific background classes, and the generic
foreground with specific foreground classes.
- Slide 36
- Bounding Box Annotation. Constraints on the hidden segmentation:
every row of the box contains the object, and every column of the box
contains the object.
- Slide 37
- Image Level Annotation. Constraint on the hidden segmentation: the
image contains the object ("Cow").
- Slide 38
- CCCP (Felzenszwalb et al., NIPS 2007; Yu et al., ICML 2008). Start
with an initial estimate w_0, then repeat: (1) impute the hidden
variables by energy minimization, h_i = argmin_h w_t^T Φ(x_i, a_i, h);
(2) update w_{t+1} by solving the convex problem
min_w ||w||^2 + Σ_i ξ_i,
s.t. w^T Φ(x_i, a_i, h_i) ≤ w^T Φ(x_i, a, h) − Δ(a_i, a, h) + ξ_i.
This can, however, converge to a bad local minimum!
- Slide 39
- EASY: white sky, grey road, green grass.
- Slide 40
- EASY: white sky, blue water, green grass.
- Slide 41
- HARD: cow? cat? horse?
- Slide 42
- HARD: red sky? black mountain? All images are not equal.
- Slide 43
- Real numbers, imaginary numbers, e^{iπ} + 1 = 0: "Math is for
losers!!"
- Slide 44
- Real numbers, imaginary numbers, e^{iπ} + 1 = 0: "Euler was a
genius!!" Self-Paced Learning.
- Slide 45
- Easy vs. Hard: what is easy for a human is not necessarily easy for
the machine. Simultaneously estimate easiness and parameters.
- Slide 46
- Self-Paced Learning (Kumar, Packer and Koller, NIPS 2010). Start
with an initial estimate w_0, then repeat: (1) impute
h_i = argmin_h w_t^T Φ(x_i, a_i, h); (2) update w_{t+1} by solving
min_{w,v} ||w||^2 + Σ_i v_i ξ_i − Σ_i v_i / K,
subject to the latent-SVM constraints
w^T Φ(x_i, a_i, h_i) ≤ w^T Φ(x_i, a, h) − Δ(a_i, a, h) + ξ_i.
Here v_i ∈ {0,1}, relaxed to v_i ∈ [0,1]: v_i = 1 for easy examples,
v_i = 0 for hard examples. This is a biconvex optimization, solved by
alternate convex search.
- Slide 47
- Self-Paced Learning (Kumar, Packer and Koller, NIPS 2010). Start
with an initial estimate w_0, then repeat: (1) impute
h_i = argmin_h w_t^T Φ(x_i, a_i, h); (2) update w_{t+1} by solving the
biconvex problem
min_{w,v} ||w||^2 + Σ_i v_i ξ_i − Σ_i v_i / K;
(3) decrease K, so that harder examples enter over time. As simple as
CCCP!
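The self-paced schedule can be sketched the same way: each round, only examples with loss below 1/K count as easy (v_i = 1) and drive the update, and K is annealed so harder examples join later. The data, the annealing factor mu, and the simple least-squares refit are all assumed for illustration.

```python
# Self-paced learning sketch: alternate between selecting easy examples
# (loss below 1/K) and refitting w on them only, then anneal K. The
# toy model fits a slope w to (x, a) pairs; the last pair is an outlier.
def loss(w, example):
    x, a = example
    return (w * x - a) ** 2

data = [(1.0, 1.0), (1.0, 1.1), (1.0, 5.0)]   # last example is an outlier
K, mu = 2.0, 1.3                               # mu: assumed annealing factor
w = 0.0
for _ in range(10):
    # Select easy examples: v_i = 1 iff loss_i < 1/K, else v_i = 0.
    easy = [ex for ex in data if loss(w, ex) < 1.0 / K]
    if easy:  # refit the slope by least squares on the easy set only
        w = sum(x * a for x, a in easy) / sum(x * x for x, _ in easy)
    K /= mu   # decrease K so harder examples enter later rounds
print(round(w, 2))
```

Early rounds select nothing, then the two clean examples enter and fix w near 1.05; within these ten rounds the outlier's loss never drops below 1/K, so it never corrupts the fit, which is the behaviour the slides are after.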
- Slide 48
- Self-Paced Learning results (Kumar, Packer and Koller, NIPS 2010).
Image classification (a = "Deer", h = object location): lower test
error. Motif finding (a = −1 or +1, h = motif position): lower test
error.
- Slide 49
- Learning to Segment CCCP SPL
- Slide 50
- Learning to Segment CCCP SPL Iteration 1
- Slide 51
- Learning to Segment CCCP SPL Iteration 3
- Slide 52
- Learning to Segment CCCP SPL Iteration 6
- Slide 53
- Learning to Segment CCCP SPL
- Slide 54
- Learning to Segment CCCP SPL Iteration 1
- Slide 55
- Learning to Segment CCCP SPL Iteration 2
- Slide 56
- Learning to Segment CCCP SPL Iteration 4
- Slide 57
- Outline Model Energy Minimization Parameter Learning Results
Future Work
- Slide 58
- Datasets. Stanford Background: 7 background classes plus a generic
foreground class. PASCAL VOC 2009+: 20 foreground classes plus a
generic background class.
- Slide 59
- Datasets. Stanford Background: train 572 images, validation 53,
test 90. PASCAL VOC 2009+: train 1274 images, validation 225,
test 750.
- Slide 60
- Baseline results for SBD (Gould, Fulton and Koller, ICCV 2009),
overlap score: Foreground 36.0%, Road 70.1%, Mountain 0%; CLL average
53.1%.
- Slide 61
- Improvement for SBD (SPL − CLL): Road 75.5% (+5.4), Foreground 39.1%
(+3.1); CLL average 53.1%, SPL average 54.3%.
- Slide 62
- Baseline results for VOC (Gould, Fulton and Koller, ICCV 2009),
overlap score: Aeroplane 32.1%, Bird 9.5%, TV 23.6%; CLL average
24.7%.
- Slide 63
- Improvement for VOC (SPL − CLL): Aeroplane 41.4% (+9.3), TV 31.3%
(+7.7); CLL average 24.7%, SPL average 26.9%.
- Slide 64
- Weakly Supervised Datasets. VOC Detection 2009+ (bounding-box data):
1564 training images. ImageNet (image-level data): 1000 training
images.
- Slide 65
- Improvement for SBD (All − Generic): Foreground 41.3% (+2.2), Water
60.1% (+5.0); Generic average 54.3%, All average 55.3%.
- Slide 66
- Improvement for VOC (All − Generic): Motorbike 40.4% (+6.9), Person
42.2% (+4.9); Generic average 26.9%, All average 28.8%.
- Slide 67
- Improvement over CCCP (SPL − CCCP). VOC average: CCCP 24.7%, SPL
28.8%. SBD average: CCCP 53.8%, SPL 55.3%. There is no improvement
with CCCP alone; SPL is essential!
- Slide 68
- Summary. Energy minimization for a region-based model via a tight LP
relaxation of an integer program. Self-paced learning: simultaneously
select examples and learn parameters. Even weak annotation is useful.
- Slide 69
- Outline Model Energy Minimization Parameter Learning Results
Future Work
- Slide 70
- Learning with Diverse Data: noise in labels; size of problem.
- Slide 71
- Learning Diverse Tasks Object Detection Action Recognition Pose
Estimation 3D Reconstruction
- Slide 72
- Acknowledgments: Daphne Koller, Stephen Gould, Ben Packer, Haithem
Turki, Dan Preston, Andrew Zisserman, Phil Torr, Vladimir Kolmogorov.
- Slide 73
- Summary. Energy minimization for a region-based model via a tight LP
relaxation of an integer program. Self-paced learning: simultaneously
select examples and learn parameters. Even weak annotation is useful.
Questions?