

Learning to Combine Mid-level Cues for Object Proposal Generation
Tom Lee, Sanja Fidler, Sven Dickinson

Motivation

Ø Object proposals reduce an exhaustive set of hypotheses to a few plausible candidate segments.
Ø Object proposals are often predictions from parametric energy functions (CPMC [2], etc.).
Ø Parametric energy functions can encode relevant bottom-up grouping cues [4].
Ø But no previous approach exists for learning to predict multiple outputs with parametric energy functions.

Contributions

Ø A novel Parametric Min-Loss (PML) structured learning framework for parametric energy functions.
Ø PML learns to predict multiple outputs using a novel loss function.
Ø PML bridges the gap between learning and inference for parametric energy functions.
Ø PML is applicable to any domain that uses parametric energy functions.

Results

[Figure] Segment Overlap by #Proposals, VOC'12 Val: average best overlap vs. number of proposals for τ = 1 (S-SVM), τ = 10, τ = 20, τ = 30, τ = 40, and τ = 50.
[Figure] Segment Overlap by #Proposals, VOC'12 Val: average best overlap vs. number of proposals for ours, CPMC, and Selective Search.
[Figure] Segment Overlap by #Proposals, COCO'14 Val: average best overlap vs. number of proposals for ours, Multicue, Superpixel Closure, and MCG.

Ø We achieve results comparable with CPMC [2] and MCG [1].
Ø We outperform methods that lack learning, e.g. Selective Search [5].

Diversification

Location- and color-based diversification
Ø Bias energy to different locations
Ø Maximum superpixel distance
Ø Bias energy to different foreground-background color pairs
Ø Gaussian mixture model of superpixel colors

Postprocessing
Ø Discard non-maximum proposals among proposals with high overlap.
Ø Train an SVM on deep features to assign an objectness score to each proposal (see the sketch below).
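A minimal sketch of this postprocessing stage, assuming proposals are boolean masks with precomputed deep features. The helper names, the LinearSVC choice, and the 0.95 IoU threshold are illustrative assumptions, not values from the poster.

```python
import numpy as np
from sklearn.svm import LinearSVC

def iou(a, b):
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0

def train_objectness_scorer(train_features, train_labels):
    """Fit the objectness SVM on deep features of training proposals."""
    return LinearSVC(C=1.0).fit(train_features, train_labels)

def postprocess(masks, features, scorer, iou_thresh=0.95):
    """Score each proposal, then greedily discard any proposal that
    overlaps an already-kept, higher-scoring proposal by >= iou_thresh."""
    scores = scorer.decision_function(features)  # objectness per proposal
    order = np.argsort(-scores)                  # highest-scoring first
    keep = []
    for i in order:
        if all(iou(masks[i], masks[j]) < iou_thresh for j in keep):
            keep.append(i)
    return [masks[i] for i in keep]
```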

Parametric energy function

Ø The appearance cue discourages division of similar colors and textures.
Ø The closure cue discourages gaps along boundaries.
Ø The symmetry cue discourages division of symmetric parts.
Ø The energy is normalized by area by a factor λ (sketched below):

$E_\lambda(x, y) = E_{\mathrm{app}}(x, y) + E_{\mathrm{clo}}(x, y) + E_{\mathrm{sym}}(x, y) + E^{\lambda}_{\mathrm{scale}}(x, y)$
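A minimal sketch of the parametric energy family, assuming each cue has already been reduced to a scalar penalty for a candidate segmentation. The helper names and the form $E^{\lambda}_{\mathrm{scale}} = \lambda \cdot e_{\mathrm{scale}}$ are assumptions; the poster does not spell out the individual cue formulas.

```python
def parametric_energy(e_app, e_clo, e_sym, e_scale, lam):
    """E_lambda(x, y) = E_app + E_clo + E_sym + E_scale^lambda,
    where the scale term carries the area-normalizing coefficient
    lam in [-1, 0] (assumed multiplicative here)."""
    assert -1.0 <= lam <= 0.0
    return e_app + e_clo + e_sym + lam * e_scale
```

Sweeping lam trades the grouping cues off against segment area, which is what yields multiple distinct minimizers from a single weight vector.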

Multiple-output prediction

Ø One prediction for a specific λ:

$\hat{y} = \arg\min_y E_\lambda(x, y, w)$, equivalently $\hat{y}(x, w) = \arg\max_y w^T \phi_\lambda(x, y)$

Ø A set of predictions over a range of λ (see the sweep sketch below):

$\hat{Y}(x, w) = \{\, \hat{y}_\lambda(x, w) : \lambda \in [-1, 0] \,\}$
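A minimal sketch of multiple-output prediction. True parametric maxflow recovers all breakpoints of λ in [-1, 0] exactly; here a fixed grid of λ values stands in for it, and `minimize_energy` is an assumed black-box solver (e.g. a graph-cut call), not an interface from the poster.

```python
import numpy as np

def predict_set(x, w, minimize_energy, num_lambdas=50):
    """Return Y_hat(x, w) = { y_hat_lambda : lambda in [-1, 0] },
    deduplicated, by sweeping lambda over a grid."""
    proposals = []
    for lam in np.linspace(-1.0, 0.0, num_lambdas):
        y_hat = minimize_energy(x, w, lam)  # argmin_y E_lambda(x, y, w)
        if not any(np.array_equal(y_hat, p) for p in proposals):
            proposals.append(y_hat)
    return proposals
```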

Parametric Min-Loss learning

Ø Evaluate multiple predicted segments against one correct ground-truth segment.
Ø The loss function ideally expresses a "min":

$L(\hat{Y}, y) = \min_{\hat{y} \in \hat{Y}} \ell(\hat{y}, y)$

Ø The inner loss function measures the error of a single predicted segment (a sketch of both losses follows):

$\ell(\hat{y}, y(g)) = \frac{1}{|g|} \sum_p |p| \cdot \begin{cases} v_p & \hat{y}_p = 0 \\ 1 - v_p & \hat{y}_p = 1 \end{cases}$
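A minimal sketch of both losses over superpixel-labeled proposals. It takes $v_p$ to be the fraction of superpixel $p$ covered by the ground-truth segment $g$; that reading, and the array-based interface, are assumptions (the poster only names $v_p$).

```python
import numpy as np

def inner_loss(y_hat, v, areas, g_area):
    """l(y_hat, y(g)): penalize missed foreground (v_p) where y_hat_p = 0
    and over-segmentation (1 - v_p) where y_hat_p = 1, area-weighted and
    normalized by the ground-truth area |g|."""
    per_superpixel = np.where(y_hat == 0, v, 1.0 - v)
    return float(np.sum(areas * per_superpixel) / g_area)

def set_loss(proposals, v, areas, g_area):
    """L(Y_hat, y): only the best proposal in the set is penalized."""
    return min(inner_loss(y_hat, v, areas, g_area) for y_hat in proposals)
```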

Learning minimizes tractable upper bounds on these losses:

Ø Upper bound for the inner loss function (hinge loss):

$h(w, \lambda) = \max_{\hat{y}} \; \ell(\hat{y}, y) + w^T \phi_\lambda(x, \hat{y}) - w^T \phi_\lambda(x, y)$

Ø Upper bound for the loss function (min-hinge loss [3]):

$H(w) = \min_{\lambda \in [-1, 0]} h(w, \lambda)$

Ø Regularized training objective:

$\min_w \; \frac{1}{2} \|w\|^2 + \frac{C}{N} \sum_{n=1}^{N} \min_{\lambda_n \in [-1, 0]} h_n(w, \lambda_n)$

Ø Nonnegative weights and nonnegative λ coefficients guarantee a small set of solutions from parametric maxflow.

Block-coordinate descent

Ø w-block (S-SVM): with each $\lambda_n$ fixed, update

$w = \arg\min_w \; \frac{1}{2} \|w\|^2 + \frac{C}{N} \sum_{n=1}^{N} h_n(w, \lambda_n)$

Ø λ-block (loss-augmented parametric energy minimization): with $w$ fixed, update

$\lambda_n = \arg\min_{\lambda \in [-1, 0]} h_n(w, \lambda)$

evaluated by solving $\min_{\hat{y}} \; -\ell(\hat{y}, y) - w^T \phi_\lambda(x, \hat{y})$ for all $\lambda \in [-1, 0]$ with parametric maxflow (decreasing λ). A sketch of the alternation follows.
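A minimal sketch of the block-coordinate descent above. `ssvm_solve` (the w-block) and `loss_augmented_parametric_min` (the λ-block, i.e. the parametric-maxflow sweep returning the minimizing λ in [-1, 0] for one example) are assumed black boxes; only the alternation is shown, and the initialization and round count are illustrative.

```python
def train_pml(examples, ssvm_solve, loss_augmented_parametric_min,
              num_rounds=10):
    """Alternate the w-block and lambda-block until a fixed round budget."""
    # Initialize each example's lambda_n anywhere in [-1, 0].
    lambdas = [-0.5] * len(examples)
    w = None
    for _ in range(num_rounds):
        # w-block: with lambda_n fixed, the objective
        #   min_w 0.5*||w||^2 + (C/N) * sum_n h_n(w, lambda_n)
        # is a standard structured SVM problem.
        w = ssvm_solve(examples, lambdas)
        # lambda-block: with w fixed, pick for each example the lambda
        # minimizing its hinge loss h_n(w, lambda); one parametric-maxflow
        # sweep per example evaluates all lambda in [-1, 0].
        lambdas = [loss_augmented_parametric_min(x, y, w)
                   for (x, y) in examples]
    return w, lambdas
```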


References

[1] Arbelaez et al., CVPR 2014.
[2] Carreira & Sminchisescu, PAMI 2012.
[3] Guzman-Rivera et al., NIPS 2012.
[4] Lee et al., ACCV 2014.
[5] Uijlings et al., IJCV 2013.
