Coherent Scene Understanding with 3D Geometric Reasoning

Coherent Scene Understanding with 3D Geometric Reasoning Jiyan Pan 12/3/2012

Page 1: Coherent Scene Understanding with  3D Geometric Reasoning

Coherent Scene Understanding with 3D Geometric Reasoning

Jiyan Pan, 12/3/2012

Page 2: Coherent Scene Understanding with  3D Geometric Reasoning
Page 3: Coherent Scene Understanding with  3D Geometric Reasoning

Task

Detect objects

Identify surface regions

Estimate ground plane

Infer gravity direction

Geometrically coherent in the 3D world

3D geometric context

Page 4: Coherent Scene Understanding with  3D Geometric Reasoning

Coordinate system

[Figure: coordinate system. Symbols: O, x, y, z, x_b, d_b, x_t, d_t, γ, θ, α, n_v, n_p, h_p, n_g, H, f. Labels: camera center, focal length, image plane, ground plane, (inverse) gravity, ground plane orientation, ground plane height, object vertical orientation, real-world height, object depth, object pitch and roll angles, object landmarks.]

Deterministic relationships

Variables of global 3D geometries: n_g, n_p, h_p

Page 5: Coherent Scene Understanding with  3D Geometric Reasoning

Coordinate system

[Figure: the coordinate system diagram from Page 4, repeated]

Probabilistic relationships

Derived from appearance

Prior knowledge

Page 6: Coherent Scene Understanding with  3D Geometric Reasoning

Can we solve them all for a coherent solution?

• Non-linear
• Non-deterministic
• Even invalid equations from false detections

Page 7: Coherent Scene Understanding with  3D Geometric Reasoning

X

Global 3D context

Local 3D context

Page 8: Coherent Scene Understanding with  3D Geometric Reasoning

X

“Chicken and egg” problem:
• Local entities could be validated by global 3D context
• Global 3D context is induced from local entities

Global 3D context

Local 3D context

?

Page 9: Coherent Scene Understanding with  3D Geometric Reasoning

Possible solution (All in PGM)

• Put both global 3D geometries and local entities in a PGM [1]

– Precision issue: Have to quantize continuous variables
– Complexity issue: Pairwise potential would contain up to ~1e6 entries

[1] D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008

[Diagram: nodes Ground, Gravity, o_1, o_2, ..., o_k. Quantization: 100 (pitch) × 100 (roll) × 100 (height)]
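The ~1e6 figure follows directly from the quantization on this slide. A minimal arithmetic sketch, assuming a regular grid with the bin counts shown; the 8-byte entry size and the function names are illustrative assumptions, not part of the talk:

```python
# Bin counts are taken from the slide; the 8-byte float size is an assumption.
def quantized_states(pitch_bins=100, roll_bins=100, height_bins=100):
    """Number of joint states of a ground-plane variable on a regular grid."""
    return pitch_bins * roll_bins * height_bins


def table_megabytes(num_entries, bytes_per_entry=8):
    """Memory needed for one dense table with num_entries entries."""
    return num_entries * bytes_per_entry / 1e6


states = quantized_states()
print(states)                   # joint ground-plane states
print(table_megabytes(states))  # MB for one table entry per ground state
```

Each object node paired with the ground node needs a table of this size per object state, which is where the complexity concern comes from.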

Page 10: Coherent Scene Understanding with  3D Geometric Reasoning

Possible solution (Fixed global geometries as hypotheses)

• Task much easier under a fixed hypothesis of global 3D geometries

[Diagram: nodes Ground, Gravity, o_1, o_2, ..., o_k, with six × marks]

Page 11: Coherent Scene Understanding with  3D Geometric Reasoning

Possible solution (Fixed global geometries as hypotheses)

• Task much easier under a fixed hypothesis of global 3D geometries

[Diagram: object nodes o_1, o_2, ..., o_k and hypotheses ω_1, ω_2, ω_3]

How to generate global 3D geometry hypotheses?

Page 12: Coherent Scene Understanding with  3D Geometric Reasoning

Possible solution (Hypotheses by exhaustive search)

• Exhaustive search over the quantized space of global 3D geometries [2]

– Computational complexity tends to limit search space

[2] S. Bao et al. Toward coherent object detection and scene layout understanding. IVC, 2011

Page 13: Coherent Scene Understanding with  3D Geometric Reasoning

Possible solution (Hypotheses by Hough voting)

• Each local entity casts a vote in the Hough voting space of the global 3D geometries, and peaks are selected [3]

– False detections could corrupt the votes
– Would applying EM help? Not likely if false detections dominate

[3] M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010

[Figure: local entities L1 to L7]

Page 14: Coherent Scene Understanding with  3D Geometric Reasoning

Our solution

• We take a RANSAC-like approach: Randomly mix the contributions of local entities

[Figure: local entities L1 to L7]

Page 15: Coherent Scene Understanding with  3D Geometric Reasoning

Page 16: Coherent Scene Understanding with  3D Geometric Reasoning

Our solution

• We take a RANSAC-like approach: Randomly mix the contributions of local entities

– Compared to averaging over all local entities: More robust against outliers
– Compared to directly using estimates from each single local entity: More robust against noise

[Figure: local entities L1 to L7]
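The random mixing idea can be sketched in a few lines. This is an illustrative sketch only: the slides do not say how subsets or weights are drawn, so the subset size, the uniform random weights, and all function names here are assumptions. Each local entity is taken to supply one unit-vector estimate of a global direction such as gravity.

```python
import math
import random


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / n for c in v)


def angle_deg(a, b):
    """Angle in degrees between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(dot))


def random_mixture(estimates, rng, subset_size=3):
    """Mix a random subset of local estimates with random positive weights.

    Assumption: subset size and weight distribution are not given in the talk.
    """
    chosen = rng.sample(estimates, min(subset_size, len(estimates)))
    weights = [rng.random() + 1e-9 for _ in chosen]
    mixed = [sum(w * v[i] for w, v in zip(weights, chosen)) for i in range(3)]
    return normalize(mixed)


def generate_hypotheses(estimates, num_mixtures, seed=0):
    """Draw num_mixtures global-direction hypotheses from local estimates."""
    rng = random.Random(seed)
    return [random_mixture(estimates, rng) for _ in range(num_mixtures)]


def min_hypothesis_error(hypotheses, truth):
    """Smallest angular error among the hypotheses, in degrees."""
    return min(angle_deg(h, truth) for h in hypotheses)
```

A mixture drawn only from inliers averages out their noise, while a single outlier contaminates only the mixtures that happen to include it, which is the trade-off the two sub-bullets describe.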

Page 17: Coherent Scene Understanding with  3D Geometric Reasoning
Page 18: Coherent Scene Understanding with  3D Geometric Reasoning
Page 19: Coherent Scene Understanding with  3D Geometric Reasoning
Page 20: Coherent Scene Understanding with  3D Geometric Reasoning
Page 21: Coherent Scene Understanding with  3D Geometric Reasoning
Page 22: Coherent Scene Understanding with  3D Geometric Reasoning

[Plot: Gravity Direction. x-axis: number of random mixtures (0 to 50); y-axis: minimum hypothesis error (1.6 to 3); curves: Individual, Mixture, Average]

Page 23: Coherent Scene Understanding with  3D Geometric Reasoning

[Plot: Ground Plane Orientation. x-axis: number of random mixtures (0 to 50); y-axis: minimum hypothesis error (1.6 to 3.2); curves: Individual, Mixture, Average]

Page 24: Coherent Scene Understanding with  3D Geometric Reasoning

X

Local 3D context

Global 3D context

Page 25: Coherent Scene Understanding with  3D Geometric Reasoning

3D geometric context

#1: Common ground (global)

[Figure: ground plane and ground plane orientation; detections marked valid or invalid (#1)]

Page 26: Coherent Scene Understanding with  3D Geometric Reasoning

3D geometric context

#2: Gravity direction (global)

[Figure: ground plane, ground plane orientation, and (inverse) gravity; one detection marked invalid (#2)]

Page 27: Coherent Scene Understanding with  3D Geometric Reasoning

3D geometric context

#3: Depth ordering (local)

[Figure: ground plane, ground plane orientation, and (inverse) gravity; detections marked incompatible (#3)]

Page 28: Coherent Scene Understanding with  3D Geometric Reasoning

3D geometric context

#4: Space occupancy (local)

[Figure: ground plane, ground plane orientation, and (inverse) gravity; detections marked incompatible (#4)]

Page 29: Coherent Scene Understanding with  3D Geometric Reasoning

Page 30: Coherent Scene Understanding with  3D Geometric Reasoning

Global geometric compatibility for an object:

Orientation:

Given a global 3D geometry hypothesis

Page 31: Coherent Scene Understanding with  3D Geometric Reasoning

Global geometric compatibility for an object:

Orientation:

Height:

Given a global 3D geometry hypothesis
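The orientation and height formulas on this slide did not survive extraction. As a rough illustration of what such a check involves, the sketch below computes an object's real-world height H and its orientation error under a fixed hypothesis (n_p, h_p, n_g), using the symbols from the coordinate-system slide. The conventions are assumptions: a pinhole camera at O with focal length f in pixels, image coordinates measured from the principal point with y pointing down, n_p a unit normal pointing up from the ground, and h_p the camera height above the ground. This is standard ray-plane geometry, not necessarily the author's formulation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pixel_ray(x, f):
    """Viewing ray through image point x = (u, v) for focal length f."""
    return (x[0], x[1], f)


def ground_point(x_b, f, n_p, h_p):
    """3D point where the ray through the bottom landmark meets the ground.

    Assumed plane equation: n_p . X = -h_p, with the camera at the origin.
    """
    r = pixel_ray(x_b, f)
    denom = dot(n_p, r)
    if denom >= 0:
        raise ValueError("ray does not hit the ground in front of the camera")
    t = -h_p / denom
    return tuple(t * c for c in r)


def object_height(x_b, x_t, f, n_p, h_p, n_v):
    """Least-squares H such that X_b + H * n_v lies on the top-landmark ray."""
    X_b = ground_point(x_b, f, n_p, h_p)
    r_t = pixel_ray(x_t, f)
    a = cross(n_v, r_t)
    b = cross(X_b, r_t)
    return -dot(a, b) / dot(a, a)


def orientation_error_deg(n_v, n_g):
    """Angle between object vertical orientation and (inverse) gravity."""
    c = dot(n_v, n_g) / math.sqrt(dot(n_v, n_v) * dot(n_g, n_g))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))
```

For example, with a level camera 1.6 m above the ground (n_p = (0, -1, 0), h_p = 1.6), f = 800, a bottom landmark at (0, 160), and a top landmark at (0, -10), `object_height` gives about 1.7 m. A compatibility score could then compare H with a prior on the object class height and penalize a large orientation error; how the talk weights these terms is not recoverable from the transcript.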

Page 32: Coherent Scene Understanding with  3D Geometric Reasoning

Global geometric compatibility for a surface:

Orientation: local estimates vs. n_g or n_p

Location: horizontal surface region vs. ground horizon

Given a global 3D geometry hypothesis

Page 33: Coherent Scene Understanding with  3D Geometric Reasoning

Local geometric compatibility for two objects:

Depth ordering:

Space occupancy:

Given a global 3D geometry hypothesis

Page 34: Coherent Scene Understanding with  3D Geometric Reasoning

Objective function of the CRF:

Given a global 3D geometry hypothesis

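[Equation not recoverable from the transcript. Surviving fragments: object indicators o_i and o_j, scores s, a threshold of 0.5, a min term, and pairwise terms labeled ocp_ij and ocl_ij]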

Page 35: Coherent Scene Understanding with  3D Geometric Reasoning

X

Local 3D context

Global 3D context

Best hypothesis
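The selection of the best hypothesis can be sketched as a loop over hypotheses, each scored by the best labeling of local entities it admits. The sketch below is an assumption-laden stand-in: the actual CRF objective is not legible in the transcript, so a generic sum of unary and pairwise terms is used, and exhaustive enumeration replaces whatever inference the author uses. The names `crf_score`, `best_labeling`, `best_hypothesis`, and `potentials_for` are hypothetical.

```python
from itertools import product


def crf_score(labels, unary, pairwise):
    """Generic objective: unary terms for accepted entities plus
    pairwise terms for every accepted pair (i, j)."""
    score = sum(unary[i] for i, on in enumerate(labels) if on)
    score += sum(w for (i, j), w in pairwise.items()
                 if labels[i] and labels[j])
    return score


def best_labeling(unary, pairwise):
    """Exhaustive search over binary labelings (only viable for few entities)."""
    best = max(product((0, 1), repeat=len(unary)),
               key=lambda labels: crf_score(labels, unary, pairwise))
    return best, crf_score(best, unary, pairwise)


def best_hypothesis(hypotheses, potentials_for):
    """Score every global hypothesis and return (score, hypothesis, labels).

    potentials_for(h) must return (unary, pairwise) computed under h.
    """
    results = []
    for h in hypotheses:
        unary, pairwise = potentials_for(h)
        labels, score = best_labeling(unary, pairwise)
        results.append((score, h, labels))
    return max(results, key=lambda r: r[0])


# Toy potentials for two hypothetical hypotheses "A" and "B".
toy = {
    "A": ([0.4, -0.2, 0.3], {(0, 2): -1.0}),  # entities 0 and 2 conflict
    "B": ([0.4, 0.1, 0.3], {}),               # all entities compatible
}
score, hypothesis, labels = best_hypothesis(["A", "B"], lambda h: toy[h])
print(hypothesis, labels)
```

Under hypothesis "A" the negative pairwise term forces one of the conflicting entities to be rejected, so "B", which accepts all three, scores higher.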

Page 36: Coherent Scene Understanding with  3D Geometric Reasoning

3D reasoning agrees with raw detector

3D reasoning recovers detection rejected by raw detector

3D reasoning rejects detection accepted by raw detector

Page 37: Coherent Scene Understanding with  3D Geometric Reasoning

Page 38: Coherent Scene Understanding with  3D Geometric Reasoning

Page 39: Coherent Scene Understanding with  3D Geometric Reasoning

Page 40: Coherent Scene Understanding with  3D Geometric Reasoning

Page 41: Coherent Scene Understanding with  3D Geometric Reasoning

Page 42: Coherent Scene Understanding with  3D Geometric Reasoning

Page 43: Coherent Scene Understanding with  3D Geometric Reasoning

Page 44: Coherent Scene Understanding with  3D Geometric Reasoning

[Plot: Deformable Part Model Detector. x-axis: false positives per image (0 to 5); y-axis: true positive rate (0 to 0.7); curves: Baseline, Hoiem, Ours]

3D geometric reasoning improves object detection performance

D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008

Page 45: Coherent Scene Understanding with  3D Geometric Reasoning

[Plot: Dalal-Triggs Detector. x-axis: false positives per image (0 to 1.2); y-axis: true positive rate (0 to 0.8); curves: Baseline, Hoiem, Ours]

3D geometric reasoning improves object detection performance

D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008

Page 46: Coherent Scene Understanding with  3D Geometric Reasoning

Improvement in AP over baseline detector

Ours 10.4%

Hoiem 4.8%

Sun 5.1%

M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010

D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008

3D geometric reasoning improves object detection performance

Page 47: Coherent Scene Understanding with  3D Geometric Reasoning

Horizon estimation median error

Ours 2.05°

Hoiem 3.15°

Sun 2.41°

M. Sun et al. Object detection with geometrical context feedback loop. BMVC, 2010

D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008

Page 48: Coherent Scene Understanding with  3D Geometric Reasoning

X

Local 3D context

Global 3D context

Best hypothesis

Page 49: Coherent Scene Understanding with  3D Geometric Reasoning

Contributions of different geometric context

[Plot: Detection ROC Curve. x-axis: false positives per image (0 to 5); y-axis: true positive rate (0 to 0.7); curves: Det, Det+IdvlGeo, Det+PairGeo, Det+FullGeo]

Page 50: Coherent Scene Understanding with  3D Geometric Reasoning

Benefit is mutual

                          Error in gravity direction    Error in ground orientation
Vanishing points alone    2.62°                         4.85°
Whole system              2.05°                         2.21°

Page 51: Coherent Scene Understanding with  3D Geometric Reasoning

Extensions

– Improved depth ordering constraint
– Local geometric constraints involving vertical surfaces
– Multiple supporting planes
– Using more prior knowledge of objects
– Utilizing semantic categories of surface regions

Page 52: Coherent Scene Understanding with  3D Geometric Reasoning
Page 53: Coherent Scene Understanding with  3D Geometric Reasoning

[Figure: a closer object and a farther object with overlapping masks, shown in two cases (one marked X). Labels: occlusion mask of the farther object; intersection region of the two object masks. Question posed for each case: Fully cover?]
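One reading of this slide is that a depth ordering is accepted only if the occlusion mask of the farther object fully covers the region where the two object masks intersect. A minimal sketch of that test, assuming masks are equal-sized 2D lists of 0/1 values; the function names and the `min_coverage` relaxation are illustrative assumptions:

```python
def intersection(mask_a, mask_b):
    """Pixel-wise AND of two equal-sized binary masks."""
    return [[bool(a) and bool(b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


def coverage_ratio(occlusion_mask, region):
    """Fraction of region pixels that the occlusion mask also marks."""
    region_px = sum(1 for row in region for cell in row if cell)
    if region_px == 0:
        return 1.0  # nothing to cover, so the constraint is vacuous
    covered = sum(1 for row_o, row_r in zip(occlusion_mask, region)
                  for o, r in zip(row_o, row_r) if o and r)
    return covered / region_px


def depth_order_compatible(closer_mask, farther_mask,
                           farther_occlusion_mask, min_coverage=1.0):
    """True if the farther object's occlusion mask covers the overlap."""
    region = intersection(closer_mask, farther_mask)
    return coverage_ratio(farther_occlusion_mask, region) >= min_coverage
```

With `min_coverage=1.0` the test is the strict "fully cover" check; a lower value would tolerate noisy occlusion masks, at the cost of a weaker constraint.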

Page 54: Coherent Scene Understanding with  3D Geometric Reasoning

Occlusion: bottleneck in our system

– Missed detection
– Erroneous estimation of local properties
– Less effective depth ordering constraint

Page 55: Coherent Scene Understanding with  3D Geometric Reasoning

Generalized Hough voting: better at handling occlusions

K. Rematas et al. CORP 2011

B. Leibe et al. IJCV 2008

Page 56: Coherent Scene Understanding with  3D Geometric Reasoning
Page 57: Coherent Scene Understanding with  3D Geometric Reasoning

Occlusion-and-geometry-aware Hough voting

Page 58: Coherent Scene Understanding with  3D Geometric Reasoning

X

Local 3D context

Global 3D context

Best hypothesis

Page 59: Coherent Scene Understanding with  3D Geometric Reasoning

• So far we have treated the entire region labeled as "vertical" as a whole

Page 60: Coherent Scene Understanding with  3D Geometric Reasoning

Decompose vertical region into surface segments

• Occlusion boundary recovery (Hoiem et al. IJCV’11)
• Vanishing line sweeping (Lee et al. CVPR’09)

Page 61: Coherent Scene Understanding with  3D Geometric Reasoning
Page 62: Coherent Scene Understanding with  3D Geometric Reasoning

ground plane

inverse gravity

vertical surface candidate 1

vertical surface candidate 2

Page 63: Coherent Scene Understanding with  3D Geometric Reasoning

ground plane

vertical surface candidate 1

inverse gravity

vertical surface candidate 2

X

Page 64: Coherent Scene Understanding with  3D Geometric Reasoning
Page 65: Coherent Scene Understanding with  3D Geometric Reasoning

ground plane

vertical surface candidate

inverse gravity

object candidate

Page 66: Coherent Scene Understanding with  3D Geometric Reasoning

object candidate

ground plane

vertical surface candidate

inverse gravity

X

Page 67: Coherent Scene Understanding with  3D Geometric Reasoning

Given object layout, erect surfaces one by one

“Interpretation by synthesis” (Gupta et al. ECCV’10)

Page 68: Coherent Scene Understanding with  3D Geometric Reasoning
Page 69: Coherent Scene Understanding with  3D Geometric Reasoning

supporting plane 1

Page 70: Coherent Scene Understanding with  3D Geometric Reasoning

supporting plane 1

supporting plane 2

Page 71: Coherent Scene Understanding with  3D Geometric Reasoning

[Figure: coordinate system with camera center O, axes x, y, z, and the ground plane. Symbols: tilde-marked n_p, h_p, n_v, n_g; x_b, d_b, X_b; x_t, d_t, X_t; H]

Page 72: Coherent Scene Understanding with  3D Geometric Reasoning

[Figure: symbols w, l, β, and tilde-marked n_p, h_p]

Page 73: Coherent Scene Understanding with  3D Geometric Reasoning
Page 74: Coherent Scene Understanding with  3D Geometric Reasoning

• Spring 2013 (ICCV’13 submission)
– Improved depth ordering constraint
– Using more prior knowledge of objects
– Multiple supporting planes

• Fall 2013 (CVPR’14 submission)
– Local geometric constraints involving vertical surfaces
– Utilizing semantic categories of surface regions

• During Spring Semester of 2014
– Thesis writing

Page 75: Coherent Scene Understanding with  3D Geometric Reasoning

Expected Contributions

• Systematically model the relationships among global and local geometric variables

• Develop a RANSAC-CRF scheme to handle non-linear, non-deterministic, and possibly invalid relationships

• Occlusion-and-geometry-aware object detection for finer depth order reasoning

• Joint reasoning among global geometries, surface segments, and objects

Page 76: Coherent Scene Understanding with  3D Geometric Reasoning

Thank you!