776 Computer Vision

61
776 Computer Vision Jan-Michael Frahm, Enrique Dunn Spring 2012

description

776 Computer Vision. Jan-Michael Frahm , Enrique Dunn Spring 2012. From Previous Lecture. Homographies Fundamental matrix Normalized 8-point Algorithm Essential Matrix. Plane Homography for Calibrated Cameras. For the plane at infinity H = K’RK -1. In the calibrated case - PowerPoint PPT Presentation

Transcript of 776 Computer Vision

Page 1: 776 Computer Vision

776 Computer Vision

Jan-Michael Frahm, Enrique DunnSpring 2012

Page 2: 776 Computer Vision

From Previous Lecture• Homographies• Fundamental matrix• Normalized 8-point Algorithm• Essential Matrix

Page 3: 776 Computer Vision

Plane Homography for Calibrated Cameras

• In the calibrated case o Two cameras P=K[I |0] and P’ = K’[R | t]o A plane π=(nT,d) T

• The homography is given by x’=Hx

H = K’(R – tnT/d)K-1

• For the plane at infinityH = K’RK-1

Page 4: 776 Computer Vision

The Fundamental Matrix F

P0m0

L

l1

M

m1

M

P1

Hm0

Epipole e1

F = [e]xH = Fundamental Matrix

1 1 0Tm I 1 0I Fm 1 0 0Tm Fm

1 0Te F

Page 5: 776 Computer Vision

The eight-point algorithmx = (u, v, 1)T, x’ = (u’, v’, 1)T

Minimize:

under the constraintF33

= 1

2

1

)( i

N

i

Ti xFx

Page 6: 776 Computer Vision

Essential Matrix(Longuet-Higgins, 1981)

Epipolar constraint: Calibrated case

0)]([ xRtx RtExExT ][with0

X

x x’

The vectors x, t, and Rx’ are coplanar slide: S. Lazebnik

Page 7: 776 Computer Vision

Essential Matrix

Epipolar constraint: Calibrated case

0)]([ xRtx RtExExT ][with0

X

x x’

The vectors x, t, and Rx’ are coplanar slide: S. Lazebnik

1det( ) 0, ( ) 02

T TE EE E trace EE E 'FKKE T

Cubic constraint

Page 8: 776 Computer Vision

Today: Binocular stereo• Given a calibrated binocular stereo pair, fuse it to

produce a depth image

Where does the depth information come from?

Page 9: 776 Computer Vision

Binocular stereo• Given a calibrated binocular stereo pair, fuse it to

produce a depth imageo Humans can do it

Stereograms: Invented by Sir Charles Wheatstone, 1838

Page 10: 776 Computer Vision

Binocular stereo• Given a calibrated binocular stereo pair, fuse it to

produce a depth imageo Humans can do it

Autostereograms: www.magiceye.com

Page 11: 776 Computer Vision

Binocular stereo• Given a calibrated binocular stereo pair, fuse it to

produce a depth imageo Humans can do it

Autostereograms: www.magiceye.com

Page 12: 776 Computer Vision

Real-time stereo

• Used for robot navigation (and other tasks)o Software-based real-time stereo techniques

Nomad robot searches for meteorites in Antarticahttp://www.frc.ri.cmu.edu/projects/meteorobot/index.html

slide: R. Szeliski

Page 13: 776 Computer Vision

Stereo image pair

slide: R. Szeliski

Page 14: 776 Computer Vision

Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923

Anaglyphs

http://www.rainbowsymphony.com/freestuff.html

(Wikipedia for images)

slide: R. Szeliski

Page 15: 776 Computer Vision

Stereo: epipolar geometry

• Match features along epipolar lines

viewing ray

epipolar plane

epipolar line

slide: R. Szeliski

Page 16: 776 Computer Vision

Simplest Case: Parallel images

• Image planes of cameras are parallel to each other and to the baseline

• Camera centers are at same height

• Focal lengths are the same

slide: S. Lazebnik

Page 17: 776 Computer Vision

Simplest Case: Parallel images

• Image planes of cameras are parallel to each other and to the baseline

• Camera centers are at same height

• Focal lengths are the same

• Then, epipolar lines fall along the horizontal scan lines of the images

slide: S. Lazebnik

Page 18: 776 Computer Vision

Essential matrix for parallel images

RtExExT ][,0

0000

000][

TTRtE

R = I t = (T, 0, 0)

Epipolar constraint:

00

0][

xy

xz

yz

aaaa

aaa

t

x

x’

Page 19: 776 Computer Vision

Essential matrix for parallel images

RtExExT ][,0

0000

000][

TTRtE

Epipolar constraint:

vTTvvTTvuv

u

TTvu

0

010

10000

0001

R = I t = (T, 0, 0)

The y-coordinates of corresponding points are the same!

t

x

x’

Page 20: 776 Computer Vision

Depth from disparity

f

x x’

BaselineB

z

O O’

X

f

zfBxxdisparity

Disparity is inversely proportional to depth!

Page 21: 776 Computer Vision

Depth Sampling Depth sampling for integer pixel disparity

Quadratic precision loss with depth!

Page 22: 776 Computer Vision

Depth Sampling Depth sampling for wider baseline

Page 23: 776 Computer Vision

Depth Sampling Depth sampling is in O(resolution6)

Page 24: 776 Computer Vision

Stereo: epipolar geometry• for two images (or images with collinear camera

centers), can find epipolar lines• epipolar lines are the projection of the pencil of

planes passing through the centers

• Rectification: warping the input images (perspective transformation) so that epipolar lines are horizontal

slide: R. Szeliski

Page 25: 776 Computer Vision

Rectification• Project each image onto same plane, which is

parallel to the epipole• Resample lines (and shear/stretch) to place lines

in correspondence, and minimize distortion

• [Loop and Zhang, CVPR’99]

slide: R. Szeliski

Page 26: 776 Computer Vision

Rectification

BAD!

slide: R. Szeliski

Page 27: 776 Computer Vision

Rectification

GOOD!

slide: R. Szeliski

Page 28: 776 Computer Vision

Problem: Rectification for forward moving

cameras

• Required image can become very large (infinitely large) when the epipole is in the image

• Alternative rectifications are available using epipolar lines directly in the imageso Pollefeys et al. 1999, “A simple and efficient method for general

motion”, ICCV

Page 29: 776 Computer Vision

Your basic stereo algorithm

For each epipolar lineFor each pixel in the left image

• compare with every pixel on same epipolar line in right image• pick pixel with minimum match cost

Improvement: match windows• This should look familar...

slide: R. Szeliski

Page 30: 776 Computer Vision

Finding correspondences

• apply feature matching criterion (e.g., correlation or Lucas-Kanade) at all pixels simultaneously

• search only over epipolar lines (many fewer candidate positions)

slide: R. Szeliski

Page 31: 776 Computer Vision

Matching cost

disparity

Left Right

scanline

Correspondence search

• Slide a window along the right scanline and compare contents of that window with the reference window in the left image

• Matching cost: SSD or normalized correlation

slide: S. Lazebnik

Page 32: 776 Computer Vision

Left Right

scanline

Correspondence search

SSDslide: S. Lazebnik

Page 33: 776 Computer Vision

Left Right

scanline

Correspondence search

Norm. corrslide: S. Lazebnik

Page 34: 776 Computer Vision

Neighborhood size• Smaller neighborhood: more details• Larger neighborhood: fewer isolated mistakes

• w = 3 w = 20

slide: R. Szeliski

Page 35: 776 Computer Vision

Matching criteria• Raw pixel values (correlation)• Band-pass filtered images [Jones & Malik 92]• “Corner” like features [Zhang, …]• Edges [many people…]• Gradients [Seitz 89; Scharstein 94]• Rank statistics [Zabih & Woodfill 94]• Intervals [Birchfield and Tomasi 96]• Overview of matching metrics and their

performance:o H. Hirschmüller and D. Scharstein, “Evaluation of Stereo Matching

Costs on Images with Radiometric Differences”, PAMI 2008

slide: R. Szeliski

Page 36: 776 Computer Vision

Adaptive Weighting• Boundary Preserving• More Costly

Page 37: 776 Computer Vision

Failures of correspondence search

Textureless surfaces Occlusions, repetition

Non-Lambertian surfaces, specularitiesslide: S. Lazebnik

Page 38: 776 Computer Vision

Stereo: certainty modeling• Compute certainty map from correlations

• input depth map certainty map

slide: R. Szeliski

Page 39: 776 Computer Vision

Results with window search

Window-based matching Ground truth

Data

slide: S. Lazebnik

Page 40: 776 Computer Vision

Better methods exist...

Graph cuts Ground truth

For the latest and greatest: http://www.middlebury.edu/stereo/

Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001

slide: S. Lazebnik

Page 41: 776 Computer Vision

How can we improve window-based

matching?• The similarity constraint is local (each reference

window is matched independently)• Need to enforce non-local correspondence

constraints

slide: S. Lazebnik

Page 42: 776 Computer Vision

Non-local constraints• Uniqueness

o For any point in one image, there should be at most one matching point in the other image

slide: S. Lazebnik

Page 43: 776 Computer Vision

Non-local constraints• Uniqueness

o For any point in one image, there should be at most one matching point in the other image

• Orderingo Corresponding points should be in the same order in both views

slide: S. Lazebnik

Page 44: 776 Computer Vision

Non-local constraints• Uniqueness

o For any point in one image, there should be at most one matching point in the other image

• Orderingo Corresponding points should be in the same order in both views

Ordering constraint doesn’t holdslide: S. Lazebnik

Page 45: 776 Computer Vision

Non-local constraints• Uniqueness

o For any point in one image, there should be at most one matching point in the other image

• Orderingo Corresponding points should be in the same order in both views

• Smoothnesso We expect disparity values to change slowly (for the most part)

slide: S. Lazebnik

Page 46: 776 Computer Vision

Scanline stereo• Try to coherently match pixels on the entire

scanline• Different scanlines are still optimized

independentlyLeft image Right image

slide: S. Lazebnik

Page 47: 776 Computer Vision

“Shortest paths” for scan-line stereo

Left image

Right image

Can be implemented with dynamic programmingOhta & Kanade ’85, Cox et al. ‘96

leftS

rightS

correspondence

q

p

Left

occlu

sion

t

Rightocclusion

s

occlC

II

corrC

Slide credit: Y. Boykov

Page 48: 776 Computer Vision

Coherent stereo on 2D grid• Scanline stereo generates streaking artifacts

• Can’t use dynamic programming to find spatially coherent disparities/ correspondences on a 2D grid

slide: S. Lazebnik

Page 49: 776 Computer Vision

Stereo matching as energy minimizationI1 I2 D

• Energy functions of this form can be minimized using graph cuts

Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001

W1(i ) W2(i+D(i )) D(i )

jii

jDiDiDiWiWDE,neighbors

221 )()())(()()(

data term smoothness term

slide: S. Lazebnik

Page 50: 776 Computer Vision

Active stereo with structured light

• Project “structured” light patterns onto the objecto Simplifies the correspondence problemo Allows us to use only one camera

camera

projector

L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002

slide: S. Lazebnik

Page 51: 776 Computer Vision

Active stereo with structured light

L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002

slide: S. Lazebnik

Page 52: 776 Computer Vision

Active stereo with structured light

http://en.wikipedia.org/wiki/Structured-light_3D_scannerslide: S. Lazebnik

Page 53: 776 Computer Vision

Kinect: Structured infrared light

http://bbzippo.wordpress.com/2010/11/28/kinect-in-infrared/ slide: S. Lazebnik

Page 54: 776 Computer Vision

Laser scanning

• Optical triangulationo Project a single stripe of laser lighto Scan it across the surface of the objecto This is a very precise version of structured light scanning

Digital Michelangelo ProjectLevoy et al.

http://graphics.stanford.edu/projects/mich/

Source: S. Seitz

Page 55: 776 Computer Vision

Laser scanned models

The Digital Michelangelo Project, Levoy et al.

Source: S. Seitz

Page 56: 776 Computer Vision

Laser scanned models

The Digital Michelangelo Project, Levoy et al. Source: S. Seitz

Page 57: 776 Computer Vision

Laser scanned models

The Digital Michelangelo Project, Levoy et al.

Source: S. Seitz

Page 58: 776 Computer Vision

Laser scanned models

The Digital Michelangelo Project, Levoy et al.

Source: S. Seitz

Page 59: 776 Computer Vision

Laser scanned models

The Digital Michelangelo Project, Levoy et al.

Source: S. Seitz

1.0 mm resolution (56 million triangles)

Page 60: 776 Computer Vision

Aligning range images• A single range scan is not sufficient to describe a

complex surface• Need techniques to register multiple range

images

B. Curless and M. Levoy, A Volumetric Method for Building Complex Models from Range Images, SIGGRAPH 1996

Page 61: 776 Computer Vision

Aligning range images• A single range scan is not sufficient to describe a

complex surface• Need techniques to register multiple range

images

• … which brings us to multi-view stereo