COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

Transcript

Page 1: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

Video SegmentationWednesday 9 Nov 2006

Page 2: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

overview

• scene change detectionspatial-temporal change detectionmotion segmentation (optical flow)clustering in motion parameter space(k-means test)

semantic video object segmenationchroma-keying

Page 3: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

scene change detection

• frame difference between k-th frame and reference frame at pixel location x:

∂γ

∂τ= F$ν = (f(δI(x)) + λκ)$ν (6)

Balloon forcef(δI(x))$ν (7)

As δI !", f(δI)! 0

Temporal model assuming velocity is constant between time t and τ > t

x(τ) = x(t) + vt(x)(τ # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

ds= 0 (10)

dxν1 +

dyν2 +

dt= ($I) · ! +

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

∂I(x)dt

)2 + λ(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

!#= F$% = (f(&I(x)) + '()$% (6)

Balloon forcef(&I(x))$% (7)

As &I !", f(&I)! 0

Temporal model assuming velocity is constant between time t and # > t

x(#) = x(t) + vt(x)(# # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

ds= 0 (10)

dx%1 +

dy%2 +

dt= ($I) · ! +

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

!I(x)dt

)2 + '(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

thresholded segmentation label

k#th frame compared with reference frame at r. zk,r[x] = 1 if |FDk,r[x]| > T , 0 otherwise

!#= F$% = (f(&I(x)) + '()$% (6)

Balloon forcef(&I(x))$% (7)

As &I !", f(&I)! 0

Temporal model assuming velocity is constant between time t and # > t

x(#) = x(t) + vt(x)(# # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

ds= 0 (10)

dx%1 +

dy%2 +

dt= ($I) · ! +

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

!I(x)dt

)2 + '(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

thresholded segmentation label

k#th frame compared with reference frame at r. zk,r[x] = 1 if |FDk,r[x]| > T , 0 otherwise

!#= F$% = (f(&I(x)) + '()$% (6)

Balloon forcef(&I(x))$% (7)

As &I !", f(&I)! 0

Temporal model assuming velocity is constant between time t and # > t

x(#) = x(t) + vt(x)(# # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

ds= 0 (10)

dx%1 +

dy%2 +

dt= ($I) · ! +

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

!I(x)dt

)2 + '(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

thresholded segmentation label

k#th frame compared with reference frame at r. zk,r[x] = 1 if |FDk,r[x]| > T , 0 otherwise

Thresholded by T, segmentation label on each pixel

Problems:- a uniform intensity region may be interpreted as stationary- FD is affected by spatial gradient in the direction of motion

Gaussian pyramid

• Multi-resolution representation of image1, Original (highest resolution) image at bottom level2. Lowpass filter (e.g. Gaussian filter)3. Subsample by factor 24. Place result in second level

Change Detection v. 0.2• 1. Gaussian pyramid, start at lowest resolution.

2. Compute at each pixel, normalized frame difference:

where N is a local neighborhood of x, gradient of image, c is fudge addend to avoid divde by 0.

3. If FDN is high (pixel is moving), then replace FDN from previous level with this one, else retain lower res value.

4. Repeat 2-3 for all resolution levels.

!#= F$% = (f(&I(x)) + '()$% (6)

Balloon forcef(&I(x))$% (7)

As &I !", f(&I)! 0

Temporal model assuming velocity is constant between time t and # > t

x(#) = x(t) + vt(x)(# # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

ds= 0 (10)

dx%1 +

dy%2 +

dt= ($I) · ! +

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

!I(x)dt

)2 + '(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

thresholded segmentation label

k#th frame compared with reference frame at r. zk,r[x] = 1 if |FDk,r[x]| > T , 0 otherwise

Normalized frame di!erence:

FDNk,r[x] ="

x"N |Ik[x]# Ir[x]||$Ir[x]|"x"N |$Ir[x]|2 + c

(14)

Page 6: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

temporal integration 1• Warp map W[A,B]: warp image A toward B using

motion model parameters estimated between A and B.

Compute internal representation image:

Result: unchanged regions retain sharpness (less noise), changed regions blur

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

(*)

Page 7: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

temporal integration 2• 1. Compute motion parameters between internal

representation and new frame within support of dominant object in previous frame.

2. Warp internal representation image at frame k-1 towards new frame.

3. Detect stationary reqgions between registered images, using as initial estiamte to compute new mask .

4. Update internal representation using (*)warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

Mk!1

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

Mk!1

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

Mk!1

Page 8: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

temporal integration 3

• Advantages:Comparing each frame with internal representaiton -- weighted by motion warp -- rather than previoous frame, tracks (dominant) moving object.

- noise in tracked object is lower &- image gradients elsewhere are blurred (lower)

Page 9: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

semantic video object segmentation

Low level features:motion

color (of pixel)texture

But what’s “semantic” ??

Wittgenstein Philosophical Investigations 454: A pointing arrow: ----> “The arrow points only in the application that a living being makes of it.This pointing is not a hocus-pocus which can be performed only by the soul.“

Page 10: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

exampleschroma-keying

cv.jit.meanblob tracking

quennesson’s conscious=camera

Page 11: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

averaging over time

• cv.jit.mean

Page 12: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

chroma-keying

• 10jChromakey-x.pat

Page 13: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

blob tracking

• cv.jit.labelcv.jit.blobs.boundscv.jit.blobs.centroidscv.jit.blobs.direcitioncv.jit.blobs.elongationcv.jit.blobs.momentscv.jit.blobs.orientationcv.jit.blobs.recon