Download - Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

Transcript
Page 1: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

Video SegmentationWednesday 9 Nov 2006

Page 2: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

overview

• scene change detectionspatial-temporal change detectionmotion segmentation (optical flow)clustering in motion parameter space(k-means test)

semantic video object segmenationchroma-keying

Page 3: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

scene change detection

• frame difference between k-th frame and reference frame at pixel location x:

∂γ

∂τ= F$ν = (f(δI(x)) + λκ)$ν (6)

Balloon forcef(δI(x))$ν (7)

As δI !", f(δI)! 0

Temporal model assuming velocity is constant between time t and τ > t

x(τ) = x(t) + vt(x)(τ # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

dI

ds= 0 (10)

dI

dxν1 +

dI

dyν2 +

dI

dt= ($I) · ! +

dI

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

∂I(x)dt

)2 + λ(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

2

!"

!#= F$% = (f(&I(x)) + '()$% (6)

Balloon forcef(&I(x))$% (7)

As &I !", f(&I)! 0

Temporal model assuming velocity is constant between time t and # > t

x(#) = x(t) + vt(x)(# # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

dI

ds= 0 (10)

dI

dx%1 +

dI

dy%2 +

dI

dt= ($I) · ! +

dI

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

!I(x)dt

)2 + '(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

thresholded segmentation label

k#th frame compared with reference frame at r. zk,r[x] = 1 if |FDk,r[x]| > T , 0 otherwise

2

!"

!#= F$% = (f(&I(x)) + '()$% (6)

Balloon forcef(&I(x))$% (7)

As &I !", f(&I)! 0

Temporal model assuming velocity is constant between time t and # > t

x(#) = x(t) + vt(x)(# # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

dI

ds= 0 (10)

dI

dx%1 +

dI

dy%2 +

dI

dt= ($I) · ! +

dI

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

!I(x)dt

)2 + '(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

thresholded segmentation label

k#th frame compared with reference frame at r. zk,r[x] = 1 if |FDk,r[x]| > T , 0 otherwise

2

!"

!#= F$% = (f(&I(x)) + '()$% (6)

Balloon forcef(&I(x))$% (7)

As &I !", f(&I)! 0

Temporal model assuming velocity is constant between time t and # > t

x(#) = x(t) + vt(x)(# # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

dI

ds= 0 (10)

dI

dx%1 +

dI

dy%2 +

dI

dt= ($I) · ! +

dI

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

!I(x)dt

)2 + '(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

thresholded segmentation label

k#th frame compared with reference frame at r. zk,r[x] = 1 if |FDk,r[x]| > T , 0 otherwise

2

Thresholded by T, segmentation label on each pixel

Problems:- a uniform intensity region may be interpreted as stationary- FD is affected by spatial gradient in the direction of motion

Page 4: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

Gaussian pyramid

• Multi-resolution representation of image1, Original (highest resolution) image at bottom level2. Lowpass filter (e.g. Gaussian filter)3. Subsample by factor 24. Place result in second level

Page 5: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

Change Detection v. 0.2• 1. Gaussian pyramid, start at lowest resolution.

2. Compute at each pixel, normalized frame difference:

where N is a local neighborhood of x, gradient of image, c is fudge addend to avoid divde by 0.

3. If FDN is high (pixel is moving), then replace FDN from previous level with this one, else retain lower res value.

4. Repeat 2-3 for all resolution levels.

!"

!#= F$% = (f(&I(x)) + '()$% (6)

Balloon forcef(&I(x))$% (7)

As &I !", f(&I)! 0

Temporal model assuming velocity is constant between time t and # > t

x(#) = x(t) + vt(x)(# # t) = x(t) + dt,! (x) (8)

Observation

Ik[n] = Ik!1[n# d] (9)

dI

ds= 0 (10)

dI

dx%1 +

dI

dy%2 +

dI

dt= ($I) · ! +

dI

dt= 0 (11)

E(v) =!

D($I(x) · v(x) +

!I(x)dt

)2 + '(%$(v1(x))%2 + %$(v2(x))%2) (12)

1 Motion Segmentation

Frame Di!erence k#th frame compared with reference frame at r.

FDk,r[x] = Ik[x]# Ir[x] (13)

thresholded segmentation label

k#th frame compared with reference frame at r. zk,r[x] = 1 if |FDk,r[x]| > T , 0 otherwise

Normalized frame di!erence:

FDNk,r[x] ="

x"N |Ik[x]# Ir[x]||$Ir[x]|"x"N |$Ir[x]|2 + c

(14)

2

Page 6: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

temporal integration 1• Warp map W[A,B]: warp image A toward B using

motion model parameters estimated between A and B.

Compute internal representation image:

Result: unchanged regions retain sharpness (less noise), changed regions blur

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

3

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

3

(*)

Page 7: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

temporal integration 2• 1. Compute motion parameters between internal

representation and new frame within support of dominant object in previous frame.

2. Warp internal representation image at frame k-1 towards new frame.

3. Detect stationary reqgions between registered images, using as initial estiamte to compute new mask .

4. Update internal representation using (*)warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

3

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

3

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

3

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

Mk!1

3

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

Mk!1

3

warp (by motion) W [A,B] warp image A toward B using motion model parameters estimatedbetween A and B:

Ik[x] = (1! !)Ik[x] + !W [Ik!1[x], Ik[x]] (15)

0 " ! " 1

Mk!1

3

Page 8: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

temporal integration 3

• Advantages:Comparing each frame with internal representaiton -- weighted by motion warp -- rather than previoous frame, tracks (dominant) moving object.

- noise in tracked object is lower &- image gradients elsewhere are blurred (lower)

Page 9: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

semantic video object segmentation

Low level features:motion

color (of pixel)texture

But what’s “semantic” ??

Wittgenstein Philosophical Investigations 454: A pointing arrow: ----> “The arrow points only in the application that a living being makes of it.This pointing is not a hocus-pocus which can be performed only by the soul.“

Page 10: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

exampleschroma-keying

cv.jit.meanblob tracking

quennesson’s conscious=camera

Page 11: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

averaging over time

• cv.jit.mean

Page 12: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

chroma-keying

• 10jChromakey-x.pat

Page 13: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

blob tracking

• cv.jit.labelcv.jit.blobs.boundscv.jit.blobs.centroidscv.jit.blobs.direcitioncv.jit.blobs.elongationcv.jit.blobs.momentscv.jit.blobs.orientationcv.jit.blobs.recon

Page 14: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

kevin quennesson

• conscious=camera

initial test hands

Page 15: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

conscious=camera

• Interactive video installation

Page 16: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

Consciousness of “things”

• Static moments: shows face and hands

• Movement: shows body

• Motion: shows trail

• Memory: marks remain on background

Page 17: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

Body-tracking technique

• Inspired from Pfinder– Blob tracking (of face and hands) in YUV space

– Difference: we use skin tone database

• Technologies used– Platform: MAC OS X Tiger

– Code: C, Objective-C (Cocoa framework).

– Graphics: vImage (CPU, altivec), Core Image (GPU).

– Other: Core Data, …

Page 18: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

Implications

• Different work for the programmer– Does not know where he is going initially

• Different work for the “creator”

– Design a function, not an fixed output (ie. not y in f(x)=y, but f)

• Different relation of users with the piece– What kind of consciousness does the users have of it?

– What kind of narrative is generated?

Page 19: Video Segmentationtopologicalmedialab.net/xinwei/classes/cs/COMP471_ComputerGrap… · Change Detection v. 0.2 • 1. Gaussian pyramid, start at lowest resolution. 2. Compute at each

purple veil • TML Summer Workshop 2005