
Scale-Invariant Feature Transform (SIFT)

Jinxiang Chai

Review

Image Processing:
- Median filtering
- Bilateral filtering
- Edge detection
- Corner detection

Review: Corner Detection

1. Compute image gradients

2. At each pixel (i,j), construct the matrix C from the gradient products over its neighborhood:

   C(i,j) = [ Σ Ix²    Σ IxIy ]
            [ Σ IxIy   Σ Iy²  ]

3. Determine the 2 eigenvalues λ(i,j) = [λ1, λ2]

4. If both λ1 and λ2 are big, we have a corner
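
A minimal NumPy sketch of this review pipeline (the 5x5 window and the 0.01 threshold are illustrative assumptions, not values from the slides):

import numpy as np
from scipy.ndimage import sobel, uniform_filter

def corner_mask(img, win=5, thresh=0.01):
    # 1. compute image gradients
    Ix = sobel(img, axis=1)
    Iy = sobel(img, axis=0)
    # 2. sum gradient products over each pixel's neighborhood (the matrix C)
    Sxx = uniform_filter(Ix * Ix, win)
    Syy = uniform_filter(Iy * Iy, win)
    Sxy = uniform_filter(Ix * Iy, win)
    # 3. eigenvalues of C = [[Sxx, Sxy], [Sxy, Syy]] from trace and determinant
    tr, det = Sxx + Syy, Sxx * Syy - Sxy * Sxy
    disc = np.sqrt(np.maximum(tr * tr / 4 - det, 0))
    lam_small = tr / 2 - disc
    # 4. both eigenvalues are big exactly when the smaller one is big
    return lam_small > thresh * lam_small.max()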

The Orientation Field

Corners are detected where both λ1 and λ2 are big

Good Image Features

• What are we looking for?
  – Strong features
  – Invariant to changes (affine and perspective/occlusion)
  – Solve the problem of correspondence
• Locate an object in multiple images (i.e., in video)
• Track the path of the object; infer 3D structure, and object and camera movement

Scale Invariant Feature Transform (SIFT)

• Choosing features that are invariant to image scaling and rotation

• Also, partially invariant to changes in illumination and 3D camera viewpoint

Invariance

• Illumination

• Scale

• Rotation

• Affine

Required Readings

• David G. Lowe, "Object recognition from local scale-invariant features," ICCV 1999

• David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110

Motivation for SIFT

• Earlier methods: the Harris corner detector
  – Sensitive to changes in image scale
  – Finds locations in the image with large gradients in two directions
• No earlier method was fully affine invariant
  – Although the SIFT approach is not fully affine invariant either, it allows for considerable affine change
  – SIFT also allows for changes in 3D viewpoint

SIFT Algorithm Overview

1. Scale-space extrema detection

2. Keypoint localization

3. Orientation assignment

4. Generation of keypoint descriptors

Scale Space

• Different scales are appropriate for describing different objects in the image, and we may not know the correct scale/size ahead of time.

Scale Space (Cont.)

• Look for features (locations) that are stable (invariant) across all possible scale changes
  – Use a continuous function of scale (scale space)
• Which scale-space kernel will we use?
  – The Gaussian function

Scale-Space of Image

L(x, y, kσ) = G(x, y, kσ) * I(x, y)

G(x, y, kσ) - variable-scale Gaussian
I(x, y) - input image

• To detect stable keypoint locations, find the scale-space extrema of the difference-of-Gaussian function:

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y)
           = L(x, y, kσ) − L(x, y, σ)

Look familiar? It is a bandpass filter!

Difference of Gaussian

1. A = Convolve the image with vertical and horizontal 1D Gaussians, σ = sqrt(2)

2. B = Convolve A with vertical and horizontal 1D Gaussians, σ = sqrt(2)

3. DOG (Difference of Gaussian) = A − B

4. How to deal with different scales? Downsample B with bilinear interpolation at a pixel spacing of 1.5 (a linear combination of 4 adjacent pixels), and repeat.
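
A minimal sketch of one DOG step under these settings, using scipy's gaussian_filter1d for the separable vertical and horizontal 1D convolutions:

import numpy as np
from scipy.ndimage import gaussian_filter1d

def dog_step(image, sigma=np.sqrt(2)):
    # A: convolve with vertical, then horizontal, 1D Gaussians
    A = gaussian_filter1d(gaussian_filter1d(image, sigma, axis=0), sigma, axis=1)
    # B: convolve A again with the same 1D Gaussians
    B = gaussian_filter1d(gaussian_filter1d(A, sigma, axis=0), sigma, axis=1)
    # DOG = A - B; B is then downsampled to seed the next octave
    return A - B, B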

Difference of Gaussian Pyramid

[Figure: the input image is blurred to give A1, and blurred again to give B1; DOG1 = A1 − B1. B1 is downsampled to start the next octave, giving A2, B2, and DOG2 = A2 − B2, and likewise A3, B3, DOG3 = A3 − B3.]

Other issues

• Initial smoothing ignores the highest spatial frequencies of the image
  – Fix: expand the input image by a factor of 2, using bilinear interpolation, prior to building the pyramid
• How do we downsample with bilinear interpolation?

Bilinear Filter

Weighted sum of the four neighboring pixels.

[Figure: sample point S(x,y) inside the cell with corners (i,j), (i,j+1), (i+1,j), (i+1,j+1); a and b are the interpolation weights toward (i,j) along the two axes.]

Sampling at S(x,y):

S(x,y) = a*b*S(i,j) + a*(1-b)*S(i+1,j)
       + (1-a)*b*S(i,j+1) + (1-a)*(1-b)*S(i+1,j+1)

To optimize the above, interpolate along one axis first, then the other:

Si = S(i,j) + (1-a)*(S(i,j+1) - S(i,j))
Sj = S(i+1,j) + (1-a)*(S(i+1,j+1) - S(i+1,j))
S(x,y) = Si + (1-b)*(Sj - Si)
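
A sketch of downsampling with bilinear interpolation at a pixel spacing of 1.5, as used between pyramid octaves (the function name is illustrative):

import numpy as np

def bilinear_downsample(img, spacing=1.5):
    h, w = img.shape
    ys = np.arange(0, h - 1, spacing)
    xs = np.arange(0, w - 1, spacing)
    i = np.floor(ys).astype(int)[:, None]   # top row index of each cell
    j = np.floor(xs).astype(int)[None, :]   # left column index of each cell
    b = 1 - (ys[:, None] - i)               # weight toward row i
    a = 1 - (xs[None, :] - j)               # weight toward column j
    # weighted sum of the four neighboring pixels, as in the slide formula
    return (a * b * img[i, j] + a * (1 - b) * img[i + 1, j]
            + (1 - a) * b * img[i, j + 1]
            + (1 - a) * (1 - b) * img[i + 1, j + 1])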

Pyramid Example

[Figure: pyramid levels A1, B1, DOG1 at the first octave; A2, B2, DOG2 at the second; A3, B3, DOG3 at the third.]

Feature Detection

• Find maxima and minima of scale space
• For each point on a DOG level:
  – Compare to its 8 neighbors at the same level
  – If max/min, identify the corresponding point at the pyramid level below
  – Determine if the corresponding point is a max/min of its 8 neighbors
  – If so, repeat at the pyramid level above
• Repeat for each DOG level
• Those that remain are key points

Identifying Max/Min

[Figure: a candidate point is compared against its neighbors in DOG L−1, DOG L, and DOG L+1.]
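
A compact NumPy sketch of the 3x3x3 extremum test across three adjacent DOG levels (a simplification of the level-by-level procedure above):

import numpy as np

def is_extremum(dog_below, dog, dog_above, i, j):
    # 3x3 patches around (i, j) on the three adjacent DOG levels
    patch = np.stack([d[i-1:i+2, j-1:j+2] for d in (dog_below, dog, dog_above)])
    v = dog[i, j]
    # max/min over all 26 neighbors (the center pixel equals v itself,
    # so >= / <= comparisons against the patch extremes are safe)
    return v >= patch.max() or v <= patch.min()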

Refining Key List: Illumination

• For all levels, use the "A" smoothed image to compute the gradient magnitude
• Threshold gradient magnitudes:
  – Remove all key points with M(i,j) less than 0.1 times the max gradient value
• Motivation: low contrast is generally less reliable than high contrast for feature points
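
A sketch of this contrast filter, with the gradient magnitude taken from the smoothed "A" image (the 0.1 factor is from the slide; the helper name is illustrative):

import numpy as np

def contrast_filter(A, keypoints, frac=0.1):
    # gradient magnitude of the smoothed image
    gy, gx = np.gradient(A)
    mag = np.hypot(gx, gy)
    cutoff = frac * mag.max()
    # keep key points whose gradient magnitude clears the cutoff
    return [(i, j) for (i, j) in keypoints if mag[i, j] >= cutoff]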

Results: Eliminating Features

• Removing features in low-contrast regions


Assigning Canonical Orientation

• For each remaining key point:
  – Choose the surrounding N x N window at the DOG level where it was detected

[Figure: N x N window on the DOG image.]

Assigning Canonical Orientation

• For all levels, use the "A" smoothed image to compute the gradient orientation

[Figure: Gaussian smoothed image → gradient orientation + gradient magnitude.]

Assigning Canonical Orientation

• Gradient magnitude weighted by a 2D Gaussian with σ of 3 times that of the current smoothing scale

[Figure: gradient magnitude * 2D Gaussian = weighted magnitude.]

Assigning Canonical Orientation

• Accumulate in a histogram based on orientation
• Histogram has 36 bins with 10° increments

[Figure: histogram of the sum of weighted magnitudes vs. gradient orientation.]

Assigning Canonical Orientation

• Identify the peak and assign its orientation and sum of magnitudes to the key point

[Figure: peak of the orientation histogram marked over the sum of weighted magnitudes.]
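
A sketch of the 36-bin orientation histogram around one key point; the window radius is an illustrative assumption, while the 3x scale-σ Gaussian weighting and 10° bins follow the slides:

import numpy as np

def canonical_orientation(A, i, j, scale_sigma, radius=8):
    # gradients of the smoothed "A" image in a window around the key point
    win = A[i-radius:i+radius+1, j-radius:j+radius+1]
    gy, gx = np.gradient(win)
    mag = np.hypot(gx, gy)
    ori = np.degrees(np.arctan2(gy, gx)) % 360
    # weight magnitudes by a 2D Gaussian with sigma = 3 * current smoothing scale
    y, x = np.mgrid[-radius:radius+1, -radius:radius+1]
    w = np.exp(-(x**2 + y**2) / (2 * (3 * scale_sigma)**2))
    # 36 bins of 10 degrees each; the peak gives the canonical orientation
    hist, _ = np.histogram(ori, bins=36, range=(0, 360), weights=mag * w)
    return 10 * np.argmax(hist) + 5  # bin center, in degrees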

Eliminating edges

• Difference-of-Gaussian function will be strong along edges
  – So how can we get rid of these edges?

Eliminating edges

• Difference-of-Gaussian function will be strong along edges
  – Similar to the Harris corner detector
  – We are not concerned about the actual values of the eigenvalues, just the ratio of the two

H = [ Dxx  Dxy ]
    [ Dxy  Dyy ]

(compare with the corner-detection matrix C(i,j) built from Σ Ix², Σ IxIy, Σ Iy²)

Let α be the larger eigenvalue of H and β the smaller, with α = r·β. Then

Tr(H) = Dxx + Dyy = α + β
Det(H) = Dxx·Dyy − Dxy² = α·β

Tr(H)² / Det(H) = (α + β)² / (α·β) = (r·β + β)² / (r·β²) = (r + 1)² / r

This ratio depends only on r and grows with it, so key points whose ratio exceeds (r + 1)²/r for a chosen r are rejected as edge responses.
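
A sketch of this edge test on a DOG level, with the Hessian from finite differences (r = 10 is Lowe's suggested value from the paper, not from the slides):

import numpy as np

def passes_edge_test(D, i, j, r=10.0):
    # Hessian of the DOG image at (i, j) by central finite differences
    Dxx = D[i, j+1] - 2 * D[i, j] + D[i, j-1]
    Dyy = D[i+1, j] - 2 * D[i, j] + D[i-1, j]
    Dxy = (D[i+1, j+1] - D[i+1, j-1] - D[i-1, j+1] + D[i-1, j-1]) / 4
    tr = Dxx + Dyy
    det = Dxx * Dyy - Dxy * Dxy
    # reject if curvature is edge-like: Tr(H)^2 / Det(H) >= (r+1)^2 / r
    return det > 0 and tr * tr / det < (r + 1) ** 2 / r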


Local Image Description

• SIFT keys are each assigned:
  – Location
  – Scale (analogous to the level at which it was detected)
  – Orientation (assigned in the previous canonical orientation steps)
• Now: describe the local image region invariant to the above transformations

SIFT: Local Image Description

• Needs to be invariant to changes in location, scale and rotation

SIFT Key Example

Local Image Description

For each key point:

1. Identify the 8x8 neighborhood (from the DOG level at which it was detected)

2. Align the orientation to the x-axis

Local Image Description

3. Calculate gradient magnitude and orientation map

4. Weight by Gaussian

Local Image Description

5. Calculate a histogram for each 4x4 region, with 8 bins for gradient orientation; tally the weighted gradient magnitudes.

Local Image Description

6. This histogram array is the image descriptor. (The example here gives a vector of length 8*4 = 32. Lowe's recommendation: a 128-dimensional vector from a 16x16 neighborhood, i.e., 4x4 histograms of 8 bins each.)
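
A sketch of steps 3-6 for one key point, building the 32-dimensional descriptor from an 8x8 patch (rotation to the canonical orientation is assumed done; the Gaussian σ and the final normalization are assumptions, not slide values):

import numpy as np

def descriptor_8x8(patch):
    # 3. gradient magnitude and orientation map of the aligned 8x8 patch
    gy, gx = np.gradient(patch)
    ori = np.degrees(np.arctan2(gy, gx)) % 360
    # 4. weight magnitudes by a Gaussian centered on the patch
    yy, xx = np.mgrid[0:8, 0:8]
    w = np.exp(-((yy - 3.5) ** 2 + (xx - 3.5) ** 2) / 16)
    mag = np.hypot(gx, gy) * w
    # 5. one 8-bin orientation histogram per 4x4 region (2x2 grid of regions)
    desc = []
    for bi in range(2):
        for bj in range(2):
            o = ori[4*bi:4*bi+4, 4*bj:4*bj+4]
            m = mag[4*bi:4*bi+4, 4*bj:4*bj+4]
            hist, _ = np.histogram(o, bins=8, range=(0, 360), weights=m)
            desc.append(hist)
    # 6. concatenate into the 8*4 = 32 descriptor; normalize for illumination
    d = np.concatenate(desc)
    return d / (np.linalg.norm(d) + 1e-12)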

Applications: Image Matching

• Find all key points identified in the source and target images
  – Each key point will have a 2D location, scale and orientation, as well as an invariant descriptor vector
• For each key point in the source image, search for corresponding SIFT features in the target image
• Find the transformation between the two images using epipolar geometry constraints or an affine transformation

Image matching via SIFT features

Feature detection

Image matching via SIFT featrues

• Image matching via nearest-neighbor search
  – If the ratio of the closest distance to the 2nd-closest distance is greater than 0.8, reject it as a false match
• Remove outliers using epipolar line constraints
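
A sketch of the nearest-neighbor search with the 0.8 ratio test (brute-force Euclidean distances over descriptor arrays; names are illustrative):

import numpy as np

def match_descriptors(src, tgt, ratio=0.8):
    # src: (N, d) source descriptors, tgt: (M, d) target descriptors
    d = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=2)
    matches = []
    for i in range(len(src)):
        order = np.argsort(d[i])
        nearest, second = d[i, order[0]], d[i, order[1]]
        # accept only if the closest match is clearly better than the 2nd closest
        if nearest <= ratio * second:
            matches.append((i, order[0]))
    return matches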

Image matching via SIFT features

Summary

• SIFT features are reasonably invariant to rotation, scaling, and illumination changes.

• We can use them for image matching and object recognition among other things.

• Efficient on-line matching and recognition can be performed in real time.