Face Detection and Recognition - UNISA · Face Detection and Recognition Docente: Michele Nappi...

Post on 18-Feb-2019

248 views 0 download

Transcript of Face Detection and Recognition - UNISA · Face Detection and Recognition Docente: Michele Nappi...

Face Detection and Recognition

Docente: Michele Nappi

mnappi@unisa.it

biplab.unisa.it

089-963334

Il Volto: Una biometria al limite (?)

La pareidolia (dal greco εἴδω

λον èidōlon, "immagine", col

prefisso παρά parà, "vicino") è

l'illusione subcosciente che

tende a ricondurre a forme

note oggetti o profili (naturali o

artificiali) dalla forma casuale.

2

Un celebre caso di pareidolia: il Volto su

Marte, una formazione rocciosa marziana

ripresa dalla sonda Viking 1, che appare

come un volto in particolari condizioni e

angolazioni di luce

Il Volto: Una biometria al limite (?)

• La prosopagnosia è un

deficit percettivo acquisito

o congenito del sistema

nervoso centrale che

impedisce ai soggetti che

ne vengono colpiti di

riconoscere

correttamente i volti delle

persone

3

Esempi di pareidolia

11/06/2016 Riconoscimento facciale 5

Perché il volto

• I due più importanti fattori che sanciscono il successo di una biometria sono Affidabilità e Accettabilità

• Il riconoscimento dell’iride è il sistema più affidabile, ma è anche quello più intrusivo. Le impronte digitali sono più facilmente accettate, ma non applicabili a soggetti non consenzienti.

• Il volto ha un’accettabilità molto elevata, mentre l’affidabilità deve essere ancora migliorata

Further advantages …It is natural to recognize a person from face and it is

usual for people to be photographed (e.g., fingerprints

may be associated with the bad feeling to be suspected

to be a criminal)

High recognition rate in controlled conditions

Acquisition devices are easy to

deploy in the operation environment

It is possible to integrate it in remote control applications

It is possible to integrate it in logical access (login)

… but face is a complex object!

(Facial Animation Parameters)

Comparisons

From http://www.cs.bham.ac.uk/~mdr/teaching/module/security/lectures/biometric.html

Classical properties

Comparisons

From http://www.biometricsintegrated.com/biometric-types-comparison.html/biometric-comparison-chart

Further properties

Notice Hygienic

level!

refers to possibility

to acquire

Comparisons

From http://www.ssbs.com/sasolutions/medical-solution/sproducts-client-secure.aspx

Interesting to notice the

position of face which

shares «good»

properties with

fingerprints

Comparisons

Which is the most important

dimensions?

Face is in acceptable position

when compared with other

traits

Market

By Acuity Market Intelligence

(can you guess what is their product?)

Applications

iOS face recognition app –

FaceVault

Forensics

Border control

In a crowd!

ProblemsIntra-personal variations.

Piero Fornasetti's template of

Cavalieri's face gave birth to

hundreds of variations.

http://www.artistdaily.com/blogs/artistdaily/archive/201

3/04/22/350-ways-of-drawing-faces.aspx

Inter-personal similarities.

Bradley Cooper & Hrithik Roshan

http://www.mensxp.com/entertainme

nt/gossip/7356-have-we-met-before-

bollywoodhollywood-lookalikes.html

In detail: PIE (Pose, Illumination,

Expression) variations

http://www.csee.wvu.edu/~gidoretto/courses/2011-fall-

cp/assignments/final_project/results/cao_chen/index.html

In detail: PIE (Pose, Illumination,

Expression) variations + Ageing = A-PIE

Popular databases

Very popular public face databases:

• AR-Faces

• FERET

• MIT

• ORL

• Harward

• MIT/CMU

• CMU test set II

• …

Popular databases

Popular databases: FERET

Popular databases: AR-Faces

(1998)

126 people (over 4,000 color images).

Different facial expressions, illumination conditions and

occlusions.

Two sessions per person (2 different days).

1 : Neutral expression

2 : Smile

3 : Anger

4 : Scream

5 : left light on

6 : right light on

7 : all side lights on

8 : wearing sun glasses

9 : wearing sun glasses and left light on

10 : wearing sun glasses and right light on

11 : wearing scarf

12 : wearing scarf and left light on

13 : wearing scarf and right light on

14 to 26 : second session (same as 1 to 13)

http://wwwsipl.technion.ac.il/new/DataBases/Aleix

%20Face%20Database.htm

Popular databases : CMU-PIE

(2000)• CMU Pose, Illumination, and

Expression (PIE) database.

• 68 subjects

• About 608 color pictures per subject (resolution 640×486 pixels) =41,368 images

• Each person is imaged under 13 different poses, 43 different illumination conditions, and with 4 different expressions.

Popular databases: CASIA 3D

Face V1 (2004)4624 scans of 123 persons using the

non-contact 3D digitizer, Minolta Vivid

910 Created in 2004

http://biometrics.idealtest.org/dbDetailForUser.do?id=8

Popular databases: LFW (Labeled Faces in the

Wild)(2007)

http://vis-www.cs.umass.edu/lfw/

The data set contains more

than 13,000 images of faces

collected from the web.

Each face has been labeled

with the name of the person

pictured. 1680 of the people

pictured have two or more

distinct photos in the data

set. The only constraint on

these faces is that they were

detected by the Viola-Jones

face detector.

Popular databases: YouTube faces

(2011)

http://www.cs.tau.ac.il/~wolf/ytfaces/

The data set contains 3,425 videos of 1,595 different people. All the videos were

downloaded from YouTube. An average of 2.15 videos are available for each subject. The

shortest clip duration is 48 frames, the longest clip is 6,070 frames, and the average length

of a video clip is 181.3 frames. The creation started by using the 5,749 names of subjects

included in the LFW data set to search YouTube for videos of these same individuals. The

top six results for each query were downloaded.

Number of videos per person:

#videos 1 2 3 4 5 6

#people 591 471 307 167 51 8

11/06/2016 Riconoscimento facciale 25

Verifica e Riconoscimento

• Verifica: Confronto Uno a Uno. Conferma l’identità

dichiarata da un individuo

– Dichiarata mediante: carta di identità, codice utente, ...

• Riconoscimento: Confronto Uno a Molti. Stabilisce

l’identità di un soggetto a partire da un insieme di

persone registrate

Ok

11/06/2016 Riconoscimento facciale 26

Struttura di un Riconoscitore Facciale

• Il primo step di un sistema automatico è un face detector.

• A partire dall’immagine, si estrae la regione contenente unicamente il volto.

• Infine la regione estratta viene segmentata, associando un significato semantico a ciascuna zona del volto.

Occhio

Sinistro Occhio

Destro

Naso

Bocca

Face DetectorF

ace

Seg

men

tatio

n

Face localization• Problem: given a single image or a video sequence, detect the presence of

one or more faces and locate their position within the single image.

• Requirements: it is necessary to be independent with respect to position,

orientation (pose), scale, expression, possibly DIFFERENT for different

subjects in the image, as well as to illumination or cluttered background.

What to do here???

Face localization …

possible to hide from

it?

“According to Adam Harvey, the

key part of the face that computers

can read is the "nose bridge," or the

area between the eyes. If you can

obscure that, you have a good

chance of tricking computers into

thinking you don't have a face, he

said. Another technique is to create

an "anit-face," which is less

terrifying than it sounds since it just

means inverting your face's color

scheme. So the black-and-white

triangles on the cheeks aim

to achieve this effect.” (from CNN -

http://whatsnext.blogs.cnn.com/201

2/04/29/how-to-hide-from-face-

detection-technology/)

Adam Harvey proposes CV Dazzle

(Computer vision Dazzle) to avoid

one’s face detection

(http://ahprojects.com/projects/)

Face localization: some

approachesFeature-based techniques

These techniques make explicit use of the knowledge about the expected face

appearance, which is characterized by a set of features at different levels (below, they

are from lower to higher)

• Pixel properties:

– Edges

– Skin color

• Face geometry properties:

– Constellation

– Feature searching

• Template matching against a standard model:

– Correlation

– Snakes

– Active Shape Models

Face localization: some

approachesImage-based techniques

These techniques address the localization problem as a generic pattern recognition

problem (class of faces).

The goal is to learn to recognize a face image according to a number of examples.

Face localization: recent

advances

Localizing Parts of Faces Using a Consensus of Exemplars

from: http://neerajkumar.org/projects/face-parts/

Algorithm A: color space

transformation • RGB is not a perceptually uniform space = colors which are close to

each other in RGB space may not be perceived as similar

• Skin model: a set of close colors (cluster) within the color space

• It is advisable to perform the detection in a different space

RGB color space

YCBCR color space

Modelli di Colore: YIQ

Eye/Mouth localization

• Chroma map is enhanced by histogram equalization

• The two maps are combined through AND operator

• The resulting map undergoes dilation, masking and normalization to

discard the other face regions and brighten eyes.

• Further operations allow to refine this map.

Face contourThe algorithm analyzes all the triangles composed by two candidate

eyes and a candidate mouth.

Each triangle is verified by cheking:

• Luma variations and average of the orientation gradient of the blobs

containing eyes and mouth

• Geometry and orientation of the triangle

• Presence of a face contour around the triangle

A score is assigned to each

triangle satisfying the

conditions, which also

considers preference for

upright orientation and

simmetry.

The triangle with highests score (above a threshold) selected.

Some results

Face Detection: Viola-Jones

• Paul Viola and Michael Jones proposed one of the most successful (so far)

approaches to object localization (included in OpenCV).

• The algorithm is image based and can be applied to face detection (but also to

eye and mouth detection in a hierarchical strategy).

• The algorithm requires to create a classifier that is initially trained using multiple

instances of the class to identify (positive examples), and several instances of

images that do not contain any object of the class but may cause an error

(negative examples).

• Training is designed to extract several features from the examples and to select

the most discriminating ones. The statistical model which is built incrementally

contains such information.

• Misses (a present object is not detected) or false alarms (an object is detected

but it is not present) can be decreased by retraining adding new suited examples

(positive or negative).

Face Detection: Viola-Jones (II)

• Haar Features – All human faces share some similar

properties. This knowledge is used to construct certain

features known as Haar Features.

• The properties that are similar for a human face are:

– The eyes region is darker than the upper-cheeks.

– The nose bridge region is brighter than the eyes.

• That is useful domain knowledge:

– Location - Size: eyes & nose bridge region

– Value: darker / brighter

Haar Feature that looks similar to

the bridge of the nose is applied

onto the face

Haar Feature that looks similar to the eye

region which is darker than the upper

cheeks is applied onto a face

Face Detection: Viola-Jones (III)• The value of any given feature is

always simply the sum of the pixels

within clear rectangles subtracted

from the sum of the pixels within

shaded rectangles.

• Rectangle features:

– Value = Σ (pixels in black area) - Σ (pixels

in white area)

• Three types: two-, three-, four-

rectangles, Viola & Jones used two-

rectangle features

• For example: the difference in

brightness between the white &black

rectangles over a specific area

• Each feature is related to a special

location in the sub-window 39

Example use of rectangular

features

https://github.com/atduskgreg/Makematics/blob/master/research/haar_detection

/index.md

An example of an early stage

in the Haar cascade. Each

black and white patch

represents a feature that the

algorithm hunts for in the

image.

Note: example of AdaBoost

Note: example of AdaBoost

Note: example of AdaBoost

Note: example of AdaBoost

Note: example of AdaBoost

Note: example of AdaBoost

11/06/2016 Riconoscimento facciale 47

Struttura di un Riconoscitore Facciale (1)

• Si associa un insieme di caratteristicheall’identità di un soggetto

– Raccolta dei dati ed estrazione delle caratteristiche

• Il modello estratto (template) è memorizzato in un database o su un supporto portatile (smart card)

• Questo processo può essere effettuato singolarmente o in blocco (batch enrollment)

• Si estrae il template dal volto del soggetto che richiede di essere autenticato

– Raccolta dei dati ed estrazione delle caratteristiche

• Il modello estratto (template) è cercato all’interno di in un database o su un supporto portatile (smart card)

• Un criterio di matching stabilisce se il soggetto deve essere autorizzato o no.

Enrollment

Testing

11/06/2016 Riconoscimento facciale 48

Sistemi esistenti in Letteratura

Image Based

•ICA

•Neural Networks

•Eigenfaces

Feature Based

•Elastic Graph Matching

Hybrid

•Fractals

•Wavelets

3D

•3D Morphable Models

11/06/2016 Riconoscimento facciale 49

Eigenfaces – sistemi lineari

Immagini in Input LinearizzazioneMedia su tutte i volti A ciascun vettore togliamo la media

Risolvendo un particolare sistema lineare

estraiamo un insieme di volti di riferimento.

Ciascun volto viene rappresentato come una combinazione

dei volti di riferimento. I coefficienti utilizzati per la

rappresentazione costituiscono il vettore di caratteristiche

Note (continued)• Some consequences:

– Machine learning requires learning a possibly infinite distribution

from a finite number of data samples; in a high-dimensional

feature space with each feature having a number of possible

values, an enormous amount of training data are required to

ensure that there are several samples with each combination of

values; with a fixed number of training samples, the predictive

power reduces as the dimensionality increases (Hughes effect).

– When a measure such as a Euclidean distance is defined using

many coordinates, there is little difference in the distances

between different pairs of samples problems in nearest

neighbor search and k-nearest neighbor classification

• Dimensionality reduction methods such as Principal Component

Analysis (PCA) aim at reducing high-dimensional data sets to lower-

dimensional data without significant information loss, by identifying

the most informative dimensions relevant subspaces

A statistical method in detail: PCA

• Principal component analysis (PCA) is a statistical procedure that uses an

orthogonal transformation to convert a set of observations of possibly correlated

variables (some redundancy is involved) into a set of values of linearly

uncorrelated variables (no redundancy) called principal components.

• The number of principal components is less than or equal to the number of

original variables the n-dimensional feature space is projected onto a k-

dimensional subspace.

• The transformation is defined in such a way that the first principal component

has the largest possible variance (that is, accounts for as much of the variability

in the data as possible), and each succeeding component in turn has the

highest variance possible under the constraint that it is orthogonal to (i.e.,

uncorrelated with) the preceding components.

• For an image space, the principal components are orthogonal when they are the

eigenvectors of the covariance matrix, which is symmetric.

• In signal processing it is also often referred to as the discrete Karhunen–Loève

transform (KLT).

A statistical method in detail:

PCA

A statistical method in detail:

PCA

A statistical method in detail:

PCA• Sirovich and Kirby where the first to use Principal

Component Analysis (PCA) for face recognition.

• They demonstrated that any particular face can be

efficiently represented along the eigenpictures

coordinate space, and that can be approximately

reconstructed by using just a small collection of

eigenpictures and the corresponding projections

(‘coefficients’) along each eigenpicture.

• PCA Image space for face representation

PCA and Eigenfaces• Starting from Sirovich and Kirby’s approach to

representation, Turk and Pentland realized that

projections along eigenpictures can be used as

classification features to recognize faces.

• They developed a face recognition system that builds

eigenfaces, corresponding to eigenvectors associated

with the dominant eigenvalues of the covariance matrix

built with linearized known faces (patterns).

• The system recognizes particular faces by comparing

their projections along the eigenfaces to those of the

face images of the known individuals.

• The eigenfaces define a feature space that drastically

reduces the dimensionality of the original space, and

face identification is carried out in this reduced space.

• PCA Holistic classification

PCA and Eigenfaces

Example training set

PCA and Eigenfaces

Mean image from the training set

PCA and Eigenfaces

Eigenfaces from the training set

Projection

PCA and Eigenfaces

• Basic idea: perform the search in a subspace k-

dimensional KL subspace which has been built after

faces of the training set.

• Each gallery image is projected onto the KL subspace

the projection coefficients make up the feature vector.

• Each probe image is projected onto the subspace to be

recognized, and classification is based on a simple

nearest neighbor criterion : the face to be recognized is

associated with the face of the gallery to which

corresponds the minimum distance in the subspace

(possibly subject to a threshold).

PCA and Eigenfaces

Example backprojection

PCA: problems and a solution

• PCA is good for representation, but lacks true discriminative

power.

• Eigenfaces separation (scattering) that is maximized depends not

only on inter-class separation (truly useful for classification) but

also on the intra-class differences …

• When there are significant PIE variations, PCA retains them and

therefore the similarity in the space of the faces is not necessarily

determined the identity of individuals the classes may be not

separate correctly.

• To solve the above problems Belhumer et al. (1997) proposed

Fisherfaces, an application of Fisher’s linear discriminant

(FLD), more often mentioned in the context of Linear

Discriminant Analysis (LDA).

11/06/2016 Riconoscimento facciale 63

Vantaggi e Svantaggi

• La fase di identificazione è veloce.

• Se si conservano gli autovettori è possibile ricostruire l’informazione originaria.

• La fase di training è lenta.

• Se si aggiunge un numero consistente di nuovi soggetti è necessario il retraining del sistema.

• Elevata sensibilità a variazioni di illuminazione, posa, occlusioni, …

11/06/2016 Riconoscimento facciale 64

Reti Neurali• Una rete neurale simula il funzionamento dei neuroni nel cervello.

• Ciascun neurone è rappresentato da una funzione matematica basata sul calcolo delle probabilità.

• Per il riconoscimento dei volti, l’ottimo sarebbe utilizzare un neurone per ciascun pixel. Questo approccio richiede troppi neuroni.

• La soluzione consiste nell’utilizzare una rete di neuroni per « riassumere » l’immagine in un vettore più piccolo. Mentre una seconda rete effettua il riconoscimento vero e proprio.

=

=

=

=

=

Input faces LinearizationDimensionality

reductionClassification

network

Output

classes

11/06/2016 Riconoscimento facciale 65

Reti Neurali

11/06/2016 Riconoscimento facciale 66

Vantaggi e Svantaggi

• Riducono l’ambiguità fra soggetti appartenenti a classi simili.

• Con opportuni accorgimenti sono robuste alle occlusioni

• Richiedono più di una immagine per il training

• Alcune reti sono soggette a problemi di:

– Overfitting: la rete ha la stessa dimensione dell’input.

– Overtraining: se le immagini sono grandi il metodo converge lentamente

– Dtabase size: quando il numero di soggetti cresce diventano inefficienti.

11/06/2016 Riconoscimento facciale 67

Sistemi basati su Grafi• Attraverso filtri e funzioni di localizzazione vengono localizzati sul

volto un insieme di punti di riferimento.

• Questi punti vengono collegati da archi pesati e si ottiene un

grafo.

• Ad ogni volto è associato un grafo, per cui confrontare due volti

significa confrontare due grafi.

11/06/2016 Riconoscimento facciale 68

Vantaggi e Svantaggi

• Sono robusti rispetto alle variazioni di posa

• Sono robusti rispetto alle variazioni di illuminazione

• Non richiedono il retraining del sistema

• Il processo di training è lento

• Il processo di testing è molto lento perché richiede il confronto fra grafi (NP-Hard).

11/06/2016 Riconoscimento facciale 69

Termogramma• L’immagine del volto viene acquisita mediante

un sensore termico.

• Il sensore rileva le variazione di temperatura

dell’epidermide del volto.

• L’immagine viene segmentata e indicizzata.

11/06/2016 Riconoscimento facciale 70

Vantaggi e Svantaggi

• Sono robusti rispetto alle variazioni di illuminazione

• Sono robusti rispetto alle variazioni di tempo

• Sono efficienti anche in caso di ambienti esterni

• Richiedono dispositivi di acquisizione costosi

• I dispositivi di acquisizione sono troppo sensibili ai movimenti del soggetto e offrono limitate risoluzioni

• Dipendono dallo stato emotivo del soggetto

• La presenza di un vetro fra il soggetto e il dispositivo rende inefficace l’acquisizione.