
Δ-encoder: An effective sample synthesis method for few-shot object recognition

Eli Schwartz*, Leonid Karlinsky*, Joseph Shtok, Sivan Harary, Mattias Marder, Abhishek Kumar, Rogerio Feris, Raja Giryes, Alex M. Bronstein

IBM Research AI | Computer Vision and Augmented Reality

Who’s that dog?

Key idea – training

• The model is a variant of an auto-encoder operating in feature space.

• The network learns to encode the delta between the reference and the target image.

• This delta is used to recover the target image as a (non-linear) combination of the reference and the delta (a training sketch follows this slide).

[Diagram: encoder–decoder with latent code 𝑍; the encoder maps a (target, reference) feature pair to the delta, and the decoder recovers the target from the reference plus the delta.]
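For concreteness, here is a minimal PyTorch-style sketch of the training phase. It is an illustration under assumptions (the feature dimension, layer widths, delta code size and L1 reconstruction loss are choices made for the sketch), not the authors' released code.

```python
# Minimal PyTorch-style sketch of Δ-encoder training (illustrative; the
# feature / latent sizes and layer widths here are assumptions).
import torch
import torch.nn as nn

FEAT_DIM, DELTA_DIM = 2048, 16   # assumed pre-trained feature size / delta code size

class DeltaEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: sees the target and reference features, emits the delta code Z.
        self.encoder = nn.Sequential(
            nn.Linear(2 * FEAT_DIM, 512), nn.LeakyReLU(),
            nn.Linear(512, DELTA_DIM),
        )
        # Decoder: recovers the target from the reference feature plus the delta.
        self.decoder = nn.Sequential(
            nn.Linear(FEAT_DIM + DELTA_DIM, 512), nn.LeakyReLU(),
            nn.Linear(512, FEAT_DIM),
        )

    def forward(self, target_feat, reference_feat):
        z = self.encoder(torch.cat([target_feat, reference_feat], dim=1))
        recovered = self.decoder(torch.cat([reference_feat, z], dim=1))
        return recovered, z

# One training step: target and reference are features of two images of the
# same (seen) class; reconstructing the target forces Z to carry only the
# intra-class delta between the pair.
model = DeltaEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
recon_loss = nn.L1Loss()

def train_step(target_feat, reference_feat):
    recovered, _ = model(target_feat, reference_feat)
    loss = recon_loss(recovered, target_feat)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Because both branches operate on fixed pre-trained features rather than pixels, the auto-encoder itself stays small and cheap to train.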

Key idea – synthesizing

• At test time we sample encoded deltas from random training image pairs.

• The sampled deltas are used to create samples for new classes by combining them with the new-class reference examples.

• These samples are used to train a classifier for the new category (a synthesis sketch follows this slide).

[Diagram: encoder–decoder with latent code 𝑍; a delta encoded from a sampled (target, reference) pair of a seen class is decoded together with a new-class reference to produce a synthesized new-class example.]
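Continuing the sketch above, the synthesis phase could look as follows. Again a hedged illustration that reuses the DeltaEncoder from the previous sketch: the helper names (synthesize, train_classifier), the number of synthesized samples and the linear softmax classifier are assumptions made for the sketch, not the authors' exact setup.

```python
# Illustrative sketch of the synthesis phase (assumed helper names; `model` is
# a trained DeltaEncoder as defined in the training sketch above).
import torch
import torch.nn as nn

def synthesize(model, seen_targets, seen_refs, new_class_ref, n_samples=1024):
    """seen_targets / seen_refs: [N, FEAT_DIM] same-class feature pairs from the
    training classes; new_class_ref: [FEAT_DIM] feature of the one-shot example."""
    with torch.no_grad():
        idx = torch.randint(len(seen_targets), (n_samples,))
        # Sample deltas by encoding random seen-class (target, reference) pairs ...
        z = model.encoder(torch.cat([seen_targets[idx], seen_refs[idx]], dim=1))
        # ... and decode each delta against the new-class reference, yielding
        # synthetic feature vectors for the new class.
        ref = new_class_ref.unsqueeze(0).expand(n_samples, -1)
        return model.decoder(torch.cat([ref, z], dim=1))

def train_classifier(features, labels, n_classes, epochs=20):
    # Ordinary linear softmax classifier fitted on the synthesized (plus real)
    # features of the novel classes.
    clf = nn.Linear(features.shape[1], n_classes)
    opt = torch.optim.Adam(clf.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = ce(clf(features), labels)
        loss.backward()
        opt.step()
    return clf
```

Since everything happens in the fixed feature space, a modest number of synthesized vectors per novel class is enough to fit this small classifier.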

Few-shot classification experiments

One-shot classification accuracy (%), ours vs. previous state-of-the-art:

Benchmark        Previous SOA    Ours
miniImageNet     58.5            59.9
CIFAR-100        63.4            66.7
Caltech-256      63.8            73.2
CUB              69.6            69.8

[Bar chart: one-shot classification benchmarks (miniImageNet, CIFAR-100, Caltech-256, CUB), ours vs. previous state-of-the-art, accuracy axis 50–75%.]

Real vs synthetic examples ablation study
