Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11....

12
IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective sample synthesis method for few-shot object recognition Eli Schwartz*, Leonid Karlinsky*, Joseph Shtok, Sivan Harary, Mattias Marder, Abhishek Kumar, Rogerio Feris, Raja Giryes, Alex M. Bronstein

Transcript of Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11....

Page 1: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

Δ-encoder: An effective sample synthesis method for few-shot object recognitionEli Schwartz*, Leonid Karlinsky*,

Joseph Shtok, Sivan Harary, Mattias Marder, Abhishek Kumar,

Rogerio Feris, Raja Giryes, Alex M. Bronstein

Page 2: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

Who’s that dog?

Page 3: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

• The model is a variant ofan auto-encoder operatingin feature space

Encoder

Decoder

𝑍

Key idea – training

Page 4: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

• The model is a variant ofan auto-encoder operatingin feature space

• The network learns to encode the delta between the reference and the target image

Encoder

Decoder

𝑍

target reference

delta

Key idea – training

Page 5: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

• The model is a variant ofan auto-encoder operatingin feature space

• The network learns to encode the delta between the reference and the target image

• This delta is used to recover the target image as a (non-linear)combination of the reference and the delta

Encoder

Decoder

𝑍

target reference

recovered target

delta

Key idea – training

Page 6: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

Encoder

Decoder

𝑍

Key idea – synthesizing

Page 7: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

• At test time we sampleencoded deltasfrom random training image pairs

Encoder

Decoder

𝑍

sampledtarget

sampled reference

sampled delta

Key idea – synthesizing

Page 8: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

• At test time we sampleencoded deltasfrom random training image pairs

• The sampled deltas are usedto create samples for new classes by combining them with the new class reference examples

• These samples are used to traina classifier for the new category

Encoder

Decoder

𝑍

sampledtarget

sampled reference

sampled delta

new classreference

synthesizednew classexample

Key idea – synthesizing

Page 9: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

miniImageNet: 58.5 (previous SOA) 59.9 (ours)CIFAR-100: 63.4 (previous SOA) 66.7 (ours)Caltech-256: 63.8 (previous SOA) 73.2 (ours)CUB: 69.6 (previous SOA) 69.8 (ours)

50

55

60

65

70

75

minImageNet CIFAR-100 Caltech-256 CUB

one-shot classification benchmarks

ours previous state-of-the-art

Few-shot classification experiments

Page 10: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

miniImageNet: 58.5 (previous SOA) 59.9 (ours)CIFAR-100: 63.4 (previous SOA) 66.7 (ours)Caltech-256: 63.8 (previous SOA) 73.2 (ours)CUB: 69.6 (previous SOA) 69.8 (ours)

50

55

60

65

70

75

minImageNet CIFAR-100 Caltech-256 CUB

one-shot classification benchmarks

ours previous state-of-the-art

Few-shot classification experiments

Page 11: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality

Real vs synthetic examples ablation study

Page 12: Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11. 29. · IBM Research AI | Computer Vision and Augmented Reality Δ-encoder: An effective

IBM Research AI | Computer Vision and Augmented Reality