Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11....
Transcript of Δ-encoder: An effective sample synthesis method for few-shot …05-09-45)-05... · 2018. 11....
IBM Research AI | Computer Vision and Augmented Reality
Δ-encoder: An effective sample synthesis method for few-shot object recognitionEli Schwartz*, Leonid Karlinsky*,
Joseph Shtok, Sivan Harary, Mattias Marder, Abhishek Kumar,
Rogerio Feris, Raja Giryes, Alex M. Bronstein
IBM Research AI | Computer Vision and Augmented Reality
Who’s that dog?
IBM Research AI | Computer Vision and Augmented Reality
• The model is a variant ofan auto-encoder operatingin feature space
Encoder
Decoder
𝑍
Key idea – training
IBM Research AI | Computer Vision and Augmented Reality
• The model is a variant ofan auto-encoder operatingin feature space
• The network learns to encode the delta between the reference and the target image
Encoder
Decoder
𝑍
target reference
delta
Key idea – training
IBM Research AI | Computer Vision and Augmented Reality
• The model is a variant ofan auto-encoder operatingin feature space
• The network learns to encode the delta between the reference and the target image
• This delta is used to recover the target image as a (non-linear)combination of the reference and the delta
Encoder
Decoder
𝑍
target reference
recovered target
delta
Key idea – training
IBM Research AI | Computer Vision and Augmented Reality
Encoder
Decoder
𝑍
Key idea – synthesizing
IBM Research AI | Computer Vision and Augmented Reality
• At test time we sampleencoded deltasfrom random training image pairs
Encoder
Decoder
𝑍
sampledtarget
sampled reference
sampled delta
Key idea – synthesizing
IBM Research AI | Computer Vision and Augmented Reality
• At test time we sampleencoded deltasfrom random training image pairs
• The sampled deltas are usedto create samples for new classes by combining them with the new class reference examples
• These samples are used to traina classifier for the new category
Encoder
Decoder
𝑍
sampledtarget
sampled reference
sampled delta
new classreference
synthesizednew classexample
Key idea – synthesizing
IBM Research AI | Computer Vision and Augmented Reality
miniImageNet: 58.5 (previous SOA) 59.9 (ours)CIFAR-100: 63.4 (previous SOA) 66.7 (ours)Caltech-256: 63.8 (previous SOA) 73.2 (ours)CUB: 69.6 (previous SOA) 69.8 (ours)
50
55
60
65
70
75
minImageNet CIFAR-100 Caltech-256 CUB
one-shot classification benchmarks
ours previous state-of-the-art
Few-shot classification experiments
IBM Research AI | Computer Vision and Augmented Reality
miniImageNet: 58.5 (previous SOA) 59.9 (ours)CIFAR-100: 63.4 (previous SOA) 66.7 (ours)Caltech-256: 63.8 (previous SOA) 73.2 (ours)CUB: 69.6 (previous SOA) 69.8 (ours)
50
55
60
65
70
75
minImageNet CIFAR-100 Caltech-256 CUB
one-shot classification benchmarks
ours previous state-of-the-art
Few-shot classification experiments
IBM Research AI | Computer Vision and Augmented Reality
Real vs synthetic examples ablation study
IBM Research AI | Computer Vision and Augmented Reality