1
Graph Matching and Pseudo-Label Guided Deep
Unsupervised Domain Adaptation
Debasmit Das and C.S. George Lee
Assistive Robotics Technology Laboratory
School of Electrical and Computer Engineering
Purdue University, West Lafayette, IN, USA
Funding Source: National Science Foundation (IIS-1813935)
2
Outline
• INTRODUCTION
• OUR METHOD
• EXPERIMENTAL RESULTS
3
Introduction
Classical Machine Learning Setting
Find a predictor f : X → Y
Training and testing samples come from the same distribution!!
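As a worked equation for this standard setting (a sketch in generic notation, with f the predictor, ℓ a loss function, and P the shared data distribution; none of these symbols are taken from the slides):

\[
f^{*} = \arg\min_{f}\; \mathbb{E}_{(x,y)\sim P}\big[\ell(f(x), y)\big],
\qquad \text{with training and testing samples drawn i.i.d. from the same } P .
\]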
4
Introduction
Classifying a Dog and a Cat
Training Samples vs. Testing Samples
Training and testing distributions are different!!
Domain adaptation is required!!
5
Introduction
Domain Adaptation Methods
Non-Deep Methods
• Instance Re-weighting [Dai et al. ICML’07]
• Parameter Adaptation [Bruzzone et al. TPAMI’10]
• Feature Transformation [Fernando et al. ICCV’13] [Sun et al. AAAI’16]
Deep Methods
• Discrepancy Based [Long et al. ICML’15] [Sun et al. ECCV’16]
• Adversarial Based [Ganin et al. JMLR’16] [Tzeng et al. CVPR’17]
6
Introduction
• Discrepancy-Based Methods
Mostly global metrics: they minimize statistics of the data, such as covariance [Sun et al. ECCV’16] or maximum mean discrepancy [Long et al. ICML’15]
• Local Methods
Optimal transport [Courty et al. TPAMI’17], which is essentially point-to-point matching
Using only first-order information might be misleading. How?
7
Our Method
Answer: use second-order information
8
Our Method
Representing Correspondences
Correspondence Matrix (binary entries: 1 if a source sample corresponds to a target sample, 0 otherwise)
Continuous Relaxation
Matching Matrix (real-valued entries)
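As a sketch of this relaxation in generic graph-matching notation (the symbols C, M, n_s, n_t and the row-sum constraint are assumptions for illustration, not taken from the slides):

\[
C \in \{0,1\}^{n_s \times n_t},\;\; C_{ij} = 1 \text{ iff source sample } i \text{ matches target sample } j
\;\;\longrightarrow\;\;
M \in [0,1]^{n_s \times n_t},\;\; \textstyle\sum_{j} M_{ij} = 1 .
\]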
9
Our Method
First-order Matching (with continuous relaxation)
Second-order Matching (with continuous relaxation)
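The slide's equations are not reproduced here; as a hedged sketch using the relaxed matching matrix M above, first- and second-order matching terms of this kind typically take the form

\[
\mathcal{L}_{1} = \big\| X_s - M X_t \big\|_F^{2},
\qquad
\mathcal{L}_{2} = \big\| A_s M - M A_t \big\|_F^{2},
\]

where X_s, X_t are source and target features and A_s, A_t are within-domain similarity (graph) matrices; all of these symbols are assumptions for illustration.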
10
Our Method
Optimization
With an additional correction term
Add the equality constraints as penalty terms
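A hedged sketch of this penalty formulation, with the row-sum equality constraints on M added as a quadratic penalty (λ and μ are assumed weights; column-sum constraints could be penalized analogously):

\[
\min_{M \ge 0}\;\; \mathcal{L}_{1} + \lambda\,\mathcal{L}_{2} + \mu\,\big\| M\mathbf{1} - \mathbf{1} \big\|_2^{2} .
\]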
11
Our Method
Optimization
Classification Loss
Total Loss
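A hedged sketch of a stage-1 total loss of this form, combining the source classification loss (cross-entropy) with the graph-matching objective above (γ is an assumed trade-off weight):

\[
\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{cls}}(X_s, Y_s) + \gamma\,\mathcal{L}_{\text{GM}},
\qquad
\mathcal{L}_{\text{cls}} = -\frac{1}{n_s}\sum_{i=1}^{n_s}\sum_{k} y_{ik}\,\log p_k\!\left(x_i^{s}\right).
\]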
12
Our Method
Closed-form solution of the mapping
Iterative solution
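For illustration only (the slide does not show the actual mapping), a linear mapping W fitted by regularized least squares to the matched target features admits the familiar closed form below, and alternating between updating W and updating M yields an iterative solution; W and ρ are assumed symbols:

\[
W^{*} = \arg\min_{W}\; \big\| X_s W - M X_t \big\|_F^{2} + \rho\,\| W \|_F^{2}
      = \big( X_s^{\top} X_s + \rho I \big)^{-1} X_s^{\top} M X_t .
\]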
13
Our Method
Second-stage training procedure
• Domain discrepancy has been reduced
• Need to exploit the unlabeled data
• Choose confident unlabeled samples based on a threshold
• ‘Sharpen’ the probabilities of these confident samples (see the sketch below)
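A minimal sketch of this selection-and-sharpening step, assuming softmax outputs on the unlabeled target samples; the threshold of 0.9 and temperature T = 0.5 are illustrative values, not the paper's settings:

```python
import numpy as np

def select_and_sharpen(probs, threshold=0.9, T=0.5):
    """Select confident unlabeled samples and sharpen their predicted probabilities.

    probs: (N, C) array of softmax outputs on unlabeled target samples.
    threshold and T are illustrative values, not the paper's settings.
    """
    confident = probs.max(axis=1) >= threshold         # keep only high-confidence samples
    sharpened = probs[confident] ** (1.0 / T)          # temperature sharpening (T < 1 peaks the distribution)
    sharpened /= sharpened.sum(axis=1, keepdims=True)  # renormalize each row to a valid distribution
    return confident, sharpened
```

The boolean mask `confident` picks the samples used for pseudo-labeling, and `sharpened` provides their soft pseudo-labels.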
14
Our Method
Pseudo-label (PL) Loss
[Illustration: confident samples (Sample 1, Sample 2, Sample 3) assigned to Class 1, Class 2, and Class 3]
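A hedged sketch of a pseudo-label loss of this kind: cross-entropy between the sharpened pseudo-labels and the model's predictions over the confident target set 𝒞 (notation assumed for illustration):

\[
\mathcal{L}_{\text{PL}} = -\frac{1}{|\mathcal{C}|}\sum_{x^{t}\in\mathcal{C}}\sum_{k} \tilde{y}_{k}\!\left(x^{t}\right)\log p_k\!\left(x^{t}\right).
\]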
15
Our Method
Second-stage training procedure
Total Loss
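A hedged sketch of a stage-2 total loss combining the source classification loss with the pseudo-label loss above (β is an assumed trade-off weight; the classification-to-pseudo-label ratio is one of the quantities examined later in the parameter-sensitivity study):

\[
\mathcal{L}_{\text{stage 2}} = \mathcal{L}_{\text{cls}}(X_s, Y_s) + \beta\,\mathcal{L}_{\text{PL}} .
\]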
16
Our Method
Overall architecture: Stage 1 training followed by Stage 2 training
17
Experimental Results
Dataset Settings
Batch Size – 32 source & target samples
Input – DeCAF features [ICML’14]
Feature Extractor – [500, 100] fully connected layers with ReLU
Optimizer – Adam [ICLR’15]
Implementation – TensorFlow on an NVIDIA GeForce GPU
Dataset available at https://github.com/RockySJ/WDGRL
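A minimal TensorFlow/Keras sketch of the feature extractor and classifier described above, assuming 4096-dimensional DeCAF input features and a user-supplied number of classes (both are assumptions; the layer sizes and optimizer follow the slide):

```python
import tensorflow as tf

def build_model(num_classes, input_dim=4096):
    """[500, 100] fully connected ReLU feature extractor with a softmax classifier."""
    inputs = tf.keras.Input(shape=(input_dim,))                  # DeCAF feature vector
    x = tf.keras.layers.Dense(500, activation="relu")(inputs)    # first FC layer
    features = tf.keras.layers.Dense(100, activation="relu")(x)  # second FC layer (adapted features)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# e.g. build_model(num_classes=10) for a hypothetical 10-class task
```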
18
Experimental Results
Comparison with previous work
MMD (Maximum Mean Discrepancy) [JMLR’12], DANN (Domain Adversarial) [JMLR’16], CORAL (Correlation Alignment) [ECCV’16], WDGRL (Wasserstein Distance) [AAAI’18]
19
Experimental Results
Parameter Sensitivity
Ratios varied: second-order term over first-order term; graph-matching loss over classification loss; classification loss over pseudo-label loss
20
Experimental Results
Convergence
21
Experimental Results
Feature Visualization
Panels: Without Adaptation | With Graph Matching | With Graph Matching & Pseudo-labeling
22
Conclusion
• The second-order matching term is important for matching structure in the data
• Refining decision boundaries by self-training on unlabeled data is beneficial
• The performance improvement on image recognition justifies the two-stage training
Future Work
Multiple-source domain generalization
23
THANK YOU
Any Questions?