際際滷

際際滷Share a Scribd company logo
Sinkhorn algorithm and its
applications
PyData Osaka meetup #11

Taku Yoshioka
? Problem: Optimal transport

? Algorithm: Sinkhorn algorithm

? Python implementation: GeomLoss

? Another application: latent permutations
https://dfdazac.github.io/sinkhorn.html
https://arxiv.org/abs/1802.08665
http://marcocuturi.net/SI.html
https://github.com/jeanfeydy/geomloss
? Problem: Optimal transport

? Algorithm: Sinkhorn algorithm

? Python implementation: GeomLoss

? Another application: Jigsaw puzzle
Fitting generative models
Observations
Likelihood of generative model
? Fitting generative model using parameter gradient of
likelihood of observations
Implicit generative model
? Likelihood not available

? Can generate samples
Low-dimensional

random variable
Deterministic transformation

(e.g., Neural network)
High-dimensional

random variable
Typical example of implicit model
Fitting without likelihood function
Observations Samples from generative model

(likelihood not available)
? Fitting generative model without likelihood function (aka
implicit models)

? Alternative to likelihood: distance between these sets of
points (underlying distributions)

? To do this, GAN utilizes classi?er (JS-divergence)
Distance between distributions
Observations Samples from generative model

(likelihood not available)
i
j
Cost matrix
C =
c11 ? c14
? cij ?
c41 ? c44
(e.g., Euclidean distance)
Transport cost
11
2
3
4
2
3
4
Transport cost
1
4
c13 +
1
4
c21 +
1
4
c32 +
1
4
c44
Cost matrix
C =
c11 ? c14
? cij ?
c41 ? c44
(e.g., Euclidean distance)
Transport cost
11
2
3
4
2
3
4
Transport cost
1
4
c11 +
1
4
c23 +
1
4
c34 +
1
4
c42
Minimum transport cost over possible assignment

-> Optimal transport (discrete case)
Cost matrix
C =
c11 ? c14
? cij ?
c41 ? c44
(e.g., Euclidean distance)
Generalization of assignment
11
2
3
4
2
3
4
Coupling matrix
P =
p11 ? p14
? pij ?
p41 ? p44
Transport cost
?C, P? 《
‘
ij
CijPij
Cost matrix
C =
c11 ? c14
? cij ?
c41 ? c44
p13c13
c42p42
‘
j
pij =
1
4
,
‘
i
pij =
1
4 P1 = r, PT
1 = cGeneral form:
p21c21
? Problem: Optimal transport

? Algorithm: Sinkhorn algorithm

? Python implementation: GeomLoss

? Another application: Jigsaw puzzle
Regularized OT problem
d = min
P
?C, P?,
d = min
P
?C, P? ? ?H(P)
Optimal transport
Entropy regularized
P1 = r, PT
1 = c
Regularized OT problem
d = min
P
?C, P?,
d = min
P
?C, P? ? ?H(P)
Optimal transport
Entropy regularized
Solution obtained with Lagrangian (Cuturi 2013, Lemma 2)
P = diag(u) ? K ? diag(v) Kij = exp(?cij/?), u − 0, v − 0
The same form of the solution obtained with 

Sinkhorn algorithm
P1 = r, PT
1 = c
Sinkhorn algorithm
u(k+1)
=
r
Kv(k)
v(k+1)
=
c
KTu(k+1)
pij = diag(u) ? K ? diag(v) Kij = exp(?cij/?), u − 0, v − 0
How to get u and v: repeat applying linear operations until
convergence (stops at a ?nite iteration in practice)
Sinkhorn and Knopp (1967) proposed this iterative
algorithm and analyzed convergence.
Sinkhorn algorithm
u(k+1)
=
r
Kv(k)
v(k+1)
=
c
KTu(k+1)
P = diag(u) ? K ? diag(v) Kij = exp(?cij/?), u − 0, v − 0
Backprop
How to get u and v: repeat applying linear operations until
convergence (stops at a ?nite iteration in practice)
https://arxiv.org/abs/1706.00292
? Problem: Optimal transport

? Algorithm: Sinkhorn algorithm

? Python implementation: GeomLoss

? Another application: Jigsaw puzzle
GeomLoss library (pytorch)
https://github.com/jeanfeydy/geomloss
? Provides the following loss functions:

? Kernel norms (maximum mean discrepancies)

? Hausdor? divergences

? Unbiased Sinkhorn divergences

? Can be used in a wide variety of settings, from shape
analysis (LDDMM, optimal transport´) to machine
learning (kernel methods, GANs´) and image processing.
https://www.kernel-operations.io/geomloss/_auto_examples/comparisons/plot_gradient_?ows_2D.html#sphx-glr-auto-examples-
comparisons-plot-gradient-?ows-2d-py
20191019 sinkhorn
? Problem: Optimal transport

? Algorithm: Sinkhorn algorithm

? Python implementation: GeomLoss

? Another application: Jigsaw puzzle
Latent permutations
https://duvenaud.github.io/learn-discrete/slides/Gumbel-Sinkhorn-際際滷s.pdf
Jigsaw puzzle
Jigsaw puzzle
20191019 sinkhorn
Summary
? Optimal transport for measuring distance of distributions 

? Sinkhorn algorithm is just a sequential application of
linear operation, thus it¨s possible to backprop

? GeomLoss makes it easy to use Sinkhorn algorithm with
PyTorch

? Sinkhorn algorithm can be used to ?nd a good latent
permutation from observation

More Related Content

20191019 sinkhorn