PyTorch API
geomloss
Geometric Loss functions, with full support of PyTorch's autograd engine:

class geomloss.SamplesLoss(loss='sinkhorn', p=2, blur=0.05, reach=None, diameter=None, scaling=0.5, truncate=5, cost=None, kernel=None, cluster_scale=None, debias=True, potentials=False, verbose=False, backend='auto')
Creates a criterion that computes distances between sampled measures on a vector space.
Warning

If loss is "sinkhorn" and reach is None (balanced Optimal Transport), the resulting routine expects measures whose total masses are equal to each other.

Parameters:
loss (string, default="sinkhorn") – The loss function to compute. The supported values are:

- "sinkhorn": (unbiased) Sinkhorn divergence, which interpolates between Wasserstein (blur=0) and kernel (blur=\(+\infty\)) distances.
- "hausdorff": weighted Hausdorff distance, which interpolates between the ICP loss (blur=0) and a kernel distance (blur=\(+\infty\)).
- "energy": Energy Distance MMD, computed using the kernel \(k(x,y) = -\|x - y\|_2\).
- "gaussian": Gaussian MMD, computed using the kernel \(k(x,y) = \exp \big( -\|x - y\|_2^2 \,/\, 2\sigma^2 \big)\) of standard deviation \(\sigma\) = blur.
- "laplacian": Laplacian MMD, computed using the kernel \(k(x,y) = \exp \big( -\|x - y\|_2 \,/\, \sigma \big)\) of standard deviation \(\sigma\) = blur.
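To make the kernel formulas above concrete, here is a hypothetical sketch (not part of geomloss) of the three MMD kernels, written in plain PyTorch for two point clouds x of shape (N, D) and y of shape (M, D):

```python
# Sketch of the three MMD kernels listed above, in plain PyTorch.
# These are illustrative re-implementations, not geomloss internals.
import torch

def pairwise_dist(x, y):
    # (N, M) matrix of Euclidean distances ||x_i - y_j||_2.
    return torch.cdist(x, y, p=2)

def energy_kernel(x, y):
    # k(x, y) = -||x - y||_2  (Energy Distance MMD).
    return -pairwise_dist(x, y)

def gaussian_kernel(x, y, blur=0.05):
    # k(x, y) = exp(-||x - y||_2^2 / (2 * sigma^2)), with sigma = blur.
    return torch.exp(-pairwise_dist(x, y) ** 2 / (2 * blur ** 2))

def laplacian_kernel(x, y, blur=0.05):
    # k(x, y) = exp(-||x - y||_2 / sigma), with sigma = blur.
    return torch.exp(-pairwise_dist(x, y) / blur)
```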
p (int, default=2) – If loss is "sinkhorn" or "hausdorff", specifies the ground cost function between points. The supported values are:

- p = 1: \(C(x,y) = \|x - y\|_2\).
- p = 2: \(C(x,y) = \tfrac{1}{2}\|x - y\|_2^2\).
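As an illustration (an assumption on my part, not geomloss source code), the two ground costs can be written in plain PyTorch for point clouds x of shape (N, D) and y of shape (M, D):

```python
# Illustrative versions of the two supported ground costs.
import torch

def cost_p1(x, y):
    # p = 1:  C(x, y) = ||x - y||_2
    return torch.cdist(x, y, p=2)

def cost_p2(x, y):
    # p = 2:  C(x, y) = (1/2) * ||x - y||_2^2
    return torch.cdist(x, y, p=2) ** 2 / 2
```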
blur (float, default=.05) – The finest level of detail that should be handled by the loss function, in order to prevent overfitting on the samples' locations.

- If loss is "gaussian" or "laplacian", it is the standard deviation \(\sigma\) of the convolution kernel.
- If loss is "sinkhorn" or "hausdorff", it is the typical scale \(\sigma\) associated to the temperature \(\varepsilon = \sigma^p\). The default value of .05 is sensible for input measures that lie in the unit square/cube.

Note that the Energy Distance is scale-equivariant, and won't be affected by this parameter.
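A small numerical illustration of the blur-to-temperature relation \(\varepsilon = \sigma^p\) stated above (hypothetical helper name, for exposition only):

```python
# epsilon = sigma ** p, where sigma is the blur value.
# 'temperature' is an illustrative helper, not a geomloss function.
def temperature(blur, p):
    return blur ** p

# With the default blur = 0.05:
eps_p1 = temperature(0.05, 1)  # sigma itself, 0.05
eps_p2 = temperature(0.05, 2)  # roughly 0.0025
```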
reach (float, default=None = \(+\infty\)) – If loss is "sinkhorn" or "hausdorff", specifies the typical scale \(\tau\) associated to the constraint strength \(\rho = \tau^p\).

diameter (float, default=None) – A rough indication of the maximum distance between points, which is used to tune the \(\varepsilon\)-scaling descent and provide a default heuristic for clustering multiscale schemes. If None, a conservative estimate will be computed on-the-fly.
scaling (float, default=.5) – If loss is "sinkhorn", specifies the ratio between successive values of \(\sigma = \varepsilon^{1/p}\) in the \(\varepsilon\)-scaling descent. This parameter allows you to specify the trade-off between speed (scaling < .4) and accuracy (scaling > .9).

truncate (float, default=None = \(+\infty\)) – If backend is "multiscale", specifies the effective support of a Gaussian/Laplacian kernel as a multiple of its standard deviation. If truncate is not None, kernel truncation steps will assume that \(\exp(-x/\sigma)\) or \(\exp(-x^2/2\sigma^2)\) are zero when \(x > \text{truncate} \cdot \sigma\).

cost (function or string, default=None) –
If loss is "sinkhorn" or "hausdorff", specifies the cost function that should be used instead of \(\tfrac{1}{p}\|x - y\|^p\):

- If backend is "tensorized", cost should be a python function that takes as input a (B,N,D) torch Tensor x and a (B,M,D) torch Tensor y, and returns a batched cost matrix as a (B,N,M) Tensor.
- Otherwise, if backend is "online" or "multiscale", cost should be a KeOps formula, given as a string, with variables X and Y. The default values are "Norm2(X-Y)" (for p = 1) and "(SqDist(X,Y) / IntCst(2))" (for p = 2).
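For the "tensorized" backend, a custom cost is just a batched function with the signature described above. Here is a hypothetical example (the function name is mine) that reproduces the default p = 2 cost:

```python
# A user-defined cost for the "tensorized" backend: maps x (B,N,D) and
# y (B,M,D) to a (B,N,M) cost matrix. This example mirrors the default
# p = 2 cost, (1/2)||x - y||^2; the name is illustrative.
import torch

def squared_distance_cost(x, y):
    # torch.cdist broadcasts over the leading batch dimension.
    return torch.cdist(x, y, p=2) ** 2 / 2
```

Under the stated assumptions, such a function would be passed as `SamplesLoss("sinkhorn", cost=squared_distance_cost, backend="tensorized")`.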
cluster_scale (float, default=None) – If backend is "multiscale", specifies the coarse scale at which cluster centroids will be computed. If None, a conservative estimate will be computed from diameter and the ambient space's dimension, making sure that memory overflows won't take place.

debias (bool, default=True) – If loss is "sinkhorn", specifies if we should compute the unbiased Sinkhorn divergence instead of the classic, entropy-regularized "SoftAssign" loss.

potentials (bool, default=False) – When this parameter is set to True, the SamplesLoss layer returns a pair of optimal dual potentials \(F\) and \(G\), sampled on the input measures, instead of a differentiable scalar value. These dual vectors \((F(x_i))\) and \((G(y_j))\) are encoded as Torch tensors, with the same shape as the input weights \((\alpha_i)\) and \((\beta_j)\).

verbose (bool, default=False) – If backend is "multiscale", specifies whether information on the clustering and \(\varepsilon\)-scaling descent should be displayed in the standard output.

backend (string, default="auto") – The implementation that will be used in the background; this choice has a major impact on performance. The supported values are:
- "auto": Choose automatically, using a simple heuristic based on the inputs' shapes.
- "tensorized": Relies on a full cost/kernel matrix, computed once and for all and stored on the device memory. This method is fast, but has a quadratic memory footprint and does not scale beyond ~5,000 samples per measure.
- "online": Computes cost/kernel values on-the-fly, leveraging online map-reduce CUDA routines provided by the pykeops library.
- "multiscale": Fast implementation that scales to millions of samples in dimensions 1-2-3, relying on the block-sparse reductions provided by the pykeops library.