Geometric Loss functions between sampled measures, images and volumes
---------------------------------------------------------------------


The **GeomLoss** library provides efficient GPU implementations for:

- `Kernel norms <https://en.wikipedia.org/wiki/Reproducing_kernel_Hilbert_space>`_ 
  (also known as `Maximum Mean Discrepancies <http://www.jmlr.org/papers/volume13/gretton12a/gretton12a.pdf>`_).
- `Hausdorff divergences <https://hal.archives-ouvertes.fr/hal-01827184v2>`_, which are
  positive definite generalizations of the
  `Chamfer-ICP <https://en.wikipedia.org/wiki/Iterative_closest_point>`_ loss
  and are
  analogous to **log-likelihoods** of Gaussian Mixture Models.
- `Debiased Sinkhorn divergences <https://arxiv.org/abs/1810.08278>`_, which are
  affordable yet **positive and definite** approximations of 
  `Optimal Transport <https://arxiv.org/abs/1803.00567>`_ 
  (`Wasserstein <https://en.wikipedia.org/wiki/Wasserstein_metric>`_) distances.


It is hosted on `GitHub <https://github.com/jeanfeydy/geomloss>`_
and distributed under the permissive `MIT license <https://en.wikipedia.org/wiki/MIT_License>`_. |br|  
|PyPi version| |Downloads| 


GeomLoss functions
are available through the custom `PyTorch <https://pytorch.org/>`_ layers 
:class:`SamplesLoss <geomloss.SamplesLoss>`, 
:class:`ImagesLoss <geomloss.ImagesLoss>` and 
:class:`VolumesLoss <geomloss.VolumesLoss>`
which allow you to work with weighted **point clouds** (of any dimension),
**density maps** and **volumetric segmentation masks**.
Geometric losses come with three backends each:

- A simple ``tensorized`` implementation, for **small problems** (< 5,000 samples).
- A reference ``online`` implementation, with a **linear** (instead of quadratic)
  **memory footprint**, that can be used for finely sampled measures.
- A very fast ``multiscale`` code, which uses an
  **octree**-like structure for large-scale problems in dimension <= 3.


A typical sample of code looks like:

.. code-block:: python

    import torch
    from geomloss import SamplesLoss  # See also ImagesLoss, VolumesLoss

    # Create some large point clouds in 3D
    x = torch.randn(100000, 3, requires_grad=True).cuda()
    y = torch.randn(200000, 3).cuda()

    # Define a Sinkhorn (~Wasserstein) loss between sampled measures
    loss = SamplesLoss(loss="sinkhorn", p=2, blur=.05)

    L = loss(x, y)  # By default, use constant weights = 1/number of samples
    g_x, = torch.autograd.grad(L, [x])  # GeomLoss fully supports autograd!


GeomLoss is a simple interface for cutting-edge Optimal Transport
algorithms. It provides:

* Support for **batchwise** computations.
* **Linear** (instead of quadratic) **memory footprint** for large problems,
  relying on the `KeOps library <https://www.kernel-operations.io>`_
  for map-reduce operations on the GPU.
* Fast **kernel truncation** for small bandwidths, using an octree-based structure. 
* Log-domain stabilization of the Sinkhorn iterations,
  eliminating numeric **overflows** for small values of :math:`\varepsilon`.
* Efficient computation of the **gradients**, which bypasses the naive
  backpropagation algorithm.
* Support for `unbalanced  <https://link.springer.com/article/10.1007/s00222-017-0759-8>`_ 
  Optimal `Transport <https://arxiv.org/pdf/1506.06430.pdf>`_,
  with a softening of the marginal constraints
  through a maximum **reach** parameter.
* Support for the `ε-scaling heuristic <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.228.9750&rep=rep1&type=pdf>`_ 
  in the Sinkhorn loop, with `kernel truncation <https://arxiv.org/abs/1610.06519>`_
  in dimensions 1, 2 and 3.
  On typical 3D problems, our implementation is **50-100 times faster** than
  the standard `SoftAssign/Sinkhorn algorithm <https://arxiv.org/abs/1306.0895>`_.


Note, however, that :class:`SamplesLoss <geomloss.SamplesLoss>` does *not* implement the 
`Fast Multipole <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.129.7826&rep=rep1&type=pdf>`_ 
or `Fast Gauss <http://users.umiacs.umd.edu/~morariu/figtree/>`_ transforms.
If you are aware of a well-packaged implementation
of these algorithms on the GPU, please contact me!


The divergences implemented here are
all **symmetric**, **positive definite**
and therefore suitable for measure-fitting applications.
For positive input measures :math:`\alpha` and :math:`\beta`,
our :math:`\text{Loss}` functions are such that

.. math::
    \text{Loss}(\alpha,\beta) ~&=~ \text{Loss}(\beta,\alpha), \\
    0~=~\text{Loss}(\alpha,\alpha) ~&\leqslant~ \text{Loss}(\alpha,\beta), \\
    0~=~\text{Loss}(\alpha,\beta)~&\Longleftrightarrow~ \alpha = \beta.

**GeomLoss** can be used in a wide variety of settings, 
from **shape analysis** (LDDMM, optimal transport...)
to **machine learning** (kernel methods, GANs...)
and **image processing**.
Details and examples are provided below:

* :doc:`Maths and algorithms <api/geomloss>`
* :doc:`PyTorch API <api/pytorch-api>`
* `Source code <https://github.com/jeanfeydy/geomloss>`_
* :doc:`Examples <_auto_examples/index>`


Author and Contributors
-------------------------

Feel free to contact us for any **bug report** or **feature request**:

- `Jean Feydy <https://www.jeanfeydy.com>`_
- `Pierre Roussillon <https://proussillon.gitlab.io/en/>`_ (extensions to brain tractograms and normal cycles)

Licensing, academic use
---------------------------

This library is licensed under the permissive `MIT license <https://en.wikipedia.org/wiki/MIT_License>`_,
which is fully compatible with both **academic** and **commercial** applications.
If you use this code in a research paper, **please cite**:

::

    @inproceedings{feydy2019interpolating,
        title={Interpolating between Optimal Transport and MMD using Sinkhorn Divergences},
        author={Feydy, Jean and S{\'e}journ{\'e}, Thibault and Vialard, Fran{\c{c}}ois-Xavier and Amari, Shun-ichi and Trouve, Alain and Peyr{\'e}, Gabriel},
        booktitle={The 22nd International Conference on Artificial Intelligence and Statistics},
        pages={2681--2690},
        year={2019}
    }


Related projects
------------------

You may be interested by:

- The `KeOps library <http://www.kernel-operations.io/>`_, which provides 
  **efficient CUDA routines**
  for point cloud processing, with full `PyTorch <https://pytorch.org/>`_ support.
- Rémi Flamary's 
  `Python Optimal Transport library <https://pot.readthedocs.io/en/stable/>`_,
  which provides a reference implementation of **OT-related methods** for small problems.
- Bernhard Schmitzer's `Optimal Transport toolbox <https://github.com/bernhard-schmitzer/optimal-transport/tree/master/v0.2.0>`_,
  which provides a reference **multiscale solver** for the OT problem, on the CPU.

Table of contents
-----------------

.. toctree::
   :maxdepth: 2

   api/install
   api/geomloss
   api/pytorch-api
   _auto_examples/index


.. |PyPi version| image:: https://img.shields.io/pypi/v/geomloss?color=blue
   :target: https://pypi.org/project/geomloss/
.. |Downloads| image:: https://pepy.tech/badge/geomloss?color=green
   :target: https://pepy.tech/project/geomloss


.. |br| raw:: html

  <br/>