Math operations

Key files. As detailed earlier, our parsing grammar for symbolic formulas is described in terms of abstract C++ types implemented in the keops/core/formulas/*/*.h headers. These files provide a comprehensive list of mathematical operators and rely on the primitives implemented in the keops/core/pack and keops/core/autodiff subfolders: abstract unary and binary operators, tuples of variables and parameters, variables, gradients.
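
Before diving into the real headers, it may help to see this encoding at work in a deliberately simplified, self-contained toy. The snippet below is not the actual KeOps code: it only mimics the pattern, with a Var leaf and a pointwise Log node whose DIM attribute is computed recursively at compile time, and a naive Eval method standing in for the real Operation machinery.

// Self-contained toy, *not* the real KeOps headers: a symbolic formula is
// encoded as a nested C++ type, and its dimension is a compile-time attribute.
#include <cmath>
#include <iostream>

// Leaf of the syntax tree: a variable of index N and dimension DIM_.
template < int N, int DIM_ >
struct Var {
    static const int DIM = DIM_;
    // Toy evaluation: simply read the variable from the input buffer.
    static void Eval(float* out, const float* in) {
        for(int k = 0; k < DIM; k++) out[k] = in[k];
    }
};

// Internal node: the pointwise logarithm of a sub-formula F.
template < class F >
struct Log {
    static const int DIM = F::DIM;     // pointwise op: same dimension as F
    static void Eval(float* out, const float* in) {
        float buf[F::DIM];
        F::Eval(buf, in);              // evaluate the operand first...
        for(int k = 0; k < DIM; k++)   // ...then apply the logarithm pointwise
            out[k] = std::log(buf[k]);
    }
};

int main() {
    using Formula = Log< Log< Var<0,3> > >;   // "log(log(x))" on a 3D variable
    static_assert(Formula::DIM == 3, "pointwise ops preserve the dimension");
    float x[3] = {10.f, 100.f, 1000.f}, out[Formula::DIM];
    Formula::Eval(out, x);
    std::cout << out[0] << " " << out[1] << " " << out[2] << "\n";
    return 0;
}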

In practice

To give a glimpse of how KeOps works under the hood, let us present a small excerpt from the formulas/maths/ subfolder – the declaration of the Log(...) operator in the Log.h header:

// 1. Declare a new Unary Operation - the pointwise logarithm:
template < class F >
struct Log : UnaryOp<Log,F> {

    // 2. Declare a new attribute: dimension of Log(F) = dimension of F:
    static const int DIM = F::DIM;

    // 3. Utility method: pointwise Logarithm should be displayed as "Log":
    static void PrintIdString(std::stringstream& str) { str << "Log"; }

    // 4. Actual C++ implementation of our operator:
    static HOST_DEVICE INLINE void Operation(__TYPE__ *out, __TYPE__ *outF) {
        for(int k = 0; k < DIM; k++)
            out[k] = log( outF[k] );
    }

    // 5. Define a new alias for the "backward" operator of F...
    template < class V, class GRADIN >
    using DiffTF = typename F::template DiffT<V,GRADIN>;

    // 6. And use it to implement the "backward" of Log(F):
    template < class V, class GRADIN >
    using DiffT = DiffTF<V, Mult< Inv<F>, GRADIN> >;
};

// 7. Define a user-friendly alias "Log(...)" for "Log<...>":
#define Log(f) KeopsNS<Log<decltype(InvKeopsNS(f))>>()

As evidenced here, the implementation of a new operator goes through seven compulsory steps (a self-contained toy example that mirrors the backward mechanism of steps 5 and 6 is sketched just after this list):

  1. The declaration of a new operation as an instance of the abstract UnaryOp or BinaryOp templates. These are defined in the keops/core/autodiff folder with a set of standard methods and attributes. The operand F of Log is an arbitrary formula, recursively encoded as a templated structure.

  2. The specification of a few standard attributes. Here, the dimension of the vector Log(F) – accessed as Log<F>::DIM in typical C++ fashion – is equal to that of F. Our logarithm is applied pointwise and does not affect the shape of the underlying vector.

  3. The specification of some utility methods. Here, the PrintIdString method appends the name of the operator (“Log”) to a string stream: it is used to display, as a readable string, the formula that is encoded within any KeOps C++ template.

  4. The actual implementation of our operator, that is to be executed within the Thread memory of each CUDA core. As specified in the abstract definition of UnaryOp, the inline method Operation takes as input a C++ array outF, the vector-valued output of our operand F. It computes the relevant pointwise logarithms using a standard CUDA-maths routine and stores them in a new out buffer of size Log<F>::DIM. In practice, modern C++ compilers may simplify this operation as an in-place modification of the values stored in outF.

  5. Prepare the chain rule by defining an alias for the adjoint “backward” operator of the operand F with respect to an arbitrary differentiation variable V. As explained in our introduction to backpropagation, the new operator \(\partial_{\texttt{V}} F\) is a formal expression that takes as input the variables “\(x = (p^1,\dots,x^1_i,\dots,y^1_j,\dots)\)” of F and a new vector “\(a\)” of size F::DIM, the gradient vector GRADIN or “\(x_i^*\)” that is backpropagated through the whole computational graph. Understood as the adjoint or “transpose” of the differential of F, the application of this operator is encoded within KeOps as a new templated expression F::DiffT that should implement the computation of \(\partial_\texttt{V} F \cdot \texttt{GRADIN}\).

  6. Implement the chain rule recursively, using the templated expression above: DiffTF = F::DiffT. Here, the C++ declaration:

    \[\begin{aligned} \texttt{Log<F>::DiffT<V,GRADIN> = F::DiffT<V, Mult< Inv<F>, GRADIN> >} \end{aligned}\]

    simply encodes the well-known chain rule for the pointwise logarithm:

    \[\begin{aligned} \partial_{\texttt{V}} \big[ \log \circ F \big] (p, x_i, y_j) \, \cdot \,\texttt{GRADIN} ~=~ \partial_{\texttt{V}} F (p, x_i, y_j) \, \cdot \, \frac{\texttt{GRADIN}}{F(p, x_i, y_j)}~. \end{aligned}\]
  7. Declare a convenient alias for the operation. This arcane formulation relies on classes defined in the keops/core/pre_headers.h header.
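
To make steps 5 and 6 concrete, here is the promised self-contained toy. It is not the actual KeOps code: the real Var type, for instance, distinguishes between the cases “V is this variable” and “V is another variable”, whereas the toy always returns GRADIN. Its only purpose is to show that the chain rule is resolved entirely at compile time, by nesting type aliases: Log<X>::DiffT<X, G> unfolds into Mult< Inv<X>, G >, i.e. the incoming gradient divided by the value of the operand.

// Self-contained toy, *not* the real KeOps headers: the backward operator
// DiffT is built by recursive type substitution, with no run-time cost.
#include <type_traits>

// Leaf: a variable of index N and dimension DIM_. In this toy, its backward
// operator simply returns GRADIN (the real Var also handles the case where
// the differentiation variable V is *not* this variable).
template < int N, int DIM_ >
struct Var {
    static const int DIM = DIM_;
    template < class V, class GRADIN >
    using DiffT = GRADIN;
};

// Formal nodes, with just enough structure for Log's backward pass:
template < class F > struct Inv { static const int DIM = F::DIM; };
template < class A, class B > struct Mult { static const int DIM = B::DIM; };

// Pointwise logarithm, mirroring steps 5 and 6 of Log.h:
template < class F >
struct Log {
    static const int DIM = F::DIM;
    template < class V, class GRADIN >
    using DiffTF = typename F::template DiffT<V,GRADIN>;
    template < class V, class GRADIN >
    using DiffT = DiffTF<V, Mult< Inv<F>, GRADIN > >;
};

// A 3D "seed" playing the role of GRADIN:
struct Seed { static const int DIM = 3; };

using X = Var<0,3>;
// The compiler unfolds Log<X>::DiffT<X, Seed> into Mult< Inv<X>, Seed >,
// i.e. "backpropagate GRADIN / x": exactly the chain rule of step 6.
static_assert(std::is_same< Log<X>::DiffT<X,Seed>, Mult< Inv<X>, Seed > >::value,
              "the backward operator of Log is resolved at compile time");

int main() { return 0; }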

Contributing a new operation

Advanced users may wish to extend the existing engine with home-made operators, injecting their C++ code within the KeOps Map-Reduce kernels. Doing so is now relatively easy: having implemented a custom instance of the UnaryOp or BinaryOp templates in a new keops/core/formulas/*/*.h header, contributors should simply remember to add their file to the list of KeOps includes and write a LazyTensor method in the pykeops/common/lazy_tensor.py module.
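
For concreteness, here is what such a header could look like, modeled line for line on the Log.h excerpt above: a hypothetical pointwise Expm1 operator, computing exp(x) - 1. The operator name, the file location and the use of an existing Exp<F> formula in the backward pass are assumptions made for the sake of the example, not a description of the current code base; the available building blocks should always be checked against the keops/core/formulas sources.

// Hypothetical keops/core/formulas/maths/Expm1.h, modeled on Log.h above:

// 1. Declare a new Unary Operation - the pointwise "exp(x) - 1":
template < class F >
struct Expm1 : UnaryOp<Expm1,F> {

    // 2. Pointwise operation: dimension of Expm1(F) = dimension of F:
    static const int DIM = F::DIM;

    // 3. Pointwise "exp(x) - 1" should be displayed as "Expm1":
    static void PrintIdString(std::stringstream& str) { str << "Expm1"; }

    // 4. Actual implementation, using the standard expm1 maths routine:
    static HOST_DEVICE INLINE void Operation(__TYPE__ *out, __TYPE__ *outF) {
        for(int k = 0; k < DIM; k++)
            out[k] = expm1( outF[k] );
    }

    // 5. Alias for the "backward" operator of the operand F...
    template < class V, class GRADIN >
    using DiffTF = typename F::template DiffT<V,GRADIN>;

    // 6. ...used to backpropagate "exp(F) * GRADIN", since d(e^x - 1)/dx = e^x.
    //    This assumes that a pointwise Exp<F> operator is available:
    template < class V, class GRADIN >
    using DiffT = DiffTF<V, Mult< Exp<F>, GRADIN> >;
};

// 7. Define a user-friendly alias "Expm1(...)" for "Expm1<...>":
#define Expm1(f) KeopsNS<Expm1<decltype(InvKeopsNS(f))>>()

Just as in Log.h, the backward pass is written with symbolic formulas (Mult, Exp) rather than raw C++ code, which is what allows the gradient to be differentiated in turn when higher-order derivatives are requested.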

To get a contribution merged into the main KeOps repository, which is hosted on GitHub, writing a simple unit test in the pykeops/test/ folder and providing an adequate description in the pull request should then be enough.