KernelSolve
This section contains the full API documentation for the NumPy version of the conjugate gradient solver for Generic linear systems:
Summary
Creates a new conjugate gradient solver. 

Instantiate a new KernelSolve operation. 

To apply the routine on arbitrary NumPy arrays. 
Syntax
 class pykeops.numpy.KernelSolve[source]
Creates a new conjugate gradient solver.
Supporting the same generic syntax as
numpy.Genred
, this module allows you to solve generic optimization problems of the form:\[\begin{split}& & a^{\star} & =\operatorname*{argmin}_a \tfrac 1 2 \langle a,( \alpha \operatorname{Id}+K_{xx}) a\rangle  \langle a,b \rangle, \\\\ &\text{i.e.}\quad & a^{\star} & = (\alpha \operatorname{Id} + K_{xx})^{1} b,\end{split}\]where \(K_{xx}\) is a symmetric, positive definite linear operator and \(\alpha\) is a nonnegative regularization parameter.
Example
>>> formula = "Exp(Norm2(x  y)) * a" # Exponential kernel >>> aliases = ["x = Vi(3)", # 1st input: target points, one dim3 vector per line ... "y = Vj(3)", # 2nd input: source points, one dim3 vector per column ... "a = Vj(2)"] # 3rd input: source signal, one dim2 vector per column >>> K = Genred(formula, aliases, axis = 1) # Reduce formula along the lines of the kernel matrix >>> K_inv = KernelSolve(formula, aliases, "a", # The formula above is linear wrt. 'a' ... axis = 1) >>> # Generate some random data: >>> x = np.random.randn(10000, 3) # Sampling locations >>> b = np.random.randn(10000, 2) # Random observed signal >>> a = K_inv(x, x, b, alpha = .1) # Linear solve: a_i = (.1*Id + K(x,x)) \ b >>> print(a.shape) (10000, 2) >>> # Mean squared error: >>> print( ( np.sum( np.sqrt( ( .1 * a + K(x,x,a)  b)**2 ) ) / len(x) ).item() ) 1.5619025770417854e06
 __init__(formula, aliases, varinvalias, axis=0, dtype=None, opt_arg=None, dtype_acc='auto', use_double_acc=False, sum_scheme='auto', enable_chunks=True, rec_multVar_highdim=None, use_fast_math=True)[source]
Instantiate a new KernelSolve operation.
Note
KernelSolve
relies on C++ or CUDA kernels that are compiled onthefly and stored in a cache directory as shared libraries (“.so” files) for later use. Parameters:
formula (string) – The scalar or vectorvalued expression that should be computed and reduced. The correct syntax is described in the documentation, using appropriate mathematical operations.
aliases (list of strings) –
A list of identifiers of the form
"AL = TYPE(DIM)"
that specify the categories and dimensions of the input variables. Here:AL
is an alphanumerical alias, used in the formula.TYPE
is a category. One of:Vi
: indexation by \(i\) along axis 0.Vj
: indexation by \(j\) along axis 1.Pm
: no indexation, the input tensor is a vector and not a 2d array.
DIM
is an integer, the dimension of the current variable.
As described below,
__call__()
will expect input arrays whose shape are compatible with aliases.varinvalias (string) – The alphanumerical alias of the variable with respect to which we shall perform our conjugate gradient descent. formula is supposed to be linear with respect to varinvalias, but may be more sophisticated than a mere
"K(x,y) * {varinvalias}"
.
 Keyword Arguments:
axis (int, default = 0) –
Specifies the dimension of the kernel matrix \(K_{x_ix_j}\) that is reduced by our routine. The supported values are:
axis = 0: reduction with respect to \(i\), outputs a
Vj
or “\(j\)” variable.axis = 1: reduction with respect to \(j\), outputs a
Vi
or “\(i\)” variable.
dtype_acc (string, default
"auto"
) –type for accumulator of reduction, before casting to dtype. It improves the accuracy of results in case of large sized data, but is slower. Default value “auto” will set this option to the value of dtype. The supported values are:
dtype_acc =
"float16"
: allowed only if dtype is “float16”.dtype_acc =
"float32"
: allowed only if dtype is “float16” or “float32”.dtype_acc =
"float64"
: allowed only if dtype is “float32” or “float64”..
use_double_acc (bool, default False) – same as setting dtype_acc=”float64” (only one of the two options can be set) If True, accumulate results of reduction in float64 variables, before casting to float32. This can only be set to True when data is in float32 or float64. It improves the accuracy of results in case of large sized data, but is slower.
sum_scheme (string, default
"auto"
) –method used to sum up results for reductions. Default value “auto” will set this option to “block_red”. Possible values are:
sum_scheme =
"direct_sum"
: direct summationsum_scheme =
"block_sum"
: use an intermediate accumulator in each block before accumulating in the output. This improves accuracy for large sized data.sum_scheme =
"kahan_scheme"
: use Kahan summation algorithm to compensate for roundoff errors. This improves accuracy for large sized data.
enable_chunks (bool, default True) – enable automatic selection of special “chunked” computation mode for accelerating reductions with formulas involving large dimension variables.
use_fast_math (bool, default True) – enables use_fast_math Cuda option
 __call__(*args, backend='auto', device_id=1, alpha=1e10, eps=1e06, ranges=None)[source]
To apply the routine on arbitrary NumPy arrays.
Warning
Even for variables of size 1 (e.g. \(a_i\in\mathbb{R}\) for \(i\in[0,M)\)), KeOps expects inputs to be formatted as 2d arrays of size
(M,dim)
. In practice,a.view(1,1)
should be used to turn a vector of weights into a list of scalar values. Parameters:
*args (2d arrays (variables
Vi(..)
,Vj(..)
) and 1d arrays (parametersPm(..)
)) –The input numerical arrays, which should all have the same
dtype
, be contiguous and be stored on the same device. KeOps expects one array per alias, with the following compatibility rules:All
Vi(Dim_k)
variables are encoded as 2darrays withDim_k
columns and the same number of lines \(M\).All
Vj(Dim_k)
variables are encoded as 2darrays withDim_k
columns and the same number of lines \(N\).All
Pm(Dim_k)
variables are encoded as 1darrays (vectors) of sizeDim_k
.
 Keyword Arguments:
alpha (float, default = 1e10) – Nonnegative ridge regularization parameter, added to the diagonal of the Kernel matrix \(K_{xx}\).
backend (string) – Specifies the mapreduce scheme, as detailed in the documentation of the
numpy.Genred
module.device_id (int, default=1) – Specifies the GPU that should be used to perform the computation; a negative value lets your system choose the default GPU. This parameter is only useful if your system has access to several GPUs.
ranges (6uple of IntTensors, None by default) –
Ranges of integers that specify a blocksparse reduction scheme with Mc clusters along axis 0 and Nc clusters along axis 1, as detailed in the documentation of the
numpy.Genred
module.If None (default), we simply use a dense Kernel matrix as we loop over all indices \(i\in[0,M)\) and \(j\in[0,N)\).
 Returns:
The solution of the optimization problem, which is always a 2darray with \(M\) or \(N\) lines (if axis = 1 or axis = 0, respectively) and a number of columns that is inferred from the formula.
 Return type:
(M,D) or (N,D) array