API Documentation

core module

The GopPy core classes.

class goppy.core.OnlineGP(kernel, noise_var=0.0, expected_size=None, buffer_factory=<class 'goppy.growable.GrowableArray'>)

Online Gaussian Process.

Provides a Gaussian process to which further data can be added efficiently after the initial training.

Parameters:
kernel : Kernel

Covariance function of the Gaussian process.

noise_var : float, optional

The assumed variance of the noise on the training targets.

expected_size : int, optional

The overall expected number of training samples to be added to the Gaussian process. Setting this parameter can be more efficient as it may avoid memory reallocations.

buffer_factory : function, optional

Function to call to create buffer arrays for data storage.

Examples

>>> import numpy as np
>>> from goppy import OnlineGP, SquaredExponentialKernel
>>> gp = OnlineGP(SquaredExponentialKernel([1.0]), noise_var=0.1)
>>> gp.fit(np.array([[2, 4]]).T, np.array([[3, 1]]).T)
>>> gp.add(np.array([[0]]), np.array([[3]]))
>>> gp.predict(np.array([[1, 3]]).T)
{'mean': array([[ 2.91154709],
       [ 1.82863199]])}
Attributes:
kernel : Kernel

Covariance function of the Gaussian process.

noise_var : float

The assumed variance of the noise on the training targets.

x_train : (N, D) ndarray

The N training data inputs of dimension D. This will be None as long as the Gaussian process has not been trained.

y_train : (N, D) ndarray

The N training data targets of dimension D. This will be None as long as the Gaussian process has not been trained.

inv_chol : (N, N) ndarray

Inverted lower Cholesky factor of the covariance matrix (upper triangular matrix). This will be None as long as the Gaussian process has not been trained.

trained : bool

Indicates whether the Gaussian process has been fitted to some training data.

add(x, y)

Adds additional training data to the Gaussian process and adjusts the fit.

The complexity of this method is in \(O(n \cdot \max(n^2, N^2))\) where

  • \(n\) is the number of data points being added,

  • and \(N\) is the total number of data points added so far.

This is better than re-initializing the Gaussian process, which would have a complexity in \(O((n + N)^3)\) because of the matrix inversion.

See this master’s thesis for more details.

Parameters:
x : (N, D) array-like

The N input data points of dimension D to train on.

y : (N, D) array-like

The N training targets with D independent dimensions.
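
Because add() only updates the existing factorization, it is typically called repeatedly as new samples arrive. The following is a minimal usage sketch with one-dimensional inputs and targets; numeric results are omitted since they depend on the kernel parameters and data:

>>> import numpy as np
>>> from goppy import OnlineGP, SquaredExponentialKernel
>>> gp = OnlineGP(SquaredExponentialKernel([1.0]), noise_var=0.1, expected_size=100)
>>> gp.fit(np.array([[0.0], [1.0]]), np.array([[0.0], [1.0]]))
>>> for x_new, y_new in [(2.0, 4.0), (3.0, 9.0)]:
...     gp.add(np.array([[x_new]]), np.array([[y_new]]))  # updates the fit in place
>>> gp.trained
True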

calc_log_likelihood(what=('value',))

Calculate the log likelihood of the Gaussian process, or its derivatives with respect to the kernel parameters.

Depending on the values included in the what parameter, different values will be calculated:

  • 'value': The log likelihood of the Gaussian process as a scalar.

  • 'derivative': Partial derivatives of the log likelihood with respect to each kernel parameter, as an array. See the params property of the used kernel for the order.

Parameters:
what : set-like, optional

Values to calculate (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding calculated values.
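
For example, assuming the trained gp from the add() sketch above, the likelihood value and its derivatives can be requested in a single call:

>>> res = gp.calc_log_likelihood(what=('value', 'derivative'))
>>> sorted(res.keys())
['derivative', 'value']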

fit(x, y)

Fits the Gaussian process to training data.

Parameters:
x : (N, D) array-like

The N input data points of dimension D to train on.

y : (N, D) array-like

The N training targets with D independent dimensions.

property inv_cov_matrix

Inverted covariance matrix.

Cannot be accessed before the Gaussian process has been trained.

predict(x, what=('mean',))

Predict with the Gaussian process.

Depending on the values included in the what parameter, different predictions will be made and returned in a dictionary res:

  • 'mean': Mean prediction of the Gaussian process of shape (N, D).

  • 'mse': Predictive variance of the Gaussian process of shape (N,).

  • 'derivative': Predicted derivative of the mean. res['derivative'][i, :, j] will correspond to \(\left(\frac{\partial \mu}{\partial x_j}\right) \left(x_i\right)\) with the i-th input data point \(x_i\), and mean function \(\mu(x)\).

  • 'mse_derivative': Predicted derivative of the variance. res['mse_derivative'][i, :] will correspond to \(\left(\frac{d \sigma^2}{d x}\right) \left(x_i\right)\) with the i-th input data point \(x_i\), and variance function \(\sigma^2(x)\).

Parameters:
x : (N, D) array-like

The N data points of dimension D to predict data for.

what : set-like, optional

Types of predictions to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding predictions as values.
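
A short usage sketch, again assuming the trained gp from the add() example above (one-dimensional inputs and targets). The shapes follow from the descriptions above; the numeric values depend on the training data and are omitted:

>>> res = gp.predict(np.array([[0.5], [2.5]]), what=('mean', 'mse'))
>>> res['mean'].shape
(2, 1)
>>> res['mse'].shape
(2,)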

growable module

Provides array types which can be enlarged after creation.

class goppy.growable.GrowableArray(shape, dtype=<class 'float'>, order='C', buffer_shape=None)

An array which can be enlarged after creation.

Though this is not a subclass of numpy.ndarray, it implements the same interface.

Parameters:
shape : int or tuple of int

Initial shape of the created empty array.

dtype : data-type, optional

Desired output data-type.

order : {‘C’, ‘F’}, optional

Whether to store multi-dimensional data in C (row-major) or Fortran (column-major) order in memory.

buffer_shape : int or tuple of int, optional

Initial shape of the buffer to hold the actual data. As long as the array shape stays below the buffer shape, no new memory has to be allocated.

Examples

>>> from goppy.growable import GrowableArray
>>> a = GrowableArray((1, 1))
>>> a[:, :] = 1
>>> print(a)
[[ 1.]]
>>> a.grow_by((1, 2))
>>> a[:, :] = 2
>>> print(a)
[[ 2.  2.  2.]
 [ 2.  2.  2.]]
grow_by(amount)

Grow the array.

Parameters:
amount : int or tuple of int

Amount by which each dimension will be enlarged.
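
For example, pre-allocating a sufficiently large buffer lets the array grow without copying the underlying memory. A small sketch:

>>> from goppy.growable import GrowableArray
>>> a = GrowableArray((1, 1), buffer_shape=(4, 4))
>>> a.grow_by((2, 2))  # stays within the (4, 4) buffer, so no reallocation
>>> a.shape
(3, 3)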

kernel module

Provides kernels for use with Gaussian processes.

class goppy.kernel.ExponentialKernel(lengthscales, variance=1.0)

Bases: Kernel

Exponential kernel.

The exponential kernel is defined as \(k(r_i) = \sigma^2 \exp\left(-\frac{r_i}{l_i}\right)\) with \(r = |\mathtt{x1} - \mathtt{x2}|\), kernel variance \(\sigma^2\) and length scales \(l\).

Parameters:
lengthscales : (D,) array-like

The length scale \(l_i\) for each dimension.

variance : float

The kernel variance \(\sigma^2\).

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
property params

1d-array of kernel parameters.

The first D values are the length scales for each dimension and the last value is the kernel variance.
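
As a usage sketch, the Gram matrix and both kinds of derivatives can be requested in a single full() call. The shapes shown here are inferred from the descriptions above (N = 3 data points, D = 2 dimensions, D + 1 = 3 parameters); numeric values are omitted:

>>> import numpy as np
>>> from goppy.kernel import ExponentialKernel
>>> kernel = ExponentialKernel([1.0, 2.0], variance=0.5)
>>> x1 = np.random.rand(3, 2)
>>> x2 = np.random.rand(3, 2)
>>> res = kernel.full(x1, x2, what=('y', 'derivative', 'param_derivatives'))
>>> res['y'].shape
(3, 3)
>>> res['derivative'].shape
(3, 3, 2)
>>> len(res['param_derivatives'])
3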

class goppy.kernel.Kernel

Bases: object

Abstract base class for kernels.

An instance of this class is callable and instance(x1, x2) will call instance.full(x1, x2).

Attributes:
params : 1d ndarray

Array representation of the kernel parameters.

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
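
The following is a rough, purely illustrative sketch of what a custom kernel could look like under the interface described above. The LinearKernel name and its formula are not part of GopPy, the optional 'derivative' evaluation is omitted, and the actual base class may expect more than is shown here:

import numpy as np
from goppy.kernel import Kernel

class LinearKernel(Kernel):
    """Illustrative kernel k(x1, x2) = variance * <x1, x2> (not part of GopPy)."""

    def __init__(self, variance=1.0):
        self.variance = variance

    @property
    def params(self):
        # Same convention as the standard kernels: parameters as a 1d array.
        return np.array([self.variance])

    def full(self, x1, x2, what=('y',)):
        x1, x2 = np.asarray(x1), np.asarray(x2)
        res = {}
        if 'y' in what:
            res['y'] = self.variance * np.dot(x1, x2.T)  # Gram matrix
        if 'param_derivatives' in what:
            res['param_derivatives'] = [np.dot(x1, x2.T)]  # derivative w.r.t. variance
        # 'derivative' is optional and omitted in this sketch.
        return res

    def diag(self, x1, x2):
        x1, x2 = np.asarray(x1), np.asarray(x2)
        return self.variance * np.sum(x1 * x2, axis=1)
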
class goppy.kernel.Matern32Kernel(lengthscales, variance=1.0)

Bases: Kernel

Matérn 3/2 kernel.

The Matérn kernel with \(\nu = \frac{3}{2}\) is defined as \(k(r_i) = \sigma^2 \left(1 + \frac{r_i \sqrt{3}}{l_i}\right) \exp\left(-\frac{r_i \sqrt{3}}{l_i}\right)\) with \(r = |\mathtt{x1} - \mathtt{x2}|\), kernel variance \(\sigma^2\) and length scales \(l\).

Parameters:
lengthscales : (D,) array-like

The length scale \(l_i\) for each dimension.

variance : float

The kernel variance \(\sigma^2\).

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
property params

1d-array of kernel parameters.

The first D values are the length scales for each dimension and the last value is the kernel variance.

class goppy.kernel.Matern52Kernel(lengthscales, variance=1.0)

Bases: Kernel

Matérn 5/2 kernel.

The Matérn kernel with \(\nu = \frac{5}{2}\) is defined as \(k(r_i) = \sigma^2 \left(1 + \frac{r_i \sqrt{5}}{l_i} + \frac{5 r_i^2}{3 l_i^2}\right) \exp\left(-\frac{r_i \sqrt{5}}{l_i}\right)\) with \(r = |\mathtt{x1} - \mathtt{x2}|\), kernel variance \(\sigma^2\) and length scales \(l\).

Parameters:
lengthscales : (D,) array-like

The length scale \(l_i\) for each dimension.

variance : float

The kernel variance \(\sigma^2\).

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
property params

1d-array of kernel parameters.

The first D values are the length scales for each dimension and the last value is the kernel variance.

class goppy.kernel.SquaredExponentialKernel(lengthscales, variance=1.0)

Bases: Kernel

Squared exponential kernel.

The squared exponential kernel is defined as \(k(r_i) = \sigma^2 \exp\left(-\frac{r_i^2}{2 l_i^2}\right)\) with \(r = |\mathtt{x1} - \mathtt{x2}|\), kernel variance \(\sigma^2\) and length scales \(l\).

Parameters:
lengthscales : (D,) array-like

The length scale \(l_i\) for each dimension.

variance : float

The kernel variance \(\sigma^2\).

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
property params

1d-array of kernel parameters.

The first D values are the length scales for each dimension and the last value is the kernel variance.
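
For example, for a two-dimensional input space the parameter array holds the two length scales followed by the variance (assuming the ordering stated above):

>>> from goppy import SquaredExponentialKernel
>>> SquaredExponentialKernel([1.0, 2.0], variance=0.5).params.tolist()
[1.0, 2.0, 0.5]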