API Documentation

core module

The GopPy core classes.

class goppy.core.OnlineGP(kernel, noise_var=0.0, expected_size=None, buffer_factory=<class 'goppy.growable.GrowableArray'>)

Online Gaussian Process.

Provides a Gaussian process to which further data can be added efficiently after the initial training.

Parameters:
kernel : Kernel

Covariance function of the Gaussian process.

noise_var : float, optional

The assumed variance of the noise on the training targets.

expected_size : int, optional

The overall expected number of training samples to be added to the Gaussian process. Setting this parameter can be more efficient as it may avoid memory reallocations.

buffer_factory : function, optional

Function to call to create buffer arrays for data storage.

Examples

>>> import numpy as np
>>> from goppy import OnlineGP, SquaredExponentialKernel
>>> gp = OnlineGP(SquaredExponentialKernel([1.0]), noise_var=0.1)
>>> gp.fit(np.array([[2, 4]]).T, np.array([[3, 1]]).T)
>>> gp.add(np.array([[0]]), np.array([[3]]))
>>> gp.predict(np.array([[1, 3]]).T)
{'mean': array([[ 2.91154709],
       [ 1.82863199]])}
Attributes:
kernel : Kernel

Covariance function of the Gaussian process.

noise_var : float

The assumed variance of the noise on the training targets.

x_train : (N, D) ndarray

The N training data inputs of dimension D. This will be None as long as the Gaussian process has not been trained.

y_train : (N, D) ndarray

The N training data targets of dimension D. This will be None as long as the Gaussian process has not been trained.

inv_chol : (N, N) ndarray

Inverted lower Cholesky factor of the covariance matrix (upper triangular matrix). This will be None as long as the Gaussian process has not been trained.

trained : bool

Indicates whether the Gaussian process has been fitted to some training data.

add(x, y)

Adds additional training data to the Gaussian process and adjusts the fit.

The complexity of this method is in \(O(n \cdot \max(n^2, N^2))\) where

  • \(n\) is the number of data points being added,

  • and \(N\) is the total number of data points added so far.

This is better than re-initializing the Gaussian process, which would have a complexity in \(O((n + N)^3)\) because of the matrix inversion.

See this master’s thesis for more details.

Parameters:
x : (N, D) array-like

The N input data points of dimension D to train on.

y : (N, D) array-like

The N training targets with D independent dimensions.
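
Because add() only updates the existing factorization, it is typically called repeatedly as new samples arrive. The following is a minimal usage sketch with one-dimensional inputs and targets; numeric results are omitted since they depend on the kernel parameters and data:

>>> import numpy as np
>>> from goppy import OnlineGP, SquaredExponentialKernel
>>> gp = OnlineGP(SquaredExponentialKernel([1.0]), noise_var=0.1, expected_size=100)
>>> gp.fit(np.array([[0.0], [1.0]]), np.array([[0.0], [1.0]]))
>>> for x_new, y_new in [(2.0, 4.0), (3.0, 9.0)]:
...     gp.add(np.array([[x_new]]), np.array([[y_new]]))  # updates the fit in place
>>> gp.trained
True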

calc_log_likelihood(what=('value',))

Calculate the log likelihood of the Gaussian process, or its derivatives with respect to the kernel parameters.

Depending on the values included in the what parameter, different values will be calculated:

  • 'value': The log likelihood of the Gaussian process as a scalar.

  • 'derivative': Partial derivatives of the log likelihood with respect to each kernel parameter, as an array. See the params property of the used kernel for the order.

Parameters:
what : set-like, optional

Values to calculate (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding calculated values.
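
For example, assuming the trained gp from the add() sketch above, the likelihood value and its derivatives can be requested in a single call:

>>> res = gp.calc_log_likelihood(what=('value', 'derivative'))
>>> sorted(res.keys())
['derivative', 'value']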

fit(x, y)

Fits the Gaussian process to training data.

Parameters:
x : (N, D) array-like

The N input data points of dimension D to train on.

y : (N, D) array-like

The N training targets with D independent dimensions.

property inv_cov_matrix

Inverted covariance matrix.

Cannot be accessed before the Gaussian process has been trained.

predict(x, what=('mean',))

Predict with the Gaussian process.

Depending on the values included in the what parameter, different predictions will be made and returned in a dictionary res:

  • 'mean': Mean prediction of the Gaussian process of shape (N, D).

  • 'mse': Predictive variance of the Gaussian process of shape (N,).

  • 'derivative': Predicted derivative of the mean. res['derivative'][i, :, j] will correspond to \(\left(\frac{\partial \mu}{\partial x_j}\right) \left(x_i\right)\) with the i-th input data point \(x_i\), and mean function \(\mu(x)\).

  • 'mse_derivative': Predicted derivative of the variance. res['mse_derivative'][i, :] will correspond to \(\left(\frac{d \sigma^2}{d x}\right) \left(x_i\right)\) with the i-th input data point \(x_i\), and variance function \(\sigma^2(x)\).

Parameters:
x : (N, D) array-like

The N data points of dimension D to predict data for.

what : set-like, optional

Types of predictions to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding predictions as values.
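
A short usage sketch, again assuming the trained gp from the add() example above (one-dimensional inputs and targets). The shapes follow from the descriptions above; the numeric values depend on the training data and are omitted:

>>> res = gp.predict(np.array([[0.5], [2.5]]), what=('mean', 'mse'))
>>> res['mean'].shape
(2, 1)
>>> res['mse'].shape
(2,)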

growable module

Provides array types which can be enlarged after creation.

class goppy.growable.GrowableArray(shape, dtype=<class 'float'>, order='C', buffer_shape=None)

An array which can be enlarged after creation.

Though this is not a subclass of numpy.ndarray, it implements the same interface.

Parameters:
shape : int or tuple of int

Initial shape of the created empty array.

dtype : data-type, optional

Desired output data-type.

order : {‘C’, ‘F’}, optional

Whether to store multi-dimensional data in C (row-major) or Fortran (column-major) order in memory.

buffer_shape : int or tuple of int, optional

Initial shape of the buffer to hold the actual data. As long as the array shape stays below the buffer shape, no new memory has to be allocated.

Examples

>>> from goppy.growable import GrowableArray
>>> a = GrowableArray((1, 1))
>>> a[:, :] = 1
>>> print(a)
[[ 1.]]
>>> a.grow_by((1, 2))
>>> a[:, :] = 2
>>> print(a)
[[ 2.  2.  2.]
 [ 2.  2.  2.]]
grow_by(amount)

Grow the array.

Parameters:
amount : int or tuple of int

Amount by which each dimension will be enlarged.
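
For example, pre-allocating a sufficiently large buffer lets the array grow without copying the underlying memory. A small sketch:

>>> from goppy.growable import GrowableArray
>>> a = GrowableArray((1, 1), buffer_shape=(4, 4))
>>> a.grow_by((2, 2))  # stays within the (4, 4) buffer, so no reallocation
>>> a.shape
(3, 3)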

kernel module

Provides kernels for use with Gaussian processes.

class goppy.kernel.ExponentialKernel(lengthscales, variance=1.0)

Bases: Kernel

Exponential kernel.

The exponential kernel is defined as \(k(r_i) = \sigma^2 \exp\left(-\frac{r_i}{l_i}\right)\) with \(r = |\mathtt{x1} - \mathtt{x2}|\), kernel variance \(\sigma^2\) and length scales \(l\).

Parameters:
lengthscales : (D,) array-like

The length scale \(l_i\) for each dimension.

variance : float

The kernel variance \(\sigma^2\).

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
property params

1d-array of kernel parameters.

The first D values are the length scales for each dimension and the last value is the kernel variance.
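
As a usage sketch, the Gram matrix and both kinds of derivatives can be requested in a single full() call. The shapes shown here are inferred from the descriptions above (N = 3 data points, D = 2 dimensions, D + 1 = 3 parameters); numeric values are omitted:

>>> import numpy as np
>>> from goppy.kernel import ExponentialKernel
>>> kernel = ExponentialKernel([1.0, 2.0], variance=0.5)
>>> x1 = np.random.rand(3, 2)
>>> x2 = np.random.rand(3, 2)
>>> res = kernel.full(x1, x2, what=('y', 'derivative', 'param_derivatives'))
>>> res['y'].shape
(3, 3)
>>> res['derivative'].shape
(3, 3, 2)
>>> len(res['param_derivatives'])
3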

class goppy.kernel.Kernel

Bases: object

Abstract base class for kernels.

An instance of this class is callable and instance(x1, x2) will call instance.full(x1, x2).

Attributes:
params : 1d ndarray

Array representation of the kernel parameters.

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
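
The following is a rough, purely illustrative sketch of what a custom kernel could look like under the interface described above. The LinearKernel name and its formula are not part of GopPy, the optional 'derivative' evaluation is omitted, and the actual base class may expect more than is shown here:

import numpy as np
from goppy.kernel import Kernel

class LinearKernel(Kernel):
    """Illustrative kernel k(x1, x2) = variance * <x1, x2> (not part of GopPy)."""

    def __init__(self, variance=1.0):
        self.variance = variance

    @property
    def params(self):
        # Same convention as the standard kernels: parameters as a 1d array.
        return np.array([self.variance])

    def full(self, x1, x2, what=('y',)):
        x1, x2 = np.asarray(x1), np.asarray(x2)
        res = {}
        if 'y' in what:
            res['y'] = self.variance * np.dot(x1, x2.T)  # Gram matrix
        if 'param_derivatives' in what:
            res['param_derivatives'] = [np.dot(x1, x2.T)]  # derivative w.r.t. variance
        # 'derivative' is optional and omitted in this sketch.
        return res

    def diag(self, x1, x2):
        x1, x2 = np.asarray(x1), np.asarray(x2)
        return self.variance * np.sum(x1 * x2, axis=1)
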
class goppy.kernel.Matern32Kernel(lengthscales, variance=1.0)

Bases: Kernel

Matérn 3/2 kernel.

The Matérn kernel with \(\nu = \frac{3}{2}\) is defined as \(k(r_i) = \sigma^2 \left(1 + \frac{r_i \sqrt{3}}{l_i}\right) \exp\left(-\frac{r_i \sqrt{3}}{l_i}\right)\) with \(r = |\mathtt{x1} - \mathtt{x2}|\), kernel variance \(\sigma^2\) and length scales \(l\).

Parameters:
lengthscales : (D,) array-like

The length scale \(l_i\) for each dimension.

variance : float

The kernel variance \(\sigma^2\).

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
property params

1d-array of kernel parameters.

The first D values are the length scales for each dimension and the last value is the kernel variance.

class goppy.kernel.Matern52Kernel(lengthscales, variance=1.0)

Bases: Kernel

Matérn 5/2 kernel.

The Matérn kernel with \(\nu = \frac{5}{2}\) is defined as \(k(r_i) = \sigma^2 \left(1 + \frac{r_i \sqrt{5}}{l_i} + \frac{5 r_i^2}{3 l_i^2}\right) \exp\left(-\frac{r_i \sqrt{5}}{l_i}\right)\) with \(r = |\mathtt{x1} - \mathtt{x2}|\), kernel variance \(\sigma^2\) and length scales \(l\).

Parameters:
lengthscales : (D,) array-like

The length scale \(l_i\) for each dimension.

variance : float

The kernel variance \(\sigma^2\).

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
property params

1d-array of kernel parameters.

The first D values are the length scales for each dimension and the last value is the kernel variance.

class goppy.kernel.SquaredExponentialKernel(lengthscales, variance=1.0)

Bases: Kernel

Squared exponential kernel.

The squared exponential kernel is defined as \(k(r_i) = \sigma^2 \exp\left(-\frac{r_i^2}{2 l_i^2}\right)\) with \(r = |\mathtt{x1} - \mathtt{x2}|\), kernel variance \(\sigma^2\) and length scales \(l\).

Parameters:
lengthscales : (D,) array-like

The length scale \(l_i\) for each dimension.

variance : float

The kernel variance \(\sigma^2\).

diag(x1, x2)

Evaluate the kernel and return only the diagonal of the Gram matrix.

If only the diagonal is needed, this function may be more efficient than calculating the full Gram matrix with full().

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

Returns:
1d ndarray

The diagonal of the resulting Gram matrix from evaluating the kernels for pairs from x1 and x2.

See also

full
full(x1, x2, what=('y',))

Evaluate the kernel for all pairs of x1 and x2.

Depending on the values included in the what parameter different evaluations will be made and returned as a dictionary res:

  • 'y': Evaluate the kernel for each pair of x1 and x2 resulting in the Gram matrix.

  • 'derivative': Evaluate the partial derivatives. res['derivative'][i, j, :] will correspond to \(\left(\frac{\partial k}{\partial \mathtt{x2}}\right) \left(\mathtt{x1}_i, \mathtt{x2}_j\right)\) with subscripts denoting input data points, and the kernel \(k(\mathtt{x1}, \mathtt{x2})\).

  • 'param_derivatives': Evaluate the partial derivatives of the kernel with respect to its parameters. res['param_derivatives'] will be a list with the \(i\)-th element corresponding to \(\left(\frac{\partial k}{\partial \theta_i}\right) \left(\mathtt{x1}, \mathtt{x2}\right)\) wherein \(\theta_i\) is the \(i\)-th parameter. The order of the parameters is the same as in the params attribute.

An implementation of a kernel is not required to provide the functionality to evaluate 'derivative' and/or 'param_derivatives'. In this case the set of available predictions of a Gaussian process might be limited. All the GopPy standard kernels implement the complete functionality described above.

Parameters:
x1, x2 : (N, D) array-like

The N data points of dimension D to evaluate the kernel for.

what : set-like, optional

Types of evaluations to be made (see above).

Returns:
dict

Dictionary with the elements of what as keys and the corresponding evaluations as values.

See also

diag
property params

1d-array of kernel parameters.

The first D values are the length scales for each dimension and the last value is the kernel variance.
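
For example, for a two-dimensional input space the parameter array holds the two length scales followed by the variance (assuming the ordering stated above):

>>> from goppy import SquaredExponentialKernel
>>> SquaredExponentialKernel([1.0, 2.0], variance=0.5).params.tolist()
[1.0, 2.0, 0.5]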