5. Modules

5.1. Negentropy approximators

5.1.1. MIT License

Copyright (c) 2023 Ryan Balshaw

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


The negentropy approximation functions for the negentropy-based ICA methods.

Classes:

CubeObject()

An object that implements the first derivative, second derivative and gamma functions of the cube (u**3) function.

ExpObject([a2])

An object that implements the first derivative, second derivative and gamma functions of the exp function.

LogcoshObject([a1])

An object that implements the first derivative, second derivative and gamma functions of the logcosh function.

QuadObject()

An object that implements the first derivative, second derivative and gamma functions of the quad (u**4) function.

Functions:

initialise_sources([source_name, source_params])

A function that takes in the source name and its associated parameters and returns the source instance and the E{G(nu)} value for a set number of samples (100 000 samples).

class spectrally_regularised_lvms.negen_approx.CubeObject

Bases: object

An object that implements the first derivative, second derivative and gamma functions of the cube (u**3) function. These functions are used in the negentropy approximation calculation for negentropy-based ICA.

function(u=float) u^3

Return the function form of G(u) for the cube function

first_derivative(u=float) 3 u^2

Return the function form of the first derivative g(u) for the cube function

second_derivative(u=float) 6 u

Return the function form of the second derivative g’(u) for the cube function

gamma(u=float) 2 / u

Return the function form of ratio g’(u)/g(u) for the cube function
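The four closed forms listed above can be checked against each other with a short NumPy sketch. The function names below are hypothetical stand-ins written for this illustration, not the library's implementation.

```python
import numpy as np

# Hypothetical standalone helpers mirroring the documented closed forms.
def cube_G(u):
    return u ** 3

def cube_g(u):
    return 3 * u ** 2

def cube_g_prime(u):
    return 6 * u

def cube_gamma(u):
    return 2 / u  # undefined at u = 0

u = np.array([-2.0, 0.5, 1.0, 3.0])

# gamma(u) should equal the ratio g'(u) / g(u) away from u = 0.
ratio = cube_g_prime(u) / cube_g(u)
print(np.allclose(cube_gamma(u), ratio))
```

Note that gamma(u) diverges at u = 0, so inputs close to zero need care.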

Methods

first_derivative(u)

This method implements the first derivative of G(.) for g(u)

function(u)

This method implements the functional form of G(u)

gamma(u)

This method implements the ratio of the second derivative to the first derivative (gamma(u) = g'(u) / g(u))

second_derivative(u)

This method implements the second derivative of G(.) for g'(u)

static first_derivative(u: float | ndarray) float | ndarray

This method implements the first derivative of G(.) for g(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through g(u)

Return type:

The computation of g(u)

static function(u: float | ndarray) float | ndarray

This method implements the functional form of G(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through G(u)

Return type:

The computation of G(u)

static gamma(u: float | ndarray) float | ndarray

This method implements the ratio of the second derivative to the first derivative (gamma(u) = g’(u) / g(u))

Parameters:

u (float or np.ndarray) – The input value to be fed through gamma(u)

Return type:

The computation of gamma(u)

static second_derivative(u: float | ndarray) float | ndarray

This method implements the second derivative of G(.) for g’(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through g’(u)

Return type:

The computation of g’(u)

class spectrally_regularised_lvms.negen_approx.ExpObject(a2: float | None = None)

Bases: object

An object that implements the first derivative, second derivative and gamma functions of the exp function. These functions are used in the negentropy approximation calculation for negentropy-based ICA.

function(u=float)

Return the function form of G(u) for the exp function

first_derivative(u=float)

Return the function form of the first derivative g(u) for the exp function

second_derivative(u=float) -> (1 - a2 * u^2) * exp(-a2 / 2 * u^2)

Return the function form of the second derivative g’(u) for the exp function

gamma(u=float) (1 - a2 * u^2) / u

Return the function form of ratio g’(u)/g(u) for the exp function
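As a hedged illustration of how these forms fit together, the sketch below uses standalone NumPy functions with hypothetical names. The expressions for G(u) and g(u) are assumptions inferred from the documented g'(u) and gamma(u), since the source does not list them. Central finite differences are used to confirm that each function is the derivative of the previous one.

```python
import numpy as np

a2 = 1.5  # illustrative value only

def exp_G(u):
    # Assumed form of G(u); chosen so that its derivative is exp_g(u).
    return -np.exp(-a2 * u ** 2 / 2) / a2

def exp_g(u):
    # Assumed form of g(u); consistent with the documented gamma(u).
    return u * np.exp(-a2 * u ** 2 / 2)

def exp_g_prime(u):
    return (1 - a2 * u ** 2) * np.exp(-a2 * u ** 2 / 2)

def exp_gamma(u):
    return (1 - a2 * u ** 2) / u  # undefined at u = 0

u = np.array([-1.5, 0.3, 0.8, 2.0])
h = 1e-6

# Central finite differences of G and g.
fd_g = (exp_G(u + h) - exp_G(u - h)) / (2 * h)
fd_g_prime = (exp_g(u + h) - exp_g(u - h)) / (2 * h)

print(np.allclose(fd_g, exp_g(u), atol=1e-6))
print(np.allclose(fd_g_prime, exp_g_prime(u), atol=1e-6))
```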

Methods

first_derivative(u)

This method implements the first derivative of G(.) for g(u)

function(u)

This method implements the functional form of G(u)

gamma(u)

This method implements the ratio of the second derivative to the first derivative (gamma(u) = g'(u) / g(u))

second_derivative(u)

This method implements the second derivative of G(.) for g'(u)

first_derivative(u: float | ndarray) float | ndarray

This method implements the first derivative of G(.) for g(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through g(u)

Return type:

The computation of g(u)

function(u: float | ndarray) float | ndarray

This method implements the functional form of G(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through G(u)

Return type:

The computation of G(u)

gamma(u: float | ndarray) float | ndarray

This method implements the ratio of the second derivative to the first derivative (gamma(u) = g’(u) / g(u))

Parameters:

u (float or np.ndarray) – The input value to be fed through gamma(u)

Return type:

The computation of gamma(u)

second_derivative(u: float | ndarray) float | ndarray

This method implements the second derivative of G(.) for g’(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through g’(u)

Return type:

The computation of g’(u)

class spectrally_regularised_lvms.negen_approx.LogcoshObject(a1: float | None = None)

Bases: object

An object that implements the first derivative, second derivative and gamma functions of the logcosh function. These functions are used in the negentropy approximation calculation for negentropy-based ICA.

function(u=float)

Return the function form of G(u) for the logcosh function

first_derivative(u=float) tanh

Return the function form of the first derivative g(u) for the logcosh function

second_derivative(u=float)

Return the function form of the second derivative g’(u) for the logcosh function

gamma(u=float)

Return the function form of ratio g’(u)/g(u) for the logcosh function
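The source only states that the first derivative is a tanh. The sketch below therefore assumes the common parameterisation G(u) = log(cosh(a1 * u)) / a1, which gives g(u) = tanh(a1 * u) and g'(u) = a1 * (1 - tanh(a1 * u)^2). The function names are hypothetical and the library may scale these differently.

```python
import numpy as np

a1 = 1.0  # illustrative value only

def logcosh_G(u):
    # log(cosh(x)) = logaddexp(x, -x) - log(2); this avoids overflow in cosh.
    x = a1 * u
    return (np.logaddexp(x, -x) - np.log(2.0)) / a1

def logcosh_g(u):
    return np.tanh(a1 * u)

def logcosh_g_prime(u):
    return a1 * (1 - np.tanh(a1 * u) ** 2)

def logcosh_gamma(u):
    return logcosh_g_prime(u) / logcosh_g(u)  # undefined at u = 0

u = np.array([-2.0, -0.4, 0.7, 1.5])
h = 1e-6

# Central finite difference of G should reproduce tanh(a1 * u).
fd_g = (logcosh_G(u + h) - logcosh_G(u - h)) / (2 * h)
print(np.allclose(fd_g, logcosh_g(u), atol=1e-6))
```

Writing log(cosh(x)) through np.logaddexp keeps G(u) finite for large |u|, where a direct np.log(np.cosh(x)) would overflow.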

Methods

first_derivative(u)

This method implements the first derivative of G(.) for g(u)

function(u)

This method implements the functional form of G(u)

gamma(u)

This method implements the ratio of the second derivative to the first derivative (gamma(u) = g'(u) / g(u))

second_derivative(u)

This method implements the second derivative of G(.) for g'(u)

first_derivative(u: float | ndarray) float | ndarray

This method implements the first derivative of G(.) for g(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through g(u)

Return type:

The computation of g(u)

function(u: float | ndarray) float | ndarray

This method implements the functional form of G(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through G(u)

Return type:

The computation of G(u)

gamma(u: float | ndarray) float | ndarray

This method implements the ratio of the second derivative to the first derivative (gamma(u) = g’(u) / g(u))

Parameters:

u (float or np.ndarray) – The input value to be fed through gamma(u)

Return type:

The computation of gamma(u)

second_derivative(u: float | ndarray) float | ndarray

This method implements the second derivative of G(.) for g’(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through g’(u)

Return type:

The computation of g’(u)

class spectrally_regularised_lvms.negen_approx.QuadObject

Bases: object

An object that implements the first derivative, second derivative and gamma functions of the quad (u**4) function. These functions are used in the negentropy approximation calculation for negentropy-based ICA.

function(u=float) 1/4 u^4

Return the function form of G(u) for the quartic function

first_derivative(u=float) u^3

Return the function form of the first derivative g(u) for the quartic function

second_derivative(u=float) 3 u^2

Return the function form of the second derivative g’(u) for the quartic function

gamma(u=float) 3 / u

Return the function form of ratio g’(u)/g(u) for the quartic function

Methods

first_derivative(u)

This method implements the first derivative of G(.) for g(u)

function(u)

This method implements the functional form of G(u)

gamma(u)

This method implements the ratio of the second derivative to the first derivative (gamma(u) = g'(u) / g(u))

second_derivative(u)

This method implements the second derivative of G(.) for g'(u)

static first_derivative(u: float | ndarray) float | ndarray

This method implements the first derivative of G(.) for g(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through g(u)

Return type:

The computation of g(u)

static function(u: float | ndarray) float | ndarray

This method implements the functional form of G(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through G(u)

Return type:

The computation of G(u)

static gamma(u: float | ndarray) float | ndarray

This method implements the ratio of the second derivative to the first derivative (gamma(u) = g’(u) / g(u))

Parameters:

u (float or np.ndarray) – The input value to be fed through gamma(u)

Return type:

The computation of gamma(u)

static second_derivative(u: float | ndarray) float | ndarray

This method implements the second derivative of G(.) for g’(u)

Parameters:

u (float or np.ndarray) – The input value to be fed through g’(u)

Return type:

The computation of g’(u)

spectrally_regularised_lvms.negen_approx.initialise_sources(source_name: str = 'logcosh', source_params: dict | None = None) tuple[LogcoshObject | ExpObject | QuadObject | CubeObject, float]

A function that takes in the source name and its associated parameters and returns the source instance and the E{G(nu)} value for a set number of samples (100 000 samples).

Parameters:
  • source_name (str) – The name of the source that is to be used.

  • source_params (dict | None (default None)) – The dictionary of parameters for the associated approximator. Format: {“alpha”: alpha_val} where alpha_val is some float.

Returns:

  • source_instance – The source instance that is to be used.

  • source_expectation (float) – The estimate of E{G(nu)} for a set number of samples.
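A minimal sketch of how an E{G(nu)} value could be estimated by sampling is given below. It assumes that nu is a standard Gaussian variable, which is the usual choice in negentropy approximations but is not stated in the source. The helper name, the seed argument and the quartic G are hypothetical choices for illustration.

```python
import numpy as np

def estimate_gaussian_expectation(G, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E{G(nu)} with nu ~ N(0, 1) (assumed)."""
    rng = np.random.default_rng(seed)
    nu = rng.standard_normal(n_samples)
    return float(np.mean(G(nu)))

def quartic_G(u):
    return 0.25 * u ** 4

expectation = estimate_gaussian_expectation(quartic_G)

# For a standard Gaussian, E{nu^4} = 3, so the exact value is 0.75.
print(abs(expectation - 0.75) < 0.05)
```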

5.2. Cost functions

This module defines general cost functions that one can use, built from user-defined analytical functions based on SymPy and NumPy.

Additionally, two specific methods are implemented here:

  • principal component analysis

  • negentropy-based independent component analysis

Classes:

CostClass([use_hessian, verbose, ...])

Base class for different formulations of the user cost function.

ExplicitCost([use_hessian, verbose, ...])

An object that implements the general user cost function class.

NegentropyCost(source_name, source_params[, ...])

This class implements the Negentropy cost that is commonly applied to ICA via negentropy maximisation or kurtosis maximisation.

SymbolicCost([use_hessian, verbose, ...])

This class implements a general user cost function class based off SymPy.

VarianceCost([use_hessian, verbose, ...])

This class implements the PCA variance maximisation objective.

class spectrally_regularised_lvms.cost_functions.CostClass(use_hessian: bool = True, verbose: bool = False, finite_diff_flag: bool = False)

Bases: object

Base class for different formulations of the user cost function. All child classes are expected to have _cost, _cost_gradient, and _cost_hessian instance attributes. These attributes are accessed by the methods of CostClass for ease of use.

cost(X, w, y)

This method accesses the internal self._cost instance attribute and returns its output self._cost(X, w, y).

cost_gradient(X, w, y)

This method accesses the internal self._cost_gradient instance attribute and returns its output self._cost_gradient(X, w, y).

cost_hessian(X, w, y)

This method accesses the internal self._cost_hessian instance attribute and returns its output self._cost_hessian(X, w, y).

finite_difference_grad(X, w, y, step_size)

The method returns the central finite difference approximation to the gradient.

finite_difference_hess(X, w, y, step_size)

The method returns the central finite difference approximation to the Hessian.

check_gradient(X, w, y, step_size)

This method takes in an initial set of variables X, w, y, and a finite difference step size. The function is used to check the self._cost_gradient method using a central finite-difference approach.

check_hessian(X, w, y, step_size)

This method takes in an initial set of variables X, w, y, and a finite difference step size. The function is used to check the self._cost_hessian method using a central finite-difference approach.

Methods

check_gradient(X, w, y[, step_size])

This method checks the self._cost_gradient function to determine whether the gradient implementation is correct based off the objective function.

check_hessian(X, w, y[, step_size])

This method checks the self._cost_hessian function to determine whether the hessian implementation is correct based off the user-defined objective function.

cost(X, w, y)

A method that returns the cost function for the inputs.

cost_gradient(X, w, y)

A method that returns the cost function gradient for the inputs.

cost_hessian(X, w, y)

A method that returns the Hessian function for the inputs.

finite_difference_grad(X, w, y, step_size)

Finite difference gradient approximation (central difference)

finite_difference_hess(X, w, y, step_size)

Finite difference Hessian approximation (central difference)

check_gradient(X: ndarray, w: ndarray, y: ndarray, step_size: float = 0.0001) tuple[ndarray, ndarray, ndarray]

This method checks the self._cost_gradient function to determine whether the gradient implementation is correct based off the objective function.

Parameters:
  • X (ndarray) – An array of size n_samples x n_features.

  • w (ndarray) – A column vector of size n_features x 1.

  • y (ndarray) – A column vector of size n_samples x 1. Expected to be equivalent to X @ w.

  • step_size (float (default = 1e-4)) – The finite difference step size.

Returns:

  • grad_current (ndarray) – The gradient based off the internal self._cost_gradient instance.

  • grad_fd (ndarray) – The finite-difference approximation to the gradient

  • grad_norm (ndarray) – The L2 norm between the analytical gradient and the finite difference approximation.

Note that this is a helper method. CostClass operates as a base class, so you will find that methods such as self.cost() and self.cost_gradient() are accessed but never defined. I use a child class to define these methods.
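The comparison that check_gradient performs can be sketched with plain NumPy. The example is not the library's code: the quartic cost, the helper name central_difference_gradient and the random test data are assumptions made for illustration. It compares an analytical gradient with a central finite-difference approximation and reports the L2 norm of the difference.

```python
import numpy as np

def cost(X, w, y):
    # Illustrative cost: mean(y ** 4) / 4 with y = X @ w.
    return float(np.mean(y ** 4) / 4)

def cost_gradient(X, w, y):
    # Analytical gradient of the illustrative cost, shape (n_features, 1).
    return X.T @ (y ** 3) / X.shape[0]

def central_difference_gradient(cost, X, w, step_size=1e-4):
    grad = np.zeros_like(w)
    for j in range(w.shape[0]):
        e = np.zeros_like(w)
        e[j, 0] = step_size
        # y must be recomputed for each perturbed w.
        f_plus = cost(X, w + e, X @ (w + e))
        f_minus = cost(X, w - e, X @ (w - e))
        grad[j, 0] = (f_plus - f_minus) / (2 * step_size)
    return grad

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
w = rng.standard_normal((3, 1))
y = X @ w

grad_current = cost_gradient(X, w, y)
grad_fd = central_difference_gradient(cost, X, w)
grad_norm = np.linalg.norm(grad_current - grad_fd)
print(grad_norm < 1e-4)
```

A small norm suggests the analytical gradient agrees with the objective. A large one usually points to a sign, scaling or shape error.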

check_hessian(X: ndarray, w: ndarray, y: ndarray, step_size: float = 0.0001) Tuple[ndarray, ndarray, float]

This method checks the self._cost_hessian function to determine whether the hessian implementation is correct based off the user-defined objective function.

Parameters:
  • X (ndarray) – An array of size n_samples x n_features.

  • w (ndarray) – A column vector of size n_features x 1.

  • y (ndarray) – A column vector of size n_samples x 1. Expected to be equivalent to X @ w.

  • step_size (float (default = 1e-4)) – The finite difference step size.

Returns:

  • hess_current (ndarray) – The hessian based off the internal self._cost_hessian instance.

  • hess_check (ndarray) – The finite-difference approximation to the hessian.

  • hess_norm (ndarray) – The L2 norm (average of the row-wise L2 norm) between the analytical hessian and the finite difference approximation.

Note that this is a helper method. CostClass operates as a base class, so you will find that methods such as self.cost() and self.cost_hessian() are accessed but never defined. I use a child class to define these methods.

cost(X: ndarray, w: ndarray, y: ndarray) float

A method that returns the cost function for the inputs.

Parameters:
  • X (ndarray) – The feature matrix of size (n_samples, n_features)

  • w (ndarray) – The transformation vector of size (n_features, 1)

  • y (ndarray) – The transformed variable y = X @ w of size (n_samples, 1)

Return type:

cost function evaluation of (X, w, y)

cost_gradient(X: ndarray, w: ndarray, y: ndarray) ndarray

A method that returns the cost function gradient for the inputs.

Parameters:
  • X (ndarray) – The feature matrix of size (n_samples, n_features)

  • w (ndarray) – The transformation vector of size (n_features, 1)

  • y (ndarray) – The transformed variable y = X @ w of size (n_samples, 1)

Return type:

derivative function evaluation of (X, w, y)

cost_hessian(X: ndarray, w: ndarray, y: ndarray) ndarray

A method that returns the Hessian function for the inputs.

Parameters:
  • X (ndarray) – The feature matrix of size (n_samples, n_features)

  • w (ndarray) – The transformation vector of size (n_features, 1)

  • y (ndarray) – The transformed variable y = X @ w of size (n_samples, 1)

Return type:

Hessian function evaluation of (X, w, y)

finite_difference_grad(X: ndarray, w: ndarray, y: ndarray, step_size: float) ndarray

Finite difference gradient approximation (central difference)

Parameters:
  • X (ndarray) – An array of size n_samples x n_features.

  • w (ndarray) – A column vector of size n_features x 1.

  • y (ndarray) – A column vector of size n_samples x 1. Expected to be equivalent to X @ w.

  • step_size (float) – The finite difference step size.

Returns:

grad_fd – The finite difference approximation to the gradient

Return type:

ndarray

finite_difference_hess(X: ndarray, w: ndarray, y: ndarray, step_size: float) ndarray

Finite difference Hessian approximation (central difference)

Parameters:
  • X (ndarray) – An array of size n_samples x n_features.

  • w (ndarray) – A column vector of size n_features x 1.

  • y (ndarray) – A column vector of size n_samples x 1. Expected to be equivalent to X @ w.

  • step_size (float) – The finite difference step size.

Returns:

hess_fd – The finite difference approximation to the hessian

Return type:

ndarray
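One common way to build a central-difference Hessian is to difference the gradient along each coordinate of w. The sketch below takes that route. Whether the library differences the gradient or the cost itself is not stated in the source, so this is an assumption, as are the quartic cost and the helper names.

```python
import numpy as np

def cost_gradient(X, w):
    # Analytical gradient of the illustrative cost mean((X @ w) ** 4) / 4.
    y = X @ w
    return X.T @ (y ** 3) / X.shape[0]

def cost_hessian(X, w):
    # Analytical Hessian of the same cost, shape (n_features, n_features).
    y = X @ w
    return X.T @ (3 * y ** 2 * X) / X.shape[0]

def central_difference_hessian(grad, X, w, step_size=1e-4):
    n_features = w.shape[0]
    hess = np.zeros((n_features, n_features))
    for j in range(n_features):
        e = np.zeros_like(w)
        e[j, 0] = step_size
        # Column j holds the derivative of the gradient w.r.t. w[j].
        diff = grad(X, w + e) - grad(X, w - e)
        hess[:, j] = diff[:, 0] / (2 * step_size)
    return hess

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
w = rng.standard_normal((3, 1))

hess_current = cost_hessian(X, w)
hess_fd = central_difference_hessian(cost_gradient, X, w)

# Average of the row-wise L2 norms of the difference.
hess_norm = float(np.mean(np.linalg.norm(hess_current - hess_fd, axis=1)))
print(hess_norm < 1e-4)
```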

class spectrally_regularised_lvms.cost_functions.ExplicitCost(use_hessian: bool = True, verbose: bool = False, finite_diff_flag: bool = False)

Bases: CostClass

An object that implements the general user cost function class. This allows the user to manually define their cost function and associated gradient vector and Hessian. Inherits from CostClass.

The user is asked to define their objective function, gradient function, and Hessian function, each of which takes in three inputs: X, w, y. This can be done using the set_ methods that are available to an instance of this class.

Assumed function format from user: func(X, w, y) where X is a ndarray with shape (n_samples, n_features), w is a ndarray with shape (n_features, 1) and y is the linear transformation X @ w with shape (n_samples, 1).

set_cost(cost_func)

This method takes in a cost_func variable and sets it as an internal attribute self._cost.

set_gradient(cost_gradient)

This method takes in a cost_gradient variable and sets it as an internal attribute self._cost_gradient.

set_hessian(cost_hessian)

This method takes in a cost_hessian variable and sets it as an internal attribute self._cost_hessian.

get_cost()

This method returns the internal self._cost attribute.

get_gradient()

This method returns the internal self._cost_gradient attribute.

get_hessian()

This method returns the internal self._cost_hessian attribute.

Methods

check_gradient(X, w, y[, step_size])

This method checks the self._cost_gradient function to determine whether the gradient implementation is correct based off the objective function.

check_hessian(X, w, y[, step_size])

This method checks the self._cost_hessian function to determine whether the hessian implementation is correct based off the user-defined objective function.

cost(X, w, y)

A method that returns the cost function for the inputs.

cost_gradient(X, w, y)

A method that returns the cost function gradient for the inputs.

cost_hessian(X, w, y)

A method that returns the Hessian function for the inputs.

finite_difference_grad(X, w, y, step_size)

Finite difference gradient approximation (central difference)

finite_difference_hess(X, w, y, step_size)

Finite difference Hessian approximation (central difference)

get_cost()

Method to return the cost function to the user.

get_gradient()

Method to return the derivative function to the user.

get_hessian()

A method to return the hessian function to the user.

set_cost(cost_func)

This method allows one to set their cost function.

set_gradient(cost_gradient)

This method allows one to set their gradient vector.

set_hessian(cost_hessian)

This method allows one to set their objective Hessian (optional).

get_cost() Callable[[ndarray, ndarray, ndarray], float] | None

Method to return the cost function to the user.

Return type:

cost_func if attribute exists, else None.

get_gradient() Callable[[ndarray, ndarray, ndarray], ndarray] | None

Method to return the derivative function to the user.

Return type:

cost_gradient if attribute exists, else None.

get_hessian() Callable[[ndarray, ndarray, ndarray], ndarray] | None

A method to return the hessian function to the user.

Return type:

cost_hessian if attribute exists, else None.

set_cost(cost_func: Callable) None

This method allows one to set their cost function.

Parameters:

cost_func (function) – The user's cost function.

Examples

cost_func = lambda X, w, y: -1 * np.mean(y ** 2, axis=0)

set_gradient(cost_gradient: Callable[[ndarray, ndarray, ndarray], ndarray]) None

This method allows one to set their gradient vector.

Parameters:

cost_gradient (function) – The user's gradient vector of the cost function.

Examples

cost_gradient = lambda X, w, y: -2 * np.mean(y * X, axis=0, keepdims=True)

set_hessian(cost_hessian: Callable[[ndarray, ndarray, ndarray], ndarray]) None

This method allows one to set their objective Hessian (optional).

Parameters:

cost_hessian (function) – The user's Hessian of the cost function.

Examples

cost_hessian = lambda X, w, y: -2 / X.shape[0] * (X.T @ X)
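The three example callables can be exercised together on random data to see what shapes they return. The sketch below uses only NumPy. The data sizes and seed are arbitrary assumptions, and the library itself is not imported. With the library installed, the callables would be handed to set_cost, set_gradient and set_hessian on an ExplicitCost instance.

```python
import numpy as np

# The example callables from the set_* methods, written as functions.
def cost_func(X, w, y):
    return -1 * np.mean(y ** 2, axis=0)

def cost_gradient(X, w, y):
    return -2 * np.mean(y * X, axis=0, keepdims=True)

def cost_hessian(X, w, y):
    return -2 / X.shape[0] * (X.T @ X)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))   # (n_samples, n_features)
w = rng.standard_normal((4, 1))     # (n_features, 1)
y = X @ w                           # (n_samples, 1)

cost_val = cost_func(X, w, y)
grad = cost_gradient(X, w, y)
hess = cost_hessian(X, w, y)

print(cost_val.shape, grad.shape, hess.shape)
```

As written, the gradient example returns a row of shape (1, n_features). Whether the library expects a row or a column of shape (n_features, 1) is not stated here, so it is worth confirming with check_gradient.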

class spectrally_regularised_lvms.cost_functions.NegentropyCost(source_name: str, source_params: dict, use_approx: bool = True, use_hessian: bool = True, verbose: bool = False, finite_diff_flag: bool = False)

Bases: CostClass

This class implements the Negentropy cost that is commonly applied to ICA via negentropy maximisation or kurtosis maximisation. Inherits from CostClass.

Assumed function format: func(X, w, y) where X is a ndarray with shape (n_samples, n_features), w is a ndarray with shape (n_features, 1) and y is the linear transformation X @ w with shape (n_samples, 1).

_cost(X, w, y)

This method returns the cost function value for the Negentropy loss.

_cost_gradient(X, w, y)

This method returns the gradient of the cost function for the Negentropy loss.

_cost_hessian(X, w, y)

This method returns the Hessian of the cost function for the Negentropy loss.
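The source does not state the formula behind the negentropy loss. As a rough sketch, the example below assumes the widely used approximation J(y) = (E{G(y)} - E{G(nu)})^2 with nu a standard Gaussian variable. The function names, the logcosh choice of G and the sample sizes are illustrative assumptions, and the library may use a different sign or scaling.

```python
import numpy as np

def logcosh(u):
    # Numerically stable log(cosh(u)).
    return np.logaddexp(u, -u) - np.log(2.0)

def negentropy_approx(y, G, gauss_expectation):
    # Assumed approximation: squared gap between E{G(y)} and E{G(nu)}.
    return float((np.mean(G(y)) - gauss_expectation) ** 2)

rng = np.random.default_rng(0)
n = 100_000

# Reference expectation E{G(nu)} estimated from Gaussian samples.
gauss_expectation = float(np.mean(logcosh(rng.standard_normal(n))))

y_gauss = rng.standard_normal(n)
y_laplace = rng.laplace(scale=1 / np.sqrt(2), size=n)  # unit variance

j_gauss = negentropy_approx(y_gauss, logcosh, gauss_expectation)
j_laplace = negentropy_approx(y_laplace, logcosh, gauss_expectation)

# A non-Gaussian signal should score higher than a Gaussian one.
print(j_laplace > j_gauss)
```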

Methods

check_gradient(X, w, y[, step_size])

This method checks the self._cost_gradient function to determine whether the gradient implementation is correct based off the objective function.

check_hessian(X, w, y[, step_size])

This method checks the self._cost_hessian function to determine whether the hessian implementation is correct based off the user-defined objective function.

cost(X, w, y)

A method that returns the cost function for the inputs.

cost_gradient(X, w, y)

A method that returns the cost function gradient for the inputs.

cost_hessian(X, w, y)

A method that returns the Hessian function for the inputs.

finite_difference_grad(X, w, y, step_size)

Finite difference gradient approximation (central difference)

finite_difference_hess(X, w, y, step_size)

Finite difference Hessian approximation (central difference)

class spectrally_regularised_lvms.cost_functions.SymbolicCost(use_hessian: bool = False, verbose: bool = False, finite_diff_flag: bool = False)

Bases: CostClass

This class implements a general user cost function class based off SymPy. This allows the user to manually define a symbolic representation of their cost function, and the necessary higher-order derivatives are calculated symbolically.

Inherits from CostClass.

The user is asked to define their objective function loss based off of three inputs: z, its indexable variable i, and the number of indices n, i.e. i in [0, n - 1]. In code, z[i] = w^T x_i represents the latent transform of the ith x vector.

This can be done using the set_ methods that are available to an instance of this class.

set_cost(cost_func)

This method takes in the symbolic expression of a users cost function and then stores it as a class instance.

get_symbolic_parameters()

This method gets the indexed random variable z, its index variable i and n, where n is the number of samples z[i], i = 0, …, n - 1. Example cost function: loss = -1/n * sp.Sum((z[i])**2, (i))

implement_cost()

This method converts the symbolic loss/cost function to a numerical form.

implement_first_derivative()

This method converts the symbolic derivative of the loss to a numerical form.

implement_second_derivative()

This method converts the symbolic second derivative (index-wise) to a numerical form.

implement_methods()

This method runs all three implement_* methods in succession.

_cost(X, w, y)

This method returns the cost function value based off the symbolic loss.

_cost_gradient(X, w, y)

This method returns the gradient of the cost function based off the symbolic loss.

_cost_hessian(X, w, y)

This method returns the Hessian of the cost function based off the symbolic loss.

Methods

check_gradient(X, w, y[, step_size])

This method checks the self._cost_gradient function to determine whether the gradient implementation is correct based off the objective function.

check_hessian(X, w, y[, step_size])

This method checks the self._cost_hessian function to determine whether the hessian implementation is correct based off the user-defined objective function.

cost(X, w, y)

A method that returns the cost function for the inputs.

cost_gradient(X, w, y)

A method that returns the cost function gradient for the inputs.

cost_hessian(X, w, y)

A method that returns the Hessian function for the inputs.

finite_difference_grad(X, w, y, step_size)

Finite difference gradient approximation (central difference)

finite_difference_hess(X, w, y, step_size)

Finite difference Hessian approximation (central difference)

get_symbolic_parameters()

implement_cost()

A method that lambdifies the user's cost function.

implement_first_derivative()

A method that symbolically computes the gradient of the user's cost function, and then lambdifies it so that it can be used.

implement_methods()

This method combines the implement_* methods into one call.

implement_second_derivative()

A method that symbolically computes the Hessian of the user's cost function, and then lambdifies it so that it can be used.

set_cost(cost_func)

This method allows one to set their cost function (overwrites default).

get_symbolic_parameters() Tuple[IndexedBase, Idx, Symbol]
Returns:

  • z (sp.IndexedBase instance) – An indexable variable that represents the transformation of the i^th data vector z_i = w.T @ x_i

  • i (sp.Idx instance) – An indexable SymPy matrix of size (n_features, 1)

  • n (sp.Idx instance) – A set of index variables that can be used to iterate over the X and w sp.IndexedBase instances.

implement_cost() None

A method that lambdifies the user’s cost function.

implement_first_derivative() None

A method that symbolically computes the gradient of the user’s cost function, and then lambdifies it so that it can be used.

The call occurs by iterating over the indices of w in (0, n_features - 1), deriving each gradient index using SymPy’s .diff() method and storing each gradient computation in a SymPy matrix.

implement_methods() None

This method combines the implement_* methods into one call. The idea was to provide access to a set lambdification process through one instance call.

Raises:

AttributeError – This is raised if the user’s cost function has not been defined within the instance.

implement_second_derivative() None

A method that symbolically computes the Hessian of the user’s cost function, and then lambdifies it so that it can be used.

The call occurs by iterating over the indices of w in (0, n_features - 1), deriving the gradient vector w.r.t w[i, 0] and storing the Hessian computation in a SymPy matrix.
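The symbolic differentiation and lambdification steps described for implement_first_derivative and implement_second_derivative can be illustrated with a minimal sketch. It is not the package's implementation: the function name build_derivatives, the plain (non-indexed) symbols and the per-sample loss z**4 are all assumptions made for illustration.

```python
import numpy as np
import sympy as sp


def build_derivatives(n_features):
    """Hypothetical helper: symbolic gradient and Hessian, then lambdified."""
    w = sp.symbols(f"w0:{n_features}")  # parameter symbols w0, w1, ...
    x = sp.symbols(f"x0:{n_features}")  # one data vector x0, x1, ...
    z = sum(wi * xi for wi, xi in zip(w, x))  # z = w.T @ x
    loss = z**4  # assumed per-sample loss, chosen only for illustration

    # Gradient: differentiate the loss w.r.t. each index of w in turn.
    grad = sp.Matrix([sp.diff(loss, wi) for wi in w])
    # Hessian: differentiate each gradient entry w.r.t. each index of w.
    hess = sp.Matrix([[sp.diff(g, wj) for wj in w] for g in grad])

    # Lambdify so that the expressions can be evaluated numerically.
    grad_fn = sp.lambdify([w, x], grad, "numpy")
    hess_fn = sp.lambdify([w, x], hess, "numpy")
    return grad_fn, hess_fn
```

For this assumed loss the gradient is 4 * z**3 * x and the Hessian is 12 * z**2 * x @ x.T, which gives a simple way to sanity-check the lambdified functions.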

set_cost(cost_func: Expr) None

This method allows one to set their cost function (overwrites default).

Parameters:

cost_func (Expr) – The user's cost function, defined symbolically.

class spectrally_regularised_lvms.cost_functions.VarianceCost(use_hessian: bool = True, verbose: bool = False, finite_diff_flag: bool = False)

Bases: ExplicitCost

This class implements the PCA variance maximisation objective. It inherits from the ExplicitCost class and simply implements the three necessary components.

Methods

check_gradient(X, w, y[, step_size])

This method checks the self._cost_gradient function to determine whether the gradient implementation is correct based off the objective function.

check_hessian(X, w, y[, step_size])

This method checks the self._cost_hessian function to determine whether the hessian implementation is correct based off the user-defined objective function.

cost(X, w, y)

A method that returns the cost function for the inputs.

cost_gradient(X, w, y)

A method that returns the cost function gradient for the inputs.

cost_hessian(X, w, y)

A method that returns the Hessian function for the inputs.

finite_difference_grad(X, w, y, step_size)

Finite difference gradient approximation (central difference)

finite_difference_hess(X, w, y, step_size)

Finite difference Hessian approximation (central difference)

get_cost()

Method to return the cost function to the user.

get_gradient()

Method to return the derivative function to the user.

get_hessian()

A method to return the hessian function to the user.

set_cost(cost_func)

This method allows one to set their cost function.

set_gradient(cost_gradient)

This method allows one to set their gradient vector.

set_hessian(cost_hessian)

This method allows one to set their objective Hessian (optional).
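The gradient check listed above compares an analytic gradient against a central-difference approximation. The following is a minimal sketch of that idea for a variance objective. The exact sign and scaling used by VarianceCost are not stated here, so the cost -mean((X @ w)**2) and all function names below are assumptions.

```python
import numpy as np


def variance_cost(X, w):
    # Assumed objective: negative sample variance of the projection y = X @ w.
    y = X @ w
    return -float(np.mean(y**2))


def variance_gradient(X, w):
    # Analytic gradient of the assumed objective w.r.t. w (shape n_features x 1).
    return -2.0 * X.T @ (X @ w) / X.shape[0]


def central_difference_grad(cost, X, w, step_size=1e-6):
    # Perturb one entry of w at a time by +/- step_size.
    grad = np.zeros_like(w, dtype=float)
    for i in range(w.shape[0]):
        e = np.zeros_like(w, dtype=float)
        e[i, 0] = step_size
        grad[i, 0] = (cost(X, w + e) - cost(X, w - e)) / (2.0 * step_size)
    return grad
```

A check then amounts to confirming that the two gradients agree to within a tolerance, for example with np.allclose.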

5.3. Spectral regulariser

The spectral regularisation class used for LVM parameter estimation.

Classes:

SpectralObjective(N[, save_hessian_flag, ...])

An object that implements the spectral constraint objective.

class spectrally_regularised_lvms.spectral_regulariser.SpectralObjective(N: int | float, save_hessian_flag: bool = True, inv_hessian_flag: bool = True, verbose: bool = False)

Bases: object

An object that implements the spectral constraint objective.

dftmtx()

A method that returns the unnormalised DFT matrix (i.e. not scaled by 1/sqrt(N)).

decompose_DFT()

A method that decomposes the unnormalised DFT matrix into the real and imaginary matrices R and I.

Hadamard_product(A, x)

A method that computes the Hadamard product for the matrix-vector product v = A @ x. This returns v * v (elementwise product)

Hadamard_derivative(A, x)

A method that computes the Hadamard product derivative w.r.t the x vector. d/dx(v * v) = 2 * diag(A @ x) @ A

check_w_want(w1)

This method checks whether the vector that we compare to (w1) is suitable. It allows for either an Nx1 vector or an MxN matrix.

Xw(Re, Im, w)

This method computes the squared complex modulus of the Fourier vector F(w). The returned vector is normalised by 1/N. It accounts for the shape of w, so that w can be either an Nx1 vector or an MxN matrix of M vectors.

spectral_loss(w, w1)

This method returns the loss function for the spectral constraint. It computes the dot product between the Fourier representation of w and the Fourier representation/representations of the vector/vectors in w1. The resulting shape of the loss is either 1x1 or Mx1 (depending on whether w1 is a vector or a matrix of vectors).

spectral_derivative(w, w1)

This method computes the first derivative (gradient) of the spectral loss w.r.t the w parameters. It returns an Nx1 gradient vector or an NxM matrix of M gradient vectors.

spectral_hessian(w, w1)

This method computes the second derivative (Hessian) of the spectral loss w.r.t the w parameters. It returns either an NxN matrix or an MxNxN array of M Hessians.

Methods

Hadamard_derivative(A, x)

A method that computes the derivative of Hadamard product of v = A @ x w.r.t x.

Hadamard_product(A, x[, vectorised_flag])

A method that computes the Hadamard product of v = A @ x.

Xw(Re, Im, w)

This method computes the squared modulus of the Fourier representation of a vector or matrix w.

check_w_want(w1)

A method that checks the shape of the vector w1.

decompose_DFT()

Splits the complex DFT matrix into its real and imaginary components.

dftmtx()

Computes the DFT matrix.

spectral_derivative(w, w1)

This method computes the first derivative of the spectral constraint loss function w.r.t w.

spectral_hessian(w, w1)

This method computes the second derivative of the spectral constraint loss function w.r.t w.

spectral_loss(w, w1)

This method computes the spectral constraint loss function.

static Hadamard_derivative(A: ndarray, x: ndarray) ndarray

A method that computes the derivative of Hadamard product of v = A @ x w.r.t x.

This method has no requirement for a vectorised_flag variable as it is only to be called on the Fourier representation of the vector w, not the vector/matrix w1 that it is compared to.

Parameters:
  • A (ndarray) – The matrix component of v of ‘float’ type.

  • x (ndarray) – The vector component of v of ‘float’ type. Shape is either Nx1 or MxN

Returns:

The elementwise derivative of v ʘ v w.r.t x.
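The identity d/dx(v * v) = 2 * diag(A @ x) @ A with v = A @ x can be sketched in a few lines of NumPy. The function names below are illustrative stand-ins rather than the class's static methods, and only the Nx1 case for x is covered.

```python
import numpy as np


def hadamard_product(A, x):
    # v = A @ x, then the elementwise (Hadamard) product v * v.
    v = A @ x
    return v * v


def hadamard_derivative(A, x):
    # Jacobian of v * v w.r.t. x: row i is 2 * v_i * A[i, :].
    v = A @ x
    return 2.0 * np.diag(v.ravel()) @ A
```

Because v * v is quadratic in x, a central-difference approximation of the Jacobian matches this expression up to rounding error, which makes it a convenient correctness check.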

static Hadamard_product(A: ndarray, x: ndarray, vectorised_flag: bool = False) ndarray

A method that computes the Hadamard product of v = A @ x.

Parameters:
  • A (ndarray) – The matrix component of v of ‘float’ type.

  • x (ndarray) – The vector component of v of ‘float’ type. Shape is either Nx1 or MxN

  • vectorised_flag (bool) – A flag to specify whether x is a Nx1 vector or a MxN matrix. This is important to control whether v = A @ x (false) or v = x @ A.T (assuming that x is already transposed).

Returns:

The elementwise product of v ʘ v.

Xw(Re: ndarray, Im: ndarray, w: ndarray) ndarray

This method computes the squared modulus of the Fourier representation of a vector or matrix w. Instead of using a vectorised_flag (Hadamard_product is a staticmethod), I can just use the shape of w directly.

Parameters:
  • Re (ndarray) – 2D array of shape NxN containing data of ‘float’ type.

  • Im (ndarray) – 2D array of shape NxN containing data of ‘float’ type.

  • w (ndarray) – 2D array of shape Nx1 or MxN of ‘float’ type.

Returns:

spectral_representation – The squared modulus of the Fourier transform of w. Shape is either Nx1 or MxN (depends on shape of w).

Return type:

ndarray
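A minimal sketch of the squared-modulus computation is given here, assuming the unnormalised DFT matrix F[j, k] = exp(-2*pi*i*j*k/N) split into real and imaginary parts, and the 1/N normalisation mentioned for Xw. The function names are hypothetical, and the shape test on w is one possible way to distinguish the Nx1 and MxN cases.

```python
import numpy as np


def dft_real_imag(N):
    # Unnormalised DFT matrix, split into real and imaginary parts.
    k = np.arange(N)
    F = np.exp(-2j * np.pi * np.outer(k, k) / N)
    return F.real, F.imag


def squared_modulus(Re, Im, w):
    N = Re.shape[0]
    if w.shape == (N, 1):
        # Single vector stored as a column: result has shape Nx1.
        return ((Re @ w) ** 2 + (Im @ w) ** 2) / N
    # M vectors stored as rows (MxN): result has shape MxN.
    return ((w @ Re.T) ** 2 + (w @ Im.T) ** 2) / N
```

The result should agree with np.abs(np.fft.fft(w)) ** 2 / N taken along the axis of length N.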

check_w_want(w1: ndarray) bool

A method that checks the shape of the vector w1.

It is just used as a simple check to ensure that it matches the dimensionality of the problem

Parameters:

w1 (ndarray) – The vector of interest.

Returns:

If w1, in some way, matches the dimensionality, it passes.

Return type:

bool

decompose_DFT() Tuple[ndarray, ndarray]

Splits the complex DFT matrix into its real and imaginary components.

Returns:

  • Re (ndarray) – 2D array of shape NxN containing data of ‘float’ type. The real components of the unnormalised DFT matrix.

  • Im (ndarray) – 2D array of shape NxN containing data of ‘float’ type. The imaginary components of the unnormalised DFT matrix.

dftmtx() ndarray

Computes the DFT matrix.

Returns:

DFT_matrix – A 2D array of shape NxN containing data of ‘complex’ type. This is the unnormalised DFT matrix.

Return type:

ndarray

spectral_derivative(w: ndarray, w1: ndarray) ndarray

This method computes the first derivative of the spectral constraint loss function w.r.t w.

Parameters:
  • w (ndarray) – The vector w that we wish to enforce is unique to the vector/vectors in w1. Expected shape is Nx1.

  • w1 (ndarray) – The vector/vectors that we wish to use to enforce that w is unique. Expected shape is Nx1 or MxN.

Returns:

gradient – The first derivative of the dot product between the spectral representations of the w vector and the vector/vectors in w1. It is either an Nx1 vector (if w1 is a vector) or an NxM matrix (if w1 is an MxN matrix).

Note: I chose to use a NxM matrix for the latter as I know each column represents a gradient vector as the constraint is applied additively, so I can just sum over axis=1 to get a combined gradient vector.

Return type:

ndarray

spectral_hessian(w: ndarray, w1: ndarray) ndarray | Tuple[ndarray, ndarray]

This method computes the second derivative of the spectral constraint loss function w.r.t w.

Parameters:
  • w (ndarray) – The vector w that we wish to enforce is unique to the vector/vectors in w1. Expected shape is Nx1.

  • w1 (ndarray) – The vector/vectors that we wish to use to enforce that w is unique. Expected shape is Nx1 or MxN.

Returns:

hessian (ndarray) – The second derivative of the dot product between the spectral representations of the w vector and the vector/vectors in w1. It is either an NxN matrix (if w1 is a vector) or an MxNxN array (if w1 is an MxN matrix).

If save_hessian_flag is used during initialisation, it will first check to see if a Hessian exists. If inv_hessian_flag is used during initialisation, it will return both the Hessian and its inverse (NOT recommended).

spectral_loss(w: ndarray, w1: ndarray) float | ndarray

This method computes the spectral constraint loss function.

Parameters:
  • w (ndarray) – The vector w that we wish to enforce is unique to the vector/vectors in w1. Expected shape is Nx1.

  • w1 (ndarray) – The vector/vectors that we wish to use to enforce that w is unique. Expected shape is Nx1 or MxN.

Returns:

loss – The dot product between the spectral representations of the w vector and the vector/vectors in w1. It is either a scalar (if w1 is a vector) or an Mx1 vector (if w1 is an MxN matrix).

Return type:

float | ndarray
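The loss and gradient shapes described for this class can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the spectral representation is taken to be the 1/N-normalised squared modulus, w1 is assumed to hold M vectors as rows, and every function name is hypothetical.

```python
import numpy as np


def dft_parts(N):
    # Real and imaginary parts of the unnormalised DFT matrix.
    F = np.fft.fft(np.eye(N))
    return F.real, F.imag


def spectrum(Re, Im, V):
    # Squared modulus (normalised by 1/N) of each column of V (shape N x k).
    return ((Re @ V) ** 2 + (Im @ V) ** 2) / Re.shape[0]


def spectral_loss(Re, Im, w, W1):
    # Dot product between the spectrum of w (Nx1) and each row of W1 (MxN).
    return spectrum(Re, Im, W1.T).T @ spectrum(Re, Im, w)  # shape Mx1


def spectral_gradient(Re, Im, w, W1):
    N = Re.shape[0]
    Rw, Iw = Re @ w, Im @ w
    # Jacobian of spectrum(w) w.r.t. w: (2/N) * (diag(Rw) @ Re + diag(Iw) @ Im).
    jac = (2.0 / N) * (Rw * Re + Iw * Im)
    # One gradient column per vector in W1, giving an NxM matrix.
    return jac.T @ spectrum(Re, Im, W1.T)
```

With this layout, summing the NxM gradient over axis=1 gives a single combined Nx1 gradient when the constraint is applied additively over the vectors in w1.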

5.4. Helper methods

This file defines helper methods for the solver.

Method 1: data_processor

This method implements the pre-processing steps applied to X (centering and/or whitening).

Method 2: batch_sampler

This is a simple batch sampler method for quicker solving.

Method 3: quasi_Newton

This is a method that implements different Hessian approximation strategies.

Method 4: deflation_orthogonalisation

This method implements the Gram-Schmidt orthogonalisation process.

Method 5: Hankel_matrix

This method implements the data hankelisation step.

Classes:

BatchSampler(batch_size[, random_sampler, ...])

This is a simple iterator instance that can be called during runtime using:

DataProcessor([whiten, var_PCA])

A method that processes the data matrices.

DeflationOrthogonalisation()

This method implements the Gram-Schmidt orthogonalisation process and some helper methods.

QuasiNewton(jacobian_update_type[, use_inverse])

This method implements different Hessian approximation strategies and performs the updates on each call.

Functions:

hankel_matrix(signal[, Lw, Lsft])

A method that performs hankelisation for the user.

class spectrally_regularised_lvms.helper_methods.BatchSampler(batch_size: int, random_sampler: bool = True, include_end: bool = False)

Bases: object

This is a simple iterator instance that can be called during runtime using:

    batch_sampler_inst = BatchSampler(batch_size, include_end=True)
    data_sampler = iter(batch_sampler_inst(X_preprocess, iter_idx=0))

and then sampling in a loop, for example:

    for i in range(10):
        Xi = next(data_sampler)

Methods

__call__(data[, iter_idx])

The method used when the sampler instance is called (like a function).
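To make the call-then-iterate pattern above concrete, here is a small stand-in sampler. It is a hypothetical sketch, not the BatchSampler implementation: the class name, the seed argument and the interpretation of include_end (whether a final, shorter batch is yielded) are assumptions, and the iter_idx argument is deliberately left out because its semantics are not described here.

```python
import numpy as np


class SketchBatchSampler:
    """Hypothetical stand-in for an iterator-style batch sampler."""

    def __init__(self, batch_size, random_sampler=True, include_end=False, seed=0):
        self.batch_size = batch_size
        self.random_sampler = random_sampler
        self.include_end = include_end
        self._rng = np.random.default_rng(seed)

    def __call__(self, data):
        # Store the data and decide the order in which rows are visited.
        self._data = data
        n = data.shape[0]
        self._order = self._rng.permutation(n) if self.random_sampler else np.arange(n)
        return self

    def __iter__(self):
        n = len(self._order)
        # Assumed meaning of include_end: keep the final, shorter batch.
        stop = n if self.include_end else n - n % self.batch_size
        for start in range(0, stop, self.batch_size):
            yield self._data[self._order[start:start + self.batch_size]]
```

Usage follows the same shape as the snippet above: call the instance on the data, wrap the result in iter(), and draw batches with next().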

class spectrally_regularised_lvms.helper_methods.DataProcessor(whiten: bool = True, var_PCA: float | None = None)

Bases: object

A method that processes the data matrices.

initialise_preprocessing(X)

A method that initialises all the processing attributes for the pre-processing (centering with whitening). Solves for the whitening transform parameters.

center_data(X)

Centers the columns of X to be zero-mean.

preprocess_data(X)

Transforms the X matrix to the whitened space (if required).

unprocess_data(X)

Transforms an X matrix from the whitened space to the data space.

Methods

center_data(X)

A method that centers the rows of the data matrix X

initialise_preprocessing(X)

A method that initialises the aspects of the data pre-processing stage.

preprocess_data(X)

A method that pre-processes the data matrix X

unprocess_data(X)

A method that un-whitens the whitened data matrix X

center_data(X: ndarray) ndarray

A method that centers the rows of the data matrix X

Parameters:

X (ndarray) – The original feature matrix X.

Returns:

X_centered – Zero-mean feature matrix.

Return type:

ndarray

initialise_preprocessing(X: ndarray) Self

A method that initialises the aspects of the data pre-processing stage.

Parameters:

X (ndarray) – The initialisation matrix from which the pre-processing parameters are obtained.

Return type:

self

preprocess_data(X: ndarray) ndarray

A method that pre-processes the data matrix X

Parameters:

X (ndarray) – The original feature matrix X.

Returns:

X_whitened – The whitened feature matrix.

Return type:

ndarray

unprocess_data(X: ndarray) ndarray

A method that un-whitens the whitened data matrix X

Parameters:

X (ndarray) – The whitened feature matrix

Returns:

X_unwhitened – The un-whitened feature matrix

Return type:

ndarray
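The centering, whitening and un-whitening steps can be sketched with an eigendecomposition of the sample covariance. This is one common way to whiten and is only an assumption about what DataProcessor does internally. The sketch also assumes that samples are stored in the rows of X, and the function name is illustrative.

```python
import numpy as np


def fit_whitener(X):
    # Column means and centred data (samples in rows, features in columns).
    mean = X.mean(axis=0, keepdims=True)
    Xc = X - mean
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Whitening matrix E @ diag(1 / sqrt(lambda)); apply as Xc @ whiten.
    whiten = eigvecs / np.sqrt(eigvals)
    # Inverse map diag(sqrt(lambda)) @ E.T; apply as Z @ unwhiten.
    unwhiten = (eigvecs * np.sqrt(eigvals)).T
    return mean, whiten, unwhiten
```

After whitening, the sample covariance of Z = (X - mean) @ whiten is the identity, and Z @ unwhiten + mean recovers X.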

class spectrally_regularised_lvms.helper_methods.DeflationOrthogonalisation

Bases: object

This method implements the Gram-Schmidt orthogonalisation process and some helper methods.

projection_operator(u, v)

This method computes the projection operator of v onto u.

gram_schmidt_orthogonalisation(w, W, idx)

This method performs GS orthogonalisation of w w.r.t the vectors in W up to the idx position.

global_gso(W)

This method performs GSO sequentially for the rows in W.

Methods

global_gso(W)

A method that orthogonalises a set of Nx1 vectors stored in a W matrix of shape MxN.

gram_schmidt_orthogonalisation(w, W, idx)

projection_operator(u, v)

Calculates projection of v onto u (equivalent to outer product map).

global_gso(W: ndarray) ndarray

A method that orthogonalises a set of Nx1 vectors stored in a W matrix of shape MxN.

Parameters:

W (ndarray) – An MxN array that contains the vectors we want to orthogonalise in the rows (i.e. assumes that W = [w_1, ..., w_M]^T).

Returns:

W_orth – A MxN array of orthogonalised vectors.

Return type:

ndarray

gram_schmidt_orthogonalisation(w: ndarray, W: ndarray, idx: int) ndarray
Parameters:
  • w (ndarray) – A Nx1 array that contains the vector we want to orthogonalise.

  • W (ndarray) – A MxN array that contains the vectors we want to orthogonalise w against.

  • idx (int) – The upper index (cannot be zero) for the rows of W that we want to orthogonalise against.

Returns:

w_orth (ndarray) – An Nx1 array that contains the orthogonalised w vector using the first idx + 1 vectors in W.

Note: I have played around with vectorised versions of this, but it did not offer significant computational improvements.

static projection_operator(u: ndarray, v: ndarray) ndarray

Calculates projection of v onto u (equivalent to outer product map).

Parameters:
  • u (ndarray) – An Nx1 array that we wish to orthogonalise against (remains unchanged)

  • v (ndarray) – An Nx1 array that we wish to orthogonalise (changes)

Returns:

The projection of v onto u

Return type:

ndarray
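The projection operator and the sequential Gram-Schmidt pass over the rows of W can be sketched as follows. The function names are illustrative, the vectors are handled as 1-D rows for brevity, and no normalisation is applied because it is not stated whether global_gso rescales the rows.

```python
import numpy as np


def projection(u, v):
    # Projection of v onto u: (<u, v> / <u, u>) * u.
    return (np.vdot(u, v) / np.vdot(u, u)) * u


def gram_schmidt_rows(W):
    # Orthogonalise the rows of W in order, leaving the first row unchanged.
    W_orth = np.array(W, dtype=float)
    for i in range(1, W_orth.shape[0]):
        for j in range(i):
            # Remove the component of row i that lies along row j.
            W_orth[i] -= projection(W_orth[j], W_orth[i])
    return W_orth
```

After the pass, W_orth @ W_orth.T should be diagonal up to rounding error, since every pair of distinct rows is orthogonal.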

class spectrally_regularised_lvms.helper_methods.QuasiNewton(jacobian_update_type: str, use_inverse: bool = True)

Bases: object

This class implements different Hessian approximation strategies and performs the updates on each call.

The included quasi-Newton methods are:

  • Symmetric rank one (SR1)

  • Davidon-Fletcher-Powell (DFP)

  • Broyden-Fletcher-Goldfarb-Shanno (BFGS)

Each method accepts delta_x and delta_grad at each iteration index. These are the two quantities typically used for quasi-Newton iteration.

Methods

boyden_fletcher_goldfarb_shanno(...)

The BFGS update step

compute_update(gradient_vector)

A method used to compute the parameter update based on the gradient vector at time t.

davidson_fletcher_powell(delta_params_k, ...)

The DFP update step

initialise_jacobian(n_features)

A method that initialises the Jacobian matrix.

symmetric_rank_one(delta_params_k, grad_diff_k)

The SR1 update step

update_jacobian(delta_params_k, grad_diff_k)

A method that updates the jacobian_mat_iter attribute.

boyden_fletcher_goldfarb_shanno(delta_params_k: ndarray, grad_diff_k: ndarray) ndarray

The BFGS update step

Parameters:
  • delta_params_k (ndarray) – The parameter difference (x_{t} - x_{t-1}) vector.

  • grad_diff_k (ndarray) – The gradient difference (df/dx @ x_{t} - df/dx @ x_{t-1}) vector.

Returns:

update_term – The BFGS update step factoring in whether the inverse Hessian or direct Hessian are approximated.

Return type:

ndarray

compute_update(gradient_vector: ndarray) ndarray

A method used to compute the parameter update based on the gradient vector at time t.

Parameters:

gradient_vector (ndarray) – The Nx1 gradient vector

Returns:

update – The quasi-Newton parameter update.

Return type:

ndarray

davidson_fletcher_powell(delta_params_k: ndarray, grad_diff_k: ndarray) ndarray

The DFP update step

Parameters:
  • delta_params_k (ndarray) – The parameter difference (x_{t} - x_{t-1}) vector.

  • grad_diff_k (ndarray) – The gradient difference (df/dx @ x_{t} - df/dx @ x_{t-1}) vector.

Returns:

update_term – The DFP update step factoring in whether the inverse Hessian or direct Hessian are approximated.

Return type:

ndarray

initialise_jacobian(n_features: int) None

A method that initialises the Jacobian matrix. Assumes that

Parameters:

n_features (int) – The dimensionality of the feature space.

Attributes:

self.jacobian_mat_iter (ndarray) – The Hessian used during iteration.

symmetric_rank_one(delta_params_k: ndarray, grad_diff_k: ndarray) ndarray

The SR1 update step

Parameters:
  • delta_params_k (ndarray) – The parameter difference (x_{t} - x_{t-1}) vector.

  • grad_diff_k (ndarray) – The gradient difference (df/dx @ x_{t} - df/dx @ x_{t-1}) vector.

Returns:

update_term – The SR1 update step factoring in whether the inverse Hessian or direct Hessian are approximated.

Return type:

ndarray

update_jacobian(delta_params_k: ndarray, grad_diff_k: ndarray) None

A method that updates the jacobian_mat_iter attribute.

Parameters:
  • delta_params_k (ndarray) – The parameter difference (x_{t} - x_{t-1}) vector.

  • grad_diff_k (ndarray) – The gradient difference (df/dx @ x_{t} - df/dx @ x_{t-1}) vector.

spectrally_regularised_lvms.helper_methods.hankel_matrix(signal: ndarray, Lw: int = 512, Lsft: int = 1) ndarray

A method that performs hankelisation for the user.

Parameters:
  • signal (ndarray) – A (n,) shaped array that contains a time series of measurement values

  • Lw (int) – The window length/signal segment length

  • Lsft (int) – The shift parameter for the sliding window

Returns:

Hmat – A no_of_samples x Lw array of sliding window segments.

Return type:

ndarray
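Hankelisation with a sliding window can be sketched with NumPy index arithmetic. The row count (n - Lw) // Lsft + 1 used here is an assumption about how no_of_samples is defined (only complete windows are kept), and the function name is illustrative.

```python
import numpy as np


def hankel_sketch(signal, Lw=512, Lsft=1):
    # Number of complete windows of length Lw when shifting by Lsft samples.
    n_rows = (signal.shape[0] - Lw) // Lsft + 1
    # Row r holds signal[r * Lsft : r * Lsft + Lw].
    idx = Lsft * np.arange(n_rows)[:, None] + np.arange(Lw)[None, :]
    return signal[idx]
```

For example, a signal of 10 samples with Lw=4 and Lsft=2 yields 4 rows, starting at samples 0, 2, 4 and 6.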

5.5. Spectrally regularised model

This script defines the model parameter estimation process via the LinearModel class.

Classes:

LinearModel(n_sources, cost_instance, Lw, Lsft)

This class encapsulates the parameter estimation step of linear LVMs by defining a class that combines each of the different aspects of this package.

Functions:

initialise_W(n_sources, n_features, init_type)

A method that initialises the W matrix.

initialise_lambda(n_sources)

A method that initialises the lambda terms for lagrange expression.

class spectrally_regularised_lvms.spectrally_regularised_model.LinearModel(n_sources: int, cost_instance: cost_inst, Lw: int, Lsft: int, whiten: bool = True, init_type: str = 'broadband', perform_gso: bool = True, batch_size: int | None = None, var_PCA: bool | None = None, alpha_reg: float = 1.0, sumt_flag: bool = False, sumt_parameters: dict[str, float] | None = None, organise_by_kurt: bool = False, hessian_update_type: str = 'actual', use_ls: bool = True, second_order: bool = True, save_dir: str | None = None, verbose: bool = False)

Bases: object

This class encapsulates the parameter estimation step of linear LVMs by defining a class that combines each of the different aspects of this package.

kurtosis(y)

A method that calculates the kurtosis of a set of samples y.

_function(param_vector, self_inst, W, X, idx)

A static method that calculates the Lagrange expression. It is set up so that it can interface with scipy.optimize.minimize methods.

_gradient(param_vector, self_inst, W, X, idx)

A static method that calculates the gradient of the Lagrange expression. It is set up so that it can interface with scipy.optimize.minimize methods.

_hessian(param_vector, self_inst, W, X, idx)

A static method that calculates the Hessian of the Lagrange expression. It is set up so that it can interface with scipy.optimize.minimize methods.

line_search(delta, gradient, w, ...)

A method that performs a 1D line search on the delta vector to find a step size that satisfies the Armijo condition.

lagrange_function(X, w, y, W, idx, lambda_vector)

A method that calculates the lagrange expression.

lagrange_gradient(X, w, y, W, idx, lambda_vector)

A method that calculates the gradient of the lagrange expression.

lagrange_hessian(X, w, y, W, idx, lambda_vector)

A method that calculates the Hessian of the lagrange expression.

parameter_update(self, X, w, y, W, idx, lambda_vector)

A method that performs a parameter update based on the user's optimisation properties.

spectral_trainer(X, W, n_iters, learning_rate, tol, Lambda, Fs)

This method estimates the model parameters for some X and W.

update_params(w_current, lambda_current, delta_w, delta_lambda, W, idx)

A method which calculates the update to w and lambda based off some global delta Phi vector, normalises w and performs GSO if requested.

spectral_fit(X, W, Lambda, n_iters = 1, learning_rate, tol, Fs)

A method that estimates the model parameters.

fit(self, X, n_iters, learning_rate, tol, Fs)

A method that uses spectral_fit, and allows users to use the sequential unconstrained minimisation technique (SUMT). This was done to make the API call similar to scikit-learn.

transform(X)

A method that transforms X to the latent domain via X @ W. If whitening is enabled, then X represents an unwhitened matrix.

inverse_transform(Z)

A method that transforms samples from the latent domain to the data domain. If whitening is enabled then the recovered matrix represents the standardised data domain.

compute_spectral_W(W)

A static method that computes the spectral representations of the vectors in W.

get_model_parameters()

A method which returns the solution parameters in a model_dict dictionary.

set_model_parameters(model_dict, X)

A method which sets the solution parameters based off the model_dict dictionary and X (X defines the pre-processing steps).
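The 1D line search mentioned in the list above looks for a step size that satisfies the Armijo (sufficient decrease) condition. A minimal backtracking sketch is given below. The function name, the constants c1 and shrink, and the generic callable f are assumptions and do not reflect the actual line_search signature.

```python
import numpy as np


def armijo_backtracking(f, w, grad, delta, alpha0=1.0, c1=1e-4, shrink=0.5, max_iter=50):
    # Directional derivative of f at w along the search direction delta.
    slope = float(grad.ravel() @ delta.ravel())
    f0 = f(w)
    alpha = alpha0
    for _ in range(max_iter):
        # Armijo condition: f(w + alpha * delta) <= f(w) + c1 * alpha * slope.
        if f(w + alpha * delta) <= f0 + c1 * alpha * slope:
            return alpha
        alpha *= shrink  # step too long, so shrink it and try again
    return alpha  # fall back to the last (smallest) step tried
```

The step is halved until the cost decreases by at least a fraction c1 of the decrease predicted by the linear model along delta.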

Methods

compute_spectral_W(W)

This method computes the spectral representation of the vectors in the W matrix.

fit(x_signal[, n_iters, learning_rate, tol, ...])

This method follows the scikit-learn API call and estimates the model parameters based off the user's initialisation choices.

get_hankel(x_signal)

A method that gets the hankel matrix of a signal.

get_model_parameters()

This method gets all the important model parameters and returns them in a dictionary.

inverse_transform(Z[, full_inverse])

This method transforms a latent matrix Z to the data space.

kurtosis(y)

lagrange_function(X, w, y, W, idx, lambda_vector)

This method calculates the lagrangian expression.

lagrange_gradient(X, w, y, W, idx, lambda_vector)

This method calculates the gradient of the lagrangian expression.

lagrange_hessian(X, w, y, W, idx, lambda_vector)

This method calculates the Hessian of the lagrangian expression.

line_search(delta, gradient, w, ...)

Performs a 1D line search on the delta vector to find a step size that satisfies the Armijo condition.

parameter_update(X, w, y, W, idx, lambda_vector)

This method gets an updated estimate of the parameters.

set_model_parameters(x_signal, dict_params)

This method takes the X matrix and a parameter dictionary, initialises the pre-processing components and then creates the necessary class attributes from the dictionary.

spectral_fit(X, W, Lambda[, n_iters, ...])

This method estimates the model parameters for some given X, W, and Lambda.

spectral_trainer(X, W, n_iters, ...)

This method estimates the model parameters for some X and W.

transform(x_signal)

This method transforms a data matrix X to the latent space.

update_params(w_current, lambda_current, ...)

A method that computes the update to the w and lambda parameters, performs GSO if required by the user and ensures that w is a unit vector.

static compute_spectral_W(W: ndarray) → ndarray

This method computes the spectral representation of the vectors in the W matrix.

Parameters:

W (ndarray) – The source vector matrix W.

Returns:

spectral_W – A matrix that contains the spectral magnitude information of the sources.

Return type:

ndarray
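
As a rough illustration of what a spectral representation of W can look like, the sketch below takes the one-sided FFT magnitude of every row of W. It is not the library's implementation: the helper name spectral_magnitude, the use of numpy.fft.rfft, and the absence of any normalisation are all assumptions made for this example.

```python
import numpy as np


def spectral_magnitude(W: np.ndarray) -> np.ndarray:
    """Hypothetical helper: one-sided magnitude spectrum of each row of W."""
    # Each row of W is treated as one source vector (a short filter),
    # so the FFT is taken along axis 1.
    return np.abs(np.fft.rfft(W, axis=1))


n_features = 8
W = np.zeros((2, n_features))
W[0, 0] = 1.0  # a Dirac delta has a flat (broadband) spectrum
W[1] = np.cos(2 * np.pi * 2 * np.arange(n_features) / n_features)  # tone at bin 2

spec = spectral_magnitude(W)  # shape (2, n_features // 2 + 1)
```

The first row illustrates why a Dirac delta is a natural "broadband" starting point: its magnitude spectrum is constant across all frequency bins.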

fit(x_signal: ndarray, n_iters: int = 500, learning_rate: float = 1.0, tol: float = 0.0001, use_tol: bool = True, Fs: float = 1.0) → Self

This method follows the scikit-learn API call and estimates the model parameters based on the user's initialisation choices.

Parameters:
  • x_signal (ndarray) – The single channel vibration data signal.

  • n_iters (int) – The max number of iterations that are to be performed for each source.

  • learning_rate (float) – The learning rate. It only takes effect when use_ls is not activated.

  • tol (float) – The tolerance on the convergence error, error = |w_new^T @ w_prev - 1|. Used to stop the solver once it has converged.

  • use_tol (bool) – A flag to specify if the convergence tolerance must be used. If use_tol = False, the process will run for n_iters each time.

  • Fs (float) – The sampling frequency of the observed signal. Only used if the user wants to store visualisations of the solution vectors.

Returns:

self – This method returns self so that it can be chained onto the initialisation of the class via model_inst = LinearModel(…).fit(…).

Return type:

instance
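
The convergence error used by tol can be sketched as follows. This is a minimal illustration of the formula error = |w_new^T @ w_prev - 1| for unit vectors; the helper name has_converged and the strict "less than" comparison are assumptions for this example, not the library's code.

```python
import numpy as np


def has_converged(w_new: np.ndarray, w_prev: np.ndarray, tol: float = 1e-4) -> bool:
    """Hypothetical helper: check error = |w_new^T @ w_prev - 1| against tol."""
    # Flatten so that both column vectors and 1-D arrays are accepted.
    inner = np.dot(w_new.ravel(), w_prev.ravel())
    error = abs(inner - 1.0)
    return bool(error < tol)


w_prev = np.array([[1.0], [0.0], [0.0]])
w_same = np.array([[1.0], [0.0], [0.0]])
w_orth = np.array([[0.0], [1.0], [0.0]])

converged_same = has_converged(w_same, w_prev)  # identical unit vectors
converged_orth = has_converged(w_orth, w_prev)  # orthogonal unit vectors
```

For unit vectors the inner product equals 1 only when the two iterates coincide, so the error shrinks towards zero as successive iterates stop changing.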

get_hankel(x_signal: ndarray) → ndarray

A method that gets the Hankel matrix of a signal.

Parameters:

x_signal (ndarray) – The single channel vibration data signal.

Returns:

Hmat – A no_of_samples x Lw array of sliding window segments.

Return type:

ndarray
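
A Hankel matrix of sliding window segments can be built with NumPy as sketched below. The function name hankel_sketch and the hop argument are hypothetical and used only for illustration; Lw is the window length named in the return description above. The library may segment the signal differently.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def hankel_sketch(x_signal: np.ndarray, Lw: int, hop: int = 1) -> np.ndarray:
    """Hypothetical helper: stack length-Lw windows of x_signal as rows."""
    x = np.asarray(x_signal).ravel()
    # sliding_window_view returns a read-only view of shape (len(x) - Lw + 1, Lw).
    windows = sliding_window_view(x, Lw)
    # Keep every hop-th window and copy so that the result is writable.
    return windows[::hop].copy()


x = np.arange(6)
H = hankel_sketch(x, Lw=3)  # row i holds x[i], x[i + 1], x[i + 2]
```

With hop = 1, entry H[i, j] equals x[i + j], which is the constant anti-diagonal structure that defines a Hankel matrix.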

get_model_parameters() → Dict[str, Any]

This method gets all the important model parameters and returns them in a dictionary.

Returns:

dict_params – A dictionary which stores all the solution information.

Return type:

dict

inverse_transform(Z: ndarray, full_inverse: bool = True) → ndarray

This method transforms a latent matrix Z to the data space.

Parameters:
  • Z (ndarray) – The latent matrix Z.

  • full_inverse (bool) – A flag to specify whether the recovered data matrix X must be returned in centered form or in the original, un-centered domain.

Returns:

X_recon – The reconstruction of X from the latent matrix Z.

Return type:

ndarray

kurtosis(y: ndarray) → float | ndarray

This method calculates the kurtosis of the samples.

Parameters:

y (ndarray) – A vector or matrix of samples. If y is a vector, it is expected to be a column vector. If it is a matrix, each feature is given in a column.

Returns:

The kurtosis of the samples.

Return type:

float | ndarray
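
The documentation does not state which kurtosis definition is used. The sketch below assumes the excess kurtosis, E[(y - mean)^4] / E[(y - mean)^2]^2 - 3, computed per column; both the centring step and the "- 3" offset are assumptions, and kurtosis_sketch is a hypothetical name.

```python
import numpy as np


def kurtosis_sketch(y: np.ndarray) -> np.ndarray:
    """Hypothetical helper: column-wise excess kurtosis (assumed definition)."""
    y = np.asarray(y, dtype=float)
    yc = y - y.mean(axis=0)          # centre each column (assumption)
    m2 = np.mean(yc ** 2, axis=0)    # second central moment
    m4 = np.mean(yc ** 4, axis=0)    # fourth central moment
    return m4 / m2 ** 2 - 3.0        # zero for a Gaussian under this definition


# Two features stored column-wise, four samples each.
Y = np.array([[-1.0, 0.0],
              [1.0, 0.0],
              [-1.0, -1.0],
              [1.0, 1.0]])
k = kurtosis_sketch(Y)
```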

lagrange_function(X: ndarray, w: ndarray, y: ndarray, W: ndarray, idx: int, lambda_vector: ndarray) → float

This method calculates the Lagrangian expression.

Parameters:
  • X (ndarray) – The data matrix X.

  • w (ndarray) – The current w vector being optimised.

  • y (ndarray) – The transformed variable X @ w.

  • W (ndarray) – The matrix of w vectors stored in the rows of W.

  • idx (int) – The current iteration index for the parameters.

  • lambda_vector (ndarray) – The current lambda_eq value being optimised.

Returns:

The evaluation of the Lagrangian expression.

Return type:

float

lagrange_gradient(X: ndarray, w: ndarray, y: ndarray, W: ndarray, idx: int, lambda_vector: ndarray) → ndarray

This method calculates the gradient of the Lagrangian expression.

Parameters:
  • X (ndarray) – The data matrix X.

  • w (ndarray) – The current w vector being optimised.

  • y (ndarray) – The transformed variable X @ w.

  • W (ndarray) – The matrix of w vectors stored in the rows of W.

  • idx (int) – The current iteration index for the parameters.

  • lambda_vector (ndarray) – The current lambda_eq value being optimised.

Returns:

The evaluation of the gradient of the Lagrangian expression.

Return type:

ndarray

lagrange_hessian(X: ndarray, w: ndarray, y: ndarray, W: ndarray, idx: int, lambda_vector: ndarray) → ndarray

This method calculates the Hessian of the Lagrangian expression.

Parameters:
  • X (ndarray) – The data matrix X.

  • w (ndarray) – The current w vector being optimised.

  • y (ndarray) – The transformed variable X @ w.

  • W (ndarray) – The matrix of w vectors stored in the rows of W.

  • idx (int) – The current iteration index for the parameters.

  • lambda_vector (ndarray) – The current lambda_eq value being optimised.

Returns:

The evaluation of the Hessian of the Lagrangian expression.

Return type:

ndarray

line_search(delta: ndarray, gradient: ndarray, w: ndarray, lambda_vector: ndarray, W: ndarray, X: ndarray, idx: int) → Tuple[float, bool]

Performs a 1D line search on the delta vector to find a step size that satisfies the Armijo condition. Uses scipy.optimize.minimize routines.

Parameters:
  • delta (ndarray) – The parameter update vector.

  • gradient (ndarray) – The gradient vector at the current iteration.

  • w (ndarray) – The current w vector being optimised.

  • lambda_vector (ndarray) – The current lambda_eq value being optimised.

  • W (ndarray) – The matrix of w vectors stored in the rows of W.

  • X (ndarray) – The data matrix.

  • idx (int) – The current iteration index for the parameters.

Returns:

  • alpha_val (float) – The step size that should be applied to delta.

  • conv_flag (bool) – A flag that specifies whether the line search converged.
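
To make the Armijo condition concrete, the sketch below uses plain backtracking rather than the scipy.optimize routines mentioned above, so it is a simplified stand-in and not the library's method. The function name armijo_backtracking, the objective f, and the constants alpha0, c1 and shrink are all assumptions for this example.

```python
import numpy as np


def armijo_backtracking(f, w, delta, gradient,
                        alpha0=1.0, c1=1e-4, shrink=0.5, max_iter=30):
    """Hypothetical helper: shrink alpha until the Armijo condition holds."""
    f0 = f(w)
    # Directional derivative of f along delta; negative for a descent direction.
    slope = float(np.dot(gradient.ravel(), delta.ravel()))
    alpha = alpha0
    for _ in range(max_iter):
        # Armijo (sufficient decrease) condition.
        if f(w + alpha * delta) <= f0 + c1 * alpha * slope:
            return alpha, True
        alpha *= shrink
    return alpha, False  # no acceptable step size was found


def f(w):
    # Simple quadratic test objective, f(w) = 2 * ||w||^2.
    return 2.0 * float(np.sum(w ** 2))


w = np.array([2.0, -1.0])
gradient = 4.0 * w   # gradient of f at w
delta = -gradient    # steepest-descent direction

alpha_val, conv_flag = armijo_backtracking(f, w, delta, gradient)
```

The returned pair mirrors the (alpha_val, conv_flag) outputs described above: a step size to apply to delta and a flag that reports whether an acceptable step was found.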

parameter_update(X: ndarray, w: ndarray, y: ndarray, W: ndarray, idx: int, lambda_vector: ndarray) → Tuple[ndarray, ndarray, ndarray]

This method gets an updated estimate of the parameters. It combines all the user choices into one simple step and accounts for standard gradient descent, stochastic gradient descent, second-order methods, and quasi-second-order methods.

Parameters:
  • X (ndarray) – The data matrix X.

  • w (ndarray) – The current w vector being optimised.

  • y (ndarray) – The transformed variable X @ w.

  • W (ndarray) – The matrix of w vectors stored in the rows of W.

  • idx (int) – The current iteration index for the parameters.

  • lambda_vector (ndarray) – The current lambda_eq value being optimised.

Returns:

  • delta_w (ndarray) – The update that should be applied to w.

  • delta_lambda (ndarray) – The update that should be applied to lambda.

  • gradient (ndarray) – The gradient evaluation at the current iteration index.

set_model_parameters(x_signal: ndarray, dict_params: Dict[str, Any]) → Self

This method takes the X matrix and a parameter dictionary, initialises the pre-processing components and then creates the necessary class attributes from the dictionary.

Parameters:
  • x_signal (ndarray) – The single channel vibration data signal.

  • dict_params (dict) – The parameter dictionary that is returned by the .get_model_parameters() method.

Returns:

self – This method returns self so that it can be chained onto the initialisation of the class via model_inst = LinearModel(…).set_model_parameters(…).

Return type:

instance

spectral_fit(X: ndarray, W: ndarray, Lambda: ndarray, n_iters: int = 1, learning_rate: float = 0.1, tol: float = 0.001, use_tol: bool = True, Fs: float | int = 25000.0)

This method estimates the model parameters for some given X, W, and Lambda.

Parameters:
  • X (ndarray) – The data matrix X.

  • W (ndarray) – The source vector matrix W.

  • Lambda (ndarray) – A vector of lambda parameters for the Lagrange expressions.

  • n_iters (int) – The max number of iterations that are to be performed for each source.

  • learning_rate (float) – The learning rate. It only takes effect when use_ls is not activated.

  • tol (float) – The tolerance on the convergence error, error = |w_new^T @ w_prev - 1|. Used to stop the solver once it has converged.

  • use_tol (bool) – A flag to specify if the convergence tolerance must be used. If use_tol = False, the process will run for n_iters each time.

  • Fs (float) – The sampling frequency of the observed signal. Only used if the user wants to store visualisations of the solution vectors.

Returns:

  • W_update (ndarray) – The estimated W matrix.

  • Lambda_update (ndarray) – The estimated Lambda values.

spectral_trainer(X: ndarray, W: ndarray, n_iters: int, learning_rate: float, tol: float, use_tol: bool, Lambda: ndarray, Fs: float) → Tuple[ndarray, ndarray]

This method estimates the model parameters for some X and W.

Parameters:
  • X (ndarray) – The data matrix X.

  • W (ndarray) – The source vector matrix W.

  • n_iters (int) – The max number of iterations that are to be performed for each source.

  • learning_rate (float) – The learning rate. It only takes effect when use_ls is not activated.

  • tol (float) – The tolerance on the convergence error, error = |w_new^T @ w_prev - 1|. Used to stop the solver once it has converged.

  • use_tol (bool) – A flag to specify if the convergence tolerance must be used. If use_tol = False, the process will run for n_iters each time.

  • Lambda (ndarray) – A vector of lambda parameters for the Lagrange expressions.

  • Fs (float) – The sampling frequency of the observed signal. Only used if the user wants to store visualisations of the solution vectors.

Returns:

  • W_ (ndarray) – The estimated source vectors.

  • Lambda_ (ndarray) – The estimated lambda values for the Lagrange expressions.

transform(x_signal: ndarray) → ndarray

This method transforms a data matrix X to the latent space.

Parameters:

x_signal (ndarray) – The single channel vibration data signal.

Returns:

Z – The projection of X to the latent space.

Return type:

ndarray

update_params(w_current: ndarray, lambda_current: ndarray, delta_w: ndarray, delta_lambda: ndarray, W: ndarray, idx: int) → Tuple[ndarray, ndarray]

A method that computes the update to the w and lambda parameters, performs Gram-Schmidt orthogonalisation (GSO) if required by the user, and ensures that w is a unit vector.

Parameters:
  • w_current (ndarray) – The current source vector.

  • lambda_current (ndarray) – The current lambda value for the Lagrangian expression.

  • delta_w (ndarray) – The update to be applied to the source vector.

  • delta_lambda (ndarray) – The update to be applied to the lambda value.

  • W (ndarray) – The W matrix of source vectors.

  • idx (int) – The index of the source vector in W that is currently being solved for.

Returns:

  • w_new (ndarray) – The updated w vector.

  • lambda_new (ndarray) – The updated lambda value.
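
The three steps described above (apply the update, orthogonalise against earlier source vectors, renormalise) can be sketched as follows. This is an illustration only: the additive update w + delta_w, the use_gso flag, and the assumption that rows 0 to idx - 1 of W are already orthonormal are choices made for this example and may differ from the library's behaviour.

```python
import numpy as np


def update_sketch(w_current, lambda_current, delta_w, delta_lambda,
                  W, idx, use_gso=True):
    """Hypothetical helper: update, Gram-Schmidt step, then unit normalisation."""
    w_new = w_current + delta_w                   # assumed additive update
    lambda_new = lambda_current + delta_lambda
    if use_gso and idx > 0:
        W_prev = W[:idx]                          # previously solved vectors (rows)
        # Remove the components of w_new that lie along the earlier vectors.
        w_new = w_new - W_prev.T @ (W_prev @ w_new)
    w_new = w_new / np.linalg.norm(w_new)         # enforce ||w|| = 1
    return w_new, lambda_new


W = np.eye(3)                                     # rows are orthonormal vectors
w_current = np.array([[0.6], [0.8], [0.0]])
delta_w = np.array([[0.1], [0.0], [0.2]])
lambda_current = np.array([[1.0]])
delta_lambda = np.array([[0.5]])

w_new, lambda_new = update_sketch(w_current, lambda_current,
                                  delta_w, delta_lambda, W, idx=1)
```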

spectrally_regularised_lvms.spectrally_regularised_model.initialise_W(n_sources: int, n_features: int, init_type: str) → ndarray

A method that initialises the W matrix.

Parameters:
  • n_sources (int) – The number of source vectors to initialise.

  • n_features (int) – The length of each source vector.

  • init_type (str) – The initialisation type for the vectors. Options are either ‘broadband’ or ‘random’. ‘broadband’ initialises the vectors as Dirac deltas; ‘random’ initialises them with randomly sampled values.

Returns:

W – The initialised W matrix of shape (n_sources, n_features), normalised row-wise so that each w vector has unit norm.

Return type:

ndarray
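
The two initialisation options can be sketched as below. The position of the Dirac delta (index 0), the standard normal distribution for the random option, and the seed argument are assumptions for this example; initialise_W_sketch is a hypothetical name and not the library function.

```python
import numpy as np


def initialise_W_sketch(n_sources: int, n_features: int,
                        init_type: str, seed: int = 0) -> np.ndarray:
    """Hypothetical helper: build a row-normalised W matrix."""
    if init_type == "broadband":
        W = np.zeros((n_sources, n_features))
        W[:, 0] = 1.0  # Dirac delta in each row (position is an assumption)
    elif init_type == "random":
        rng = np.random.default_rng(seed)
        W = rng.standard_normal((n_sources, n_features))
    else:
        raise ValueError("init_type must be 'broadband' or 'random'")
    # Normalise row-wise so that every w vector has unit norm.
    return W / np.linalg.norm(W, axis=1, keepdims=True)


W_broadband = initialise_W_sketch(3, 16, "broadband")
W_random = initialise_W_sketch(3, 16, "random")
```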

spectrally_regularised_lvms.spectrally_regularised_model.initialise_lambda(n_sources: int) → ndarray

A method that initialises the lambda terms for the Lagrange expression.

This is initialised to be a vector of ones.

Parameters:

n_sources (int) – The number of source vectors to initialise.

Returns:

Lambda – A vector of ones with shape (n_sources, 1).

Return type:

ndarray