2. Getting started¶
2.1. Time-series data hankelisation¶
The spectrally_regularised_lvms.hankel_matrix() function can be used to convert a single channel time-series signal into a hankel matrix. The hankel matrix of a signal is given by
where \(L_w\) is the window length, \(L_{sft}\) is the shift parameter, and \(L_H = \lfloor\frac{L - L_w}{L_{sft}}\rfloor + 1\) represents the number of rows in \(\mathbf{X} \in \mathbb{R}^{L_H \times L_w}\). A simple example of this can be given for some sampled representation of a sinusoidal function.
1import spectrally_regularised_lvms as srLVMs
2import numpy as np
3
4Fs = 1000 # Sampling frequency
5t = np.arange(0, 10, 1/Fs) # Time representation of the signal
6x_signal = np.sin(2 * np.pi * t) # Create a toy signal
7X = srLVMs.hankel_matrix(x_signal, Lw=256, Lsft=1) # Develop the hankel matrix
2.2. Defining an objective function¶
The simplest way to define an objective function/cost instance is to use one of the provided variance maximisation or negentropy maximisation objective functions. Alternatively, a symbolic representation of the objective function can be used, or an explicit representation can be given via user-defined functions.
To ensure that any gradient or Hessian information can be computed for any objective function, a finite_diff_flag is given to each of the class instances that are used for objective function definition. This ensures that an approximation of the gradient or Hessian information can be readily computed if required.
Note
Although the objective functions used in this example is required to be maximised, the implementation strategy followed in this package performs minimisation. Converting an objective function which is to be maximised into one for minimisation simply results in changing the sign of the objective function.
For the symbolic and explicit objective functions, the principal component analysis (PCA) objective function is demonstrated here. This is given by
where it is assumed that \(\mathbf{x}\) is zero-mean and \(z_j = \mathbf{w}_i^T \mathbf{x}_j\). In a symbolic representation, the indices of \(z_j\) are symbolically represented for \(N\) samples from \(p(\mathbf{x})\).
2.2.1. Symbolic objective functions¶
For the symbolic implementation of an objective function, SymPy is used. It was decided that the easiest way to allow users to implement different objective functions is to create an indexable \(z\) variable, with index \(j\). This ensures that users can define a simple form of their objective function without having to create variables for the LVM component vectors \(\mathbf{w}\) and the hankel matrix \(\mathbf{X}\). The index \(j\) is configured to have a range of \(j \in [0, n-1]\) where \(n\) refers to the samples of the data feature vectors \(\mathbf{x}_j\). The gradient and hessian information is then obtained by symbolically deriving the objective function and exploiting the properties of the linear LVM transformation \(z_j = \mathbf{w}_i^T \mathbf{x}_j\).
1import sympy as sp
2import spectrally_regularised_lvms as srLVMs
3
4# Symbolic cost function implementation
5cost_inst = srLVMs.SymbolicCost(use_hessian = True,
6 verbose = False,
7 finite_diff_flag = False)
8
9z, j, n = cost_inst.get_symbolic_parameters()
10
11loss = -1/n * sp.Sum((z[j])**2, (j))
12
13cost_inst.set_cost(loss)
14
15# Visualise the loss
16sp.pretty_print(loss)
17
18# Visualise the properties of the indexed variables
19print(z[j].shape, z[j].ranges)
20print(j)
21print(n)
2.2.2. Explicit objective functions¶
Alternatively, users can provide an explicit version of an objective function via functions that use NumPy objects. This was done to allow for cases where specific objective functions, gradient vectors, or Hessian matrices can be encoded by the user. These functions expect three inputs: the hankel matrix \(\mathbf{X}\), the component vector \(\mathbf{w}\), and the latent transformation vector \(\mathbf{z} = \mathbf{X} \mathbf{w}\). Using the finite_diff_flag = True implies that the gradient and Hessian are approximated, and hence there would be no need to supply the gradient or Hessian functions.
1import numpy as np
2import spectrally_regularised_lvms as srLVMs
3
4# Explicit cost function implementation
5cost_inst = srLVMs.ExplicitCost(use_hessian = True,
6 verbose = False,
7 finite_diff_flag = False)
8
9obj = lambda X, w, z: -1 * np.mean((X @ w)**2, axis = 0)
10grad = lambda X, w, z: -2 * np.mean(z * X, axis = 0,
11 keepdims=True).T
12hess = lambda X, w, z: -2 / X.shape[0] * (X.T @ X)
13
14# Set the properties
15cost_inst.set_cost(obj)
16cost_inst.set_gradient(grad)
17cost_inst.set_hessian(hess)
2.2.3. Variance maximisation¶
The variance maximisation objective tries to obtain orthogonal projections wherein the variance of the projected samples is maximal. This objective is implemented as it is a common objective in the literature. The code to use this objective is given below.
1PCA_objective = srLVMs.VarianceCost(use_hessian = True,
2 verbose = False,
3 finite_diff_flag = False)
2.2.4. Negentropy maximisation¶
The negentropy maximisation objective tries to obtain projections that maximise the non-Gaussianity of the latent sources. The code to use this objective is given below.
1ICA_objective = srLVMs.NegentropyCost(source_name="exp",
2 source_params={"alpha": 1},
3 use_hessian = True,
4 verbose = False,
5 finite_diff_flag = False)
2.3. Initialising the LVM¶
The LVM can be initialised using an instance of the spectrally_regularised_lvms.LinearModel() class. Please refer to the class documentation or the documentation argument tables for more information regarding the parameters of this class.
1model_inst = srLVMs.LinearModel(n_sources = ...,
2 cost_instance = cost_inst,
3 Lw = ...,
4 Lsft = ...,
5 verbose = True)
where n_sources represents the number of latent source components (\(d\)) that are to be estimated, \(L_w\) represents the window length the signal segments stored in the Hankel matrix, and \(L_{sft}\) represents the shift parameter used to develop the hankel matrix.
2.4. Estimating the LVM parameters¶
Estimating the model parameters is achieved using the LinearModel().fit(...) method.
1model_inst = model_inst.fit(x_signal,
2 n_iters = 500,
3 learning_rate = 1,
4 tol = 1e-4,
5 Fs = Fs)
where x_signal refers to the single channel time-series signal. The arguments for the .fit(.) call are detailed in the documentation argument tables.
2.5. Using the LVM¶
Using the LVM reduces to using the .transform(...) method and the .inverse_transform() method for an instance of the LinearModel class.
1# Transition to the latent space
2Z_latent = model_inst.transform(x_signal)
3
4# Transition back to the data space
5X_recon = model_inst.inverse_transform(Z_latent)
Note that X_recon is the reconstructed hankel matrix of \(\mathbf{X}\). This concludes the code to get started with the package.