Principal Component Analysis (PCA) — SciPy Filters for QGIS

Principal Component Analysis (PCA)

class scipy_filters.algs.scipy_pca_algorithm.SciPyPCAAlgorithm

Principal Component Analysis (PCA), calculated by Singular Value Decomposition (SVD) using svd from scipy.linalg.

With default parameters, all components are kept. Optionally, either the number of components to keep or the percentage of variance explained by the kept components can be set.
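
As a rough illustration of PCA by SVD, the following sketch centers each band, decomposes the pixel-by-band matrix with scipy.linalg.svd, and projects the data onto the eigenvectors. The helper name pca_scores, the (n_bands, rows, cols) array layout and the random test data are assumptions made for this example; this is not the plugin's own code.

```python
import numpy as np
from scipy.linalg import svd


def pca_scores(bands):
    """Hypothetical helper: PCA of a (n_bands, rows, cols) array via SVD.

    Assumes there are at least as many pixels as bands.
    """
    n_bands, rows, cols = bands.shape
    # One row per pixel, one column per band.
    X = bands.reshape(n_bands, -1).T.astype(np.float64)
    mean = X.mean(axis=0)
    Xc = X - mean  # center each band
    # Thin SVD: Xc = U @ diag(s) @ Vt; the rows of Vt are the eigenvectors.
    U, s, Vt = svd(Xc, full_matrices=False)
    V = Vt.T  # eigenvectors as columns
    scores = Xc @ V  # data projected onto the principal components
    return scores.T.reshape(n_bands, rows, cols), s, V, mean


# Illustrative data only: 3 bands of 4 x 5 pixels.
rng = np.random.default_rng(0)
bands = rng.normal(size=(3, 4, 5))
scores, s, V, mean = pca_scores(bands)
```

With all components kept, the score raster has as many bands as the input, ordered by decreasing variance.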

Number of components to keep. Use 0 to keep all components; a negative value gives the number of components to remove. Ignored if the percentage of variance is set.

Percentage of variance to keep is only used if it is greater than 0 (typical values lie between 90 and 100).
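
The interplay of the two parameters can be sketched as follows. The function name components_to_keep is hypothetical. The sketch assumes that the percentage rule keeps the smallest number of components whose cumulative variance reaches the target, and that a negative count never removes the last component; the plugin may resolve these edge cases differently.

```python
import numpy as np


def components_to_keep(variance_ratio, n_components=0, percent_variance=0.0):
    """Hypothetical helper: resolve both parameters to a component count."""
    total = len(variance_ratio)
    if percent_variance > 0:
        # The percentage takes precedence over the number of components.
        cumulative = np.cumsum(variance_ratio) * 100.0
        # Index of the first component at which the target is reached
        # (small tolerance against floating point rounding).
        k = int(np.searchsorted(cumulative, percent_variance - 1e-9)) + 1
        return min(k, total)
    if n_components == 0:
        return total  # default: keep everything
    if n_components < 0:
        # Negative value: remove that many components (assumed: keep >= 1).
        return max(total + n_components, 1)
    return min(n_components, total)


ratio = [0.5, 0.25, 0.25]  # illustrative ratios of variance explained
```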

Output: The output raster contains the data projected onto the principal components (i.e. the PCA scores).

Output data type: Float32 or Float64.

The following values and vectors are available a) in the log tab of the processing window, b) in JSON format in the “Abstract” field of the metadata of the output raster layer, where subsequent transformations can read them, and c) in the output dict if the tool has been called from the Python console or a script: singular values (of the SVD), variance explained (eigenvalues), ratio of variance explained, cumulative sum of variance explained, eigenvectors (V of the SVD), loadings (eigenvectors scaled by sqrt(eigenvalues)), and band means.
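
The relationships between these values can be sketched from the SVD results alone. The dictionary keys below are made up for illustration and are not the keys the plugin writes to the metadata; dividing by n - 1 (sample variance) is likewise an assumption.

```python
import json
import numpy as np


def pca_summary(s, V, mean, n_pixels):
    """Hypothetical helper: derive the reported values from SVD results."""
    eigenvalues = s ** 2 / (n_pixels - 1)  # variance explained (assumed n - 1)
    ratio = eigenvalues / eigenvalues.sum()
    return {
        "singular_values": s.tolist(),
        "variance_explained": eigenvalues.tolist(),
        "variance_ratio": ratio.tolist(),
        "variance_cumulative": np.cumsum(ratio).tolist(),
        "eigenvectors": V.tolist(),
        # Each eigenvector (column of V) scaled by sqrt of its eigenvalue.
        "loadings": (V * np.sqrt(eigenvalues)).tolist(),
        "band_mean": mean.tolist(),
    }


# Illustrative data: 50 pixels, 3 bands.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
mean = X.mean(axis=0)
_, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
summary = pca_summary(s, Vt.T, mean, n_pixels=X.shape[0])

# JSON round trip, similar in spirit to storing the values as text.
abstract = json.dumps(summary)
restored = json.loads(abstract)
```

Under these assumptions the eigenvalues equal those of the band covariance matrix, and the loadings multiplied by their own transpose reproduce that matrix.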

Keep only n components

class scipy_filters.algs.scipy_pca_helper_algorithms.SciPyKeepN

Utility to remove components of lesser importance after a principal component analysis (PCA).

Number of components to keep. Negative numbers: number of components to remove.
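
On a score array with one band per component, this amounts to a slice along the band axis. The sketch below assumes a (n_components, rows, cols) layout and treats out-of-range values as errors; both choices, and the name keep_n, are hypothetical.

```python
import numpy as np


def keep_n(scores, n):
    """Hypothetical helper: keep the first n components (n < 0: drop |n|)."""
    total = scores.shape[0]
    keep = total + n if n < 0 else n
    if not 0 < keep <= total:
        raise ValueError(f"cannot keep {keep} of {total} components")
    # Components are ordered by decreasing variance, so the leading
    # bands are the most important ones.
    return scores[:keep]


scores = np.arange(16, dtype=np.float32).reshape(4, 2, 2)
first_two = keep_n(scores, 2)    # keep components 1 and 2
without_last = keep_n(scores, -1)  # remove the last component
```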

Transform from principal components

class scipy_filters.algs.scipy_pca_helper_algorithms.SciPyTransformFromPCAlgorithm

Transform data from principal components (i.e. the PCA scores) back into the original feature space using a matrix of eigenvectors: take the dot product of the scores with the transpose of the matrix of eigenvectors, then add the original means to the result.

The eigenvectors can also be read from the metadata of the input layer, as long as they exist and are complete.

Eigenvectors: Matrix of eigenvectors (as string). Optional if the next parameter is set. The matrix can be taken from the output of the PCA algorithm of this plugin.

Mean of original bands: As the first step of PCA, the data of each band is centered by subtracting its mean. These means must be added back after rotating into the original feature space. Optional if the metadata of the input layer is complete. (Use the false means if they were used for the forward transformation.)

Output data type: Float32 or Float64.
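
A minimal sketch of the back transformation, assuming scores in a (n_components, rows, cols) array and eigenvectors stored as the columns of the matrix. The function name transform_from_pc and the handling of truncated scores (using only the leading eigenvectors) are assumptions for this example.

```python
import numpy as np


def transform_from_pc(scores, eigenvectors, mean, dtype=np.float32):
    """Hypothetical helper: rotate PCA scores back into band space."""
    n_components, rows, cols = scores.shape
    S = scores.reshape(n_components, -1).T  # pixels x components
    # If components were removed, use only the matching eigenvectors.
    V = np.asarray(eigenvectors, dtype=np.float64)[:, :n_components]
    # Dot product with the transposed eigenvectors, then add the means.
    X = S @ V.T + np.asarray(mean, dtype=np.float64)
    return X.T.reshape(-1, rows, cols).astype(dtype)


# Round trip on illustrative data: forward PCA, then back.
rng = np.random.default_rng(2)
bands = rng.normal(loc=100.0, scale=5.0, size=(3, 4, 5))
X = bands.reshape(3, -1).T
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
scores = ((X - mean) @ Vt.T).T.reshape(3, 4, 5)
restored = transform_from_pc(scores, Vt.T, mean, dtype=np.float64)
approx = transform_from_pc(scores[:2], Vt.T, mean)  # last component dropped
```

With all components present the round trip reproduces the input up to rounding; with fewer components the result is an approximation that still has the original number of bands.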

Transform to principal components

class scipy_filters.algs.scipy_pca_helper_algorithms.SciPyTransformToPCAlgorithm

Transform data into given principal components using a matrix of eigenvectors: after centering the data, take the dot product of the data with the matrix of eigenvectors (the weights).

The eigenvectors can also be read from the metadata of an existing PCA layer.

Eigenvectors: Matrix of eigenvectors (as string). Optional if the next parameter is set.

Read eigenvectors from PCA layer metadata: Reads the weights for the transformation from the metadata of a layer that was generated using the PCA algorithm of this plugin. Ignored if the parameter eigenvectors is used.

Number of components is only used if the value is greater than 0 and smaller than the count of original bands.

False mean for each band: As the first step of PCA, the data of each band is centered by subtracting its mean. If false means are provided, these are subtracted instead of the real means of the input layer. This makes it possible to transform another raster image into the same space as the principal components of a reference layer. The result is useful for comparing several rasters, but should not be considered proper principal components. Only used if “Use false mean” is checked.

Use false mean: See also “False mean for each band”. The false mean to be used can also be read from the metadata of a PCA layer.

Output data type: Float32 or Float64.
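
The forward transformation with an optional false mean might look like the sketch below. The function name transform_to_pc, the array layout (n_bands, rows, cols) and the test rasters are assumptions for this example; the eigenvectors are taken as the columns of the matrix.

```python
import numpy as np


def transform_to_pc(bands, eigenvectors, n_components=0, false_mean=None,
                    dtype=np.float32):
    """Hypothetical helper: project bands onto given eigenvectors."""
    n_bands, rows, cols = bands.shape
    X = bands.reshape(n_bands, -1).T.astype(np.float64)  # pixels x bands
    # Center with the real band means unless false means are supplied.
    if false_mean is None:
        mean = X.mean(axis=0)
    else:
        mean = np.asarray(false_mean, dtype=np.float64)
    V = np.asarray(eigenvectors, dtype=np.float64)
    # Only used if greater than 0 and smaller than the band count.
    if 0 < n_components < n_bands:
        V = V[:, :n_components]
    scores = (X - mean) @ V
    return scores.T.reshape(-1, rows, cols).astype(dtype)


# Eigenvectors and means come from a reference raster ...
rng = np.random.default_rng(3)
ref = rng.normal(size=(3, 4, 5))
other = rng.normal(loc=0.5, size=(3, 4, 5))
X_ref = ref.reshape(3, -1).T
mean_ref = X_ref.mean(axis=0)
_, _, Vt = np.linalg.svd(X_ref - mean_ref, full_matrices=False)
V = Vt.T

pc_ref = transform_to_pc(ref, V, dtype=np.float64)
# ... and are reused to place a second raster in the same space.
pc_other = transform_to_pc(other, V, n_components=2, false_mean=mean_ref,
                           dtype=np.float64)
```

Here pc_other is comparable with pc_ref band by band, but its bands are not centered on zero and are not the principal components of the second raster.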