PCAModel¶

class menpo.model.PCAModel(samples, centre=True, n_samples=None, max_n_components=None, inplace=True, verbose=False)[source]¶

Bases: VectorizableBackedModel, PCAVectorModel

A MeanLinearModel where components are Principal Components and the components are vectorized instances.

Principal Component Analysis (PCA) by eigenvalue decomposition of the data’s scatter matrix. For details of the implementation of PCA, see pca.

Parameters

samples (list or iterable of Vectorizable) – List or iterable of samples to build the model from.
centre (bool, optional) – When True (default) PCA is performed after mean centering the data. If False the data is assumed to be centred, and the mean will be 0.
n_samples (int, optional) – If provided then samples must be an iterator that yields n_samples. If not provided then samples has to be a list (so we know how large the data matrix needs to be).
max_n_components (int, optional) – The maximum number of components to keep in the model. Any components beyond this number are discarded.
inplace (bool, optional) – If True the data matrix is modified in place. Otherwise, the data matrix is copied.
verbose (bool, optional) – Whether to print building information or not.
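As a rough illustration of what the constructor computes, the sketch below performs PCA with plain NumPy by eigendecomposition of the centred data's covariance matrix. It is not menpo's implementation: the function name `pca_sketch`, the normalisation by `n_samples - 1` when centring, and the use of `numpy.linalg.eigh` are assumptions made for this example.

```python
import numpy as np

def pca_sketch(X, centre=True, max_n_components=None):
    """Illustrative PCA on an (n_samples, n_features) array.

    Returns (components, eigenvalues, mean). Each row of `components`
    is one principal direction, ordered by decreasing eigenvalue.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    if centre:
        mean = X.mean(axis=0)
        normaliser = n_samples - 1   # assumption: unbiased estimate
    else:
        mean = np.zeros(n_features)  # data assumed to be centred already
        normaliser = n_samples
    Xc = X - mean
    covariance = Xc.T @ Xc / normaliser
    # eigh is meant for symmetric matrices; eigenvalues come back ascending
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    components = eigenvectors[:, order].T
    if max_n_components is not None:
        eigenvalues = eigenvalues[:max_n_components]
        components = components[:max_n_components]
    return components, eigenvalues, mean

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) * np.array([3.0, 2.0, 1.0, 0.5])
components, eigenvalues, mean = pca_sketch(X, max_n_components=3)
```

The rows of `components` are orthonormal and `eigenvalues` is sorted from largest to smallest, which mirrors the layout of the `components` and `eigenvalues` properties described further down.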

component(index, with_mean=True, scale=1.0)[source]¶

Return a particular component of the linear model.

Parameters

index (int) – The component that is to be returned.
with_mean (bool, optional) – If True, the component will be blended with the mean vector before being returned. If not, the component is returned on its own.
scale (float, optional) – A scale factor that should be applied to the component. Only valid in the case where with_mean == True. See component_vector() for how this scale factor is interpreted.

Returns

component (type(self.template_instance)) – The requested component instance.

component_vector(index, with_mean=True, scale=1.0)[source]¶

A particular component of the model.

Parameters: index (int) – The component that is to be returned.
Returns: component_vector ((n_features,) ndarray) – The component vector.

copy()¶

Generate an efficient copy of this object.

Note that Numpy arrays and other Copyable objects on self will be deeply copied. Dictionaries and sets will be shallow copied, and everything else will be assigned (no copy will be made).

Classes that store state other than numpy arrays and immutable types should override this method to ensure all state is copied.

Returns: type(self) – A copy of this object

eigenvalues_cumulative_ratio()¶

Returns the cumulative ratio between the variance captured by the active components and the total amount of variance present in the original samples.

Returns: eigenvalues_cumulative_ratio ((n_active_components,) ndarray) – Array of cumulative eigenvalues.

eigenvalues_ratio()¶

Returns the ratio between the variance captured by each active component and the total amount of variance present in the original samples.

Returns: eigenvalues_ratio ((n_active_components,) ndarray) – The active eigenvalues array scaled by the original variance.
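The two ratio methods can be illustrated with a small NumPy sketch. The eigenvalues and the number of active components below are made-up values, and the variable names are chosen for this example only.

```python
import numpy as np

# Hypothetical eigenvalues of a model with four components, largest first.
all_eigenvalues = np.array([4.0, 3.0, 2.0, 1.0])
n_active_components = 2

# Total variance present in the original samples.
original_variance = all_eigenvalues.sum()

# Per-component ratio for the active components only.
active = all_eigenvalues[:n_active_components]
eigenvalues_ratio = active / original_variance

# Running total of variance explained by the active components.
eigenvalues_cumulative_ratio = np.cumsum(active) / original_variance
```

Here `eigenvalues_ratio` is `[0.4, 0.3]` and `eigenvalues_cumulative_ratio` is `[0.4, 0.7]`; the last cumulative entry is the fraction of the original variance retained by the active components.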

increment(samples, n_samples=None, forgetting_factor=1.0, verbose=False)[source]¶

Update the eigenvectors, eigenvalues and mean vector of this model by performing incremental PCA on the given samples.

Parameters

samples (list of Vectorizable) – List of new samples to update the model from.
n_samples (int, optional) – If provided then samples must be an iterator that yields n_samples. If not provided then samples has to be a list (so we know how large the data matrix needs to be).
forgetting_factor ([0.0, 1.0] float, optional) – Forgetting factor that weights the relative contribution of new samples vs old samples. If 1.0, all samples are weighted equally and, hence, the result is exactly the same as performing batch PCA on the concatenated list of old and new samples. If <1.0, more emphasis is put on the new samples. See [1] for details.

References

1: David Ross, Jongwoo Lim, Ruei-Sung Lin, Ming-Hsuan Yang. “Incremental Learning for Robust Visual Tracking”. IJCV, 2007.
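To give a feel for what a forgetting factor does, the sketch below updates only the mean vector, down-weighting the old sample count by the factor. This weighting scheme and the helper name `update_mean_sketch` are assumptions for illustration; they are not taken from menpo's source, and the eigenvector and eigenvalue updates of incremental PCA are left out entirely.

```python
import numpy as np

def update_mean_sketch(old_mean, n_old, new_samples, forgetting_factor=1.0):
    """Blend an existing mean with the mean of a new batch of samples.

    The old sample count is scaled by `forgetting_factor`, so values
    below 1.0 shift the result towards the new samples.
    """
    new_samples = np.asarray(new_samples, dtype=float)
    n_new = new_samples.shape[0]
    effective_old = forgetting_factor * n_old
    total = effective_old + n_new
    mean = (effective_old * old_mean + n_new * new_samples.mean(axis=0)) / total
    return mean, total

old = np.array([[0.0, 0.0], [2.0, 2.0]])
new = np.array([[4.0, 4.0], [6.0, 6.0]])

# With a factor of 1.0 the update matches the batch mean of all samples.
batch_like, _ = update_mean_sketch(old.mean(axis=0), len(old), new, 1.0)

# With a factor below 1.0 the mean moves towards the new samples.
forgetful, _ = update_mean_sketch(old.mean(axis=0), len(old), new, 0.5)
```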

classmethod init_from_components(components, eigenvalues, mean, n_samples, centred, max_n_components=None)[source]¶

Build the Principal Component Analysis (PCA) using the provided components (eigenvectors) and eigenvalues.

Parameters

components ((n_components, n_features) ndarray) – The eigenvectors to be used.
eigenvalues ((n_components, ) ndarray) – The corresponding eigenvalues.
mean (Vectorizable) – The mean instance. It must be a Vectorizable and not an ndarray.
n_samples (int) – The number of samples used to generate the eigenvectors.
centred (bool, optional) – When True we assume that the data were centered before computing the eigenvectors.
max_n_components (int, optional) – The maximum number of components to keep in the model. Any components beyond this number are discarded.

classmethod init_from_covariance_matrix(C, mean, n_samples, centred=True, is_inverse=False, max_n_components=None)[source]¶

Build the Principal Component Analysis (PCA) by eigenvalue decomposition of the provided covariance/scatter matrix. For details of the implementation of PCA, see pcacov.

Parameters

C ((n_features, n_features) ndarray or scipy.sparse) – The Covariance/Scatter matrix. If it is a precision matrix (inverse covariance), then set is_inverse=True.
mean (Vectorizable) – The mean instance. It must be a Vectorizable and not an ndarray.
n_samples (int) – The number of samples used to generate the covariance matrix.
centred (bool, optional) – When True we assume that the data were centered before computing the covariance matrix.
is_inverse (bool, optional) – If True, then it is assumed that C is a precision matrix (inverse covariance). Thus, the eigenvalues will be inverted. If False, then it is assumed that C is a covariance matrix.
max_n_components (int, optional) – The maximum number of components to keep in the model. Any components beyond this number are discarded.
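The role of is_inverse can be sketched with NumPy: a precision matrix shares its eigenvectors with the covariance matrix, and its eigenvalues are the reciprocals. The helper `eig_from_covariance_sketch` is hypothetical and assumes a symmetric matrix with strictly positive eigenvalues; it is not the pcacov routine referenced above.

```python
import numpy as np

def eig_from_covariance_sketch(C, is_inverse=False):
    """Eigendecompose a symmetric matrix, optionally treating it as a
    precision matrix whose eigenvalues must be inverted."""
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    if is_inverse:
        # assumes no zero eigenvalues
        eigenvalues = 1.0 / eigenvalues
    order = np.argsort(eigenvalues)[::-1]   # largest variance first
    return eigenvectors[:, order].T, eigenvalues[order]

C = np.array([[2.0, 0.5],
              [0.5, 1.0]])
comp_cov, eig_cov = eig_from_covariance_sketch(C)
comp_prec, eig_prec = eig_from_covariance_sketch(np.linalg.inv(C),
                                                 is_inverse=True)
```

Both calls recover the same eigenvalues, and the same components up to sign.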

instance(weights, normalized_weights=False)[source]¶

Creates a new instance of the model using the first len(weights) components.

Parameters

weights ((n_weights,) ndarray or list) – weights[i] is the linear contribution of the i’th component to the instance vector.
normalized_weights (bool, optional) – If True, the weights are assumed to be normalized w.r.t the eigenvalues. This can make it easier to create unique instances, because the weights become more interpretable.

Raises

ValueError – If n_weights > n_components

Returns

instance (type(self.template_instance)) – An instance of the model.

instance_vector(weights, normalized_weights=False)[source]¶

Creates a new instance of the model using the first len(weights) components.

Parameters: weights ((n_weights,) ndarray or list) – weights[i] is the linear contribution of the i’th component to the instance vector.
Raises: ValueError – If n_weights > n_components
Returns: vector ((n_features,) ndarray) – The instance vector for the weighting provided.

instance_vectors(weights, normalized_weights=False)¶

Creates new vectorized instances of the model using the first components in a particular weighting.

Parameters

weights ((n_vectors, n_weights) ndarray or list of lists) – The weightings for the first n_weights components that should be used for each instance that is to be produced. weights[i, j] is the linear contribution of the j’th principal component to the i’th instance vector produced. Note that if n_weights < n_components, only the first n_weights components are used in the reconstruction (i.e. unspecified weights are implicitly 0).
normalized_weights (bool, optional) – If True, the weights are assumed to be normalized w.r.t the eigenvalues. This can make it easier to create unique instances, because the weights become more interpretable.

Returns

vectors ((n_vectors, n_features) ndarray) – The instance vectors for the weighting provided.

Raises

ValueError – If n_weights > n_components
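The weighting described above amounts to adding a linear combination of the leading components to the mean. The following NumPy sketch shows the idea. The function name is hypothetical, and interpreting normalized weights as multiples of the square root of each eigenvalue (i.e. standard deviations) is an assumption of this example.

```python
import numpy as np

def instance_vectors_sketch(weights, components, eigenvalues, mean,
                            normalized_weights=False):
    """Build instance vectors from per-component weights."""
    weights = np.atleast_2d(np.asarray(weights, dtype=float))
    n_weights = weights.shape[1]
    if n_weights > components.shape[0]:
        raise ValueError("n_weights exceeds the number of components")
    if normalized_weights:
        # assumption: one unit of weight equals one standard deviation
        weights = weights * np.sqrt(eigenvalues[:n_weights])
    # components beyond n_weights contribute nothing (implicit zero weight)
    return mean + weights @ components[:n_weights]

components = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
eigenvalues = np.array([4.0, 1.0])
mean = np.array([1.0, 1.0, 1.0])

vectors = instance_vectors_sketch([[1.0], [-1.0]], components,
                                  eigenvalues, mean,
                                  normalized_weights=True)
```

With these made-up values the two rows of `vectors` are `[3, 1, 1]` and `[-1, 1, 1]`: the mean moved by plus and minus one standard deviation along the first component.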

inverse_noise_variance()¶

Returns the inverse of the noise variance.

Returns: inverse_noise_variance (float) – Inverse of the noise variance.
Raises: ValueError – If noise_variance() == 0

mean()[source]¶

Return the mean of the model.

Type: Vectorizable

noise_variance()¶

Returns the average variance captured by the inactive components, i.e. the sample noise assumed in a Probabilistic PCA formulation.

If all components are active, then noise_variance == 0.0.

Returns: noise_variance (float) – The mean variance of the inactive components.
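A minimal NumPy sketch of this definition, using made-up eigenvalues and a hypothetical helper name, is shown below. It also computes the ratio to the total variance, matching the description of noise_variance_ratio().

```python
import numpy as np

def noise_variance_sketch(all_eigenvalues, n_active_components):
    """Mean eigenvalue of the inactive components, or 0.0 if none."""
    inactive = np.asarray(all_eigenvalues, dtype=float)[n_active_components:]
    return float(inactive.mean()) if inactive.size else 0.0

all_eigenvalues = np.array([5.0, 3.0, 1.5, 0.5])

# Two active components: the inactive ones are 1.5 and 0.5.
noise = noise_variance_sketch(all_eigenvalues, 2)
noise_ratio = noise / all_eigenvalues.sum()

# All components active: no variance is left to treat as noise.
noise_all_active = noise_variance_sketch(all_eigenvalues, 4)
```

Here `noise` is `1.0`, `noise_ratio` is `0.1`, and `noise_all_active` is `0.0`, which is the case in which the inverse noise variance is undefined.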

noise_variance_ratio()¶

Returns the ratio between the noise variance and the total amount of variance present in the original samples.

Returns: noise_variance_ratio (float) – The ratio between the noise variance and the variance present in the original samples.

original_variance()¶

Returns the total amount of variance captured by the original model, i.e. the amount of variance present in the original samples.

Returns: original_variance (float) – The variance captured by the model.

orthonormalize_against_inplace(linear_model)¶

Enforces that this model’s components and those of another linear model are mutually orthonormal.

Note that the model passed in is guaranteed not to have its number of available components changed. This model, however, may lose some dimensionality due to reaching a degenerate state.

The removed components will always be trimmed from the end of components (i.e. the components which capture the least variance). If trimming is performed, n_components and n_available_components are altered; see trim_components() for details.

Parameters: linear_model (LinearModel) – A second linear model to orthonormalize this against.

orthonormalize_inplace()¶

Enforces that this model’s components are orthonormalized, i.e. component_vector(i).dot(component_vector(j)) is 1 if i == j and 0 otherwise.
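The orthonormality condition can be checked numerically. The sketch below orthonormalizes a random set of row vectors with a QR decomposition and verifies that their pairwise dot products form the identity matrix. Using QR here is an assumption made for illustration and may differ from what menpo does internally.

```python
import numpy as np

rng = np.random.default_rng(0)
components = rng.normal(size=(3, 6))   # three arbitrary row vectors

# QR of the transpose yields a (6, 3) matrix with orthonormal columns
# spanning the same subspace as the original rows.
Q, _ = np.linalg.qr(components.T)
orthonormal = Q.T

# Entry (i, j) is the dot product of component i with component j.
gram = orthonormal @ orthonormal.T
```

After this step `gram` equals the 3 x 3 identity matrix up to floating point error.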

plot_eigenvalues(figure_id=None, new_figure=False, render_lines=True, line_colour='b', line_style='-', line_width=2, render_markers=True, marker_style='o', marker_size=6, marker_face_colour='b', marker_edge_colour='k', marker_edge_width=1.0, render_axes=True, axes_font_name='sans-serif', axes_font_size=10, axes_font_style='normal', axes_font_weight='normal', figure_size=(10, 6), render_grid=True, grid_line_style='--', grid_line_width=0.5)¶

Plot of the eigenvalues.

Parameters

figure_id (object, optional) – The id of the figure to be used.
new_figure (bool, optional) – If True, a new figure is created.
render_lines (bool, optional) – If True, the line will be rendered.

line_colour (See Below, optional) –

The colour of the lines. Example options

{``r``, ``g``, ``b``, ``c``, ``m``, ``k``, ``w``}
or
``(3, )`` `ndarray`
or
`list` of length ``3``

line_style ({-, --, -., :}, optional) – The style of the lines.
line_width (float, optional) – The width of the lines.
render_markers (bool, optional) – If True, the markers will be rendered.

marker_style (See Below, optional) –

The style of the markers. Example options

{``.``, ``,``, ``o``, ``v``, ``^``, ``<``, ``>``, ``+``,
 ``x``, ``D``, ``d``, ``s``, ``p``, ``*``, ``h``, ``H``,
 ``1``, ``2``, ``3``, ``4``, ``8``}

marker_size (int, optional) – The size of the markers in points.

marker_face_colour (See Below, optional) –

The face (filling) colour of the markers. Example options

{``r``, ``g``, ``b``, ``c``, ``m``, ``k``, ``w``}
or
``(3, )`` `ndarray`
or
`list` of length ``3``

marker_edge_colour (See Below, optional) –

The edge colour of the markers. Example options

{``r``, ``g``, ``b``, ``c``, ``m``, ``k``, ``w``}
or
``(3, )`` `ndarray`
or
`list` of length ``3``

marker_edge_width (float, optional) – The width of the markers’ edge.
render_axes (bool, optional) – If True, the axes will be rendered.
axes_font_name (See Below, optional) –
The font of the axes. Example options
```
{``serif``, ``sans-serif``, ``cursive``, ``fantasy``,
 ``monospace``}
```
axes_font_size (int, optional) – The font size of the axes.
axes_font_style ({normal, italic, oblique}, optional) – The font style of the axes.

axes_font_weight (See Below, optional) –

The font weight of the axes. Example options

{``ultralight``, ``light``, ``normal``, ``regular``,
 ``book``, ``medium``, ``roman``, ``semibold``,
 ``demibold``, ``demi``, ``bold``, ``heavy``,
 ``extra bold``, ``black``}

figure_size ((float, float) or None, optional) – The size of the figure in inches.
render_grid (bool, optional) – If True, the grid will be rendered.
grid_line_style ({-, --, -., :}, optional) – The style of the grid lines.
grid_line_width (float, optional) – The width of the grid lines.

Returns

viewer (MatplotlibRenderer) – The viewer object.

plot_eigenvalues_cumulative_ratio(figure_id=None, new_figure=False, render_lines=True, line_colour='b', line_style='-', line_width=2, render_markers=True, marker_style='o', marker_size=6, marker_face_colour='b', marker_edge_colour='k', marker_edge_width=1.0, render_axes=True, axes_font_name='sans-serif', axes_font_size=10, axes_font_style='normal', axes_font_weight='normal', figure_size=(10, 6), render_grid=True, grid_line_style='--', grid_line_width=0.5)¶

Plot of the cumulative variance ratio captured by the eigenvalues.

Parameters

figure_id (object, optional) – The id of the figure to be used.
new_figure (bool, optional) – If True, a new figure is created.
render_lines (bool, optional) – If True, the line will be rendered.

line_colour (See Below, optional) –

The colour of the lines. Example options

{``r``, ``g``, ``b``, ``c``, ``m``, ``k``, ``w``}
or
``(3, )`` `ndarray`
or
`list` of length ``3``

line_style ({-, --, -., :}, optional) – The style of the lines.
line_width (float, optional) – The width of the lines.
render_markers (bool, optional) – If True, the markers will be rendered.

marker_style (See Below, optional) –

The style of the markers. Example options

{``.``, ``,``, ``o``, ``v``, ``^``, ``<``, ``>``, ``+``,
 ``x``, ``D``, ``d``, ``s``, ``p``, ``*``, ``h``, ``H``,
 ``1``, ``2``, ``3``, ``4``, ``8``}

marker_size (int, optional) – The size of the markers in points.

marker_face_colour (See Below, optional) –

The face (filling) colour of the markers. Example options

{``r``, ``g``, ``b``, ``c``, ``m``, ``k``, ``w``}
or
``(3, )`` `ndarray`
or
`list` of length ``3``

marker_edge_colour (See Below, optional) –

The edge colour of the markers. Example options

{``r``, ``g``, ``b``, ``c``, ``m``, ``k``, ``w``}
or
``(3, )`` `ndarray`
or
`list` of length ``3``

marker_edge_width (float, optional) – The width of the markers’ edge.
render_axes (bool, optional) – If True, the axes will be rendered.
axes_font_name (See Below, optional) –
The font of the axes. Example options
```
{``serif``, ``sans-serif``, ``cursive``, ``fantasy``,
 ``monospace``}
```
axes_font_size (int, optional) – The font size of the axes.
axes_font_style ({normal, italic, oblique}, optional) – The font style of the axes.

axes_font_weight (See Below, optional) –

The font weight of the axes. Example options

{``ultralight``, ``light``, ``normal``, ``regular``,
 ``book``, ``medium``, ``roman``, ``semibold``,
 ``demibold``, ``demi``, ``bold``, ``heavy``,
 ``extra bold``, ``black``}

figure_size ((float, float) or None, optional) – The size of the figure in inches.
render_grid (bool, optional) – If True, the grid will be rendered.
grid_line_style ({-, --, -., :}, optional) – The style of the grid lines.
grid_line_width (float, optional) – The width of the grid lines.

Returns

viewer (MatplotlibRenderer) – The viewer object.

plot_eigenvalues_ratio(figure_id=None, new_figure=False, render_lines=True, line_colour='b', line_style='-', line_width=2, render_markers=True, marker_style='o', marker_size=6, marker_face_colour='b', marker_edge_colour='k', marker_edge_width=1.0, render_axes=True, axes_font_name='sans-serif', axes_font_size=10, axes_font_style='normal', axes_font_weight='normal', figure_size=(10, 6), render_grid=True, grid_line_style='--', grid_line_width=0.5)¶

Plot of the variance ratio captured by the eigenvalues.

Parameters

figure_id (object, optional) – The id of the figure to be used.
new_figure (bool, optional) – If True, a new figure is created.
render_lines (bool, optional) – If True, the line will be rendered.

line_colour (See Below, optional) –

The colour of the lines. Example options

{``r``, ``g``, ``b``, ``c``, ``m``, ``k``, ``w``}
or
``(3, )`` `ndarray`
or
`list` of length ``3``

line_style ({-, --, -., :}, optional) – The style of the lines.
line_width (float, optional) – The width of the lines.
render_markers (bool, optional) – If True, the markers will be rendered.

marker_style (See Below, optional) –

The style of the markers. Example options

{``.``, ``,``, ``o``, ``v``, ``^``, ``<``, ``>``, ``+``,
 ``x``, ``D``, ``d``, ``s``, ``p``, ``*``, ``h``, ``H``,
 ``1``, ``2``, ``3``, ``4``, ``8``}

marker_size (int, optional) – The size of the markers in points.

marker_face_colour (See Below, optional) –

The face (filling) colour of the markers. Example options

{``r``, ``g``, ``b``, ``c``, ``m``, ``k``, ``w``}
or
``(3, )`` `ndarray`
or
`list` of length ``3``

marker_edge_colour (See Below, optional) –

The edge colour of the markers. Example options

{``r``, ``g``, ``b``, ``c``, ``m``, ``k``, ``w``}
or
``(3, )`` `ndarray`
or
`list` of length ``3``

marker_edge_width (float, optional) – The width of the markers’ edge.
render_axes (bool, optional) – If True, the axes will be rendered.
axes_font_name (See Below, optional) –
The font of the axes. Example options
```
{``serif``, ``sans-serif``, ``cursive``, ``fantasy``,
 ``monospace``}
```
axes_font_size (int, optional) – The font size of the axes.
axes_font_style ({normal, italic, oblique}, optional) – The font style of the axes.

axes_font_weight (See Below, optional) –

The font weight of the axes. Example options

{``ultralight``, ``light``, ``normal``, ``regular``,
 ``book``, ``medium``, ``roman``, ``semibold``,
 ``demibold``, ``demi``, ``bold``, ``heavy``,
 ``extra bold``, ``black``}

figure_size ((float, float) or None, optional) – The size of the figure in inches.
render_grid (bool, optional) – If True, the grid will be rendered.
grid_line_style ({-, --, -., :}, optional) – The style of the grid lines.
grid_line_width (float, optional) – The width of the grid lines.

Returns

viewer (MatplotlibRenderer) – The viewer object.

project(instance)¶

Projects the instance onto the model, retrieving the optimal linear weightings.

Parameters: instance (Vectorizable) – A novel instance.
Returns: projected ((n_components,) ndarray) – A vector of optimal linear weightings.
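For orthonormal components, the optimal linear weightings are the dot products of the mean-centred vector with each component. The NumPy sketch below works on plain vectors with made-up values; subtracting the mean before projecting is assumed here because the model is described as a mean linear model.

```python
import numpy as np

components = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])   # orthonormal rows
mean = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 6.0, 8.0])              # vectorized novel instance

# One weight per component: how far x lies from the mean along it.
weights = components @ (x - mean)
```

With these values `weights` is `[3, 4]`.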

project_out(instance)¶

Returns a version of instance where all the bases of the model have been projected out.

Parameters: instance (Vectorizable) – A novel instance of Vectorizable.
Returns: projected_out (self.instance_class) – A copy of instance, with all bases of the model projected out.

project_out_vector(instance_vector)[source]¶

Returns a version of instance_vector where all the bases of the model have been projected out.

Parameters: instance_vector ((n_features,) ndarray) – A novel vector.
Returns: projected_out ((n_features,) ndarray) – A copy of instance_vector, with all bases of the model projected out.

project_out_vectors(vectors)¶

Returns a version of vectors where all the bases of the model have been projected out.

Parameters: vectors ((n_vectors, n_features) ndarray) – A matrix of novel vectors.
Returns: projected_out ((n_vectors, n_features) ndarray) – A copy of vectors with all bases of the model projected out.
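Projecting out a basis means removing the part of each vector that lies in the span of the components. The sketch below does this with NumPy for orthonormal components. The helper name is hypothetical and any handling of the model mean is deliberately omitted, so this shows the linear algebra only.

```python
import numpy as np

def project_out_vectors_sketch(vectors, components):
    """Remove from each row of `vectors` its part in span(components)."""
    weights = vectors @ components.T        # (n_vectors, n_components)
    return vectors - weights @ components   # residual orthogonal to the basis

components = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
vectors = np.array([[4.0, 6.0, 8.0],
                    [1.0, 0.0, -2.0]])
residual = project_out_vectors_sketch(vectors, components)
```

Only the third coordinate survives, since the first two are fully explained by the components; projecting `residual` onto the components again gives zeros.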

project_vector(instance_vector)[source]¶

Projects the instance_vector onto the model, retrieving the optimal linear weightings.

Parameters: instance_vector ((n_features,) ndarray) – A novel vector.
Returns: projected ((n_components,) ndarray) – A vector of optimal linear weightings.

project_vectors(vectors)¶

Projects each of the vectors onto the model, retrieving the optimal linear reconstruction weights for each instance.

Parameters: vectors ((n_samples, n_features) ndarray) – Array of vectorized novel instances.
Returns: projected ((n_samples, n_components) ndarray) – The matrix of optimal linear weights.

project_whitened(instance)[source]¶

Projects the instance onto the whitened components, retrieving the whitened linear weightings.

Parameters: instance (Vectorizable) – A novel instance.
Returns: projected ((n_components,) ndarray) – A vector of whitened linear weightings.

project_whitened_vector(vector_instance)[source]¶

Projects the vector_instance onto the whitened components, retrieving the whitened linear weightings.

Parameters: vector_instance ((n_features,) ndarray) – A novel vector.
Returns: projected ((n_components,) ndarray) – A vector of whitened linear weightings.

reconstruct(instance)¶

Projects an instance onto the linear space and rebuilds it from the weights found.

Syntactic sugar for:

instance(project(instance))

but faster, as it avoids the conversion that takes place each time.

Parameters: instance (Vectorizable) – A novel instance of Vectorizable.
Returns: reconstructed (self.instance_class) – The reconstructed object.

reconstruct_vector(instance_vector)[source]¶

Projects an instance vector onto the linear space and rebuilds it from the weights found.

Syntactic sugar for:

instance_vector(project_vector(instance_vector))

but faster, as it avoids the conversion that takes place each time.

Parameters: instance_vector ((n_features,) ndarray) – A novel vector.
Returns: reconstructed ((n_features,) ndarray) – The reconstructed vector.

reconstruct_vectors(vectors)¶

Projects the vectors onto the linear space and rebuilds vectors from the weights found.

Parameters: vectors ((n_vectors, n_features) ndarray) – A set of vectors to project.
Returns: reconstructed ((n_vectors, n_features) ndarray) – The reconstructed vectors.
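Reconstruction chains projection and instance building: centre, project onto the components, then map the weights back and add the mean. The NumPy sketch below assumes orthonormal components and uses made-up values; the helper name is hypothetical.

```python
import numpy as np

def reconstruct_vectors_sketch(vectors, components, mean):
    """Project rows of `vectors` onto the model and rebuild them."""
    weights = (vectors - mean) @ components.T   # projection step
    return mean + weights @ components          # instance step

components = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
mean = np.array([1.0, 2.0, 3.0])
vectors = np.array([[4.0, 6.0, 8.0]])

reconstructed = reconstruct_vectors_sketch(vectors, components, mean)
```

The result is `[[4, 6, 3]]`: the first two coordinates are reproduced exactly, while the third falls back to the mean because no component covers it.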

trim_components(n_components=None)¶

Permanently trims the components down to a given number. The number of active components will be automatically reset to this value.

This will reduce self.n_components down to n_components (if None, self.n_active_components will be used), freeing up memory in the process.

Once the model is trimmed, the trimmed components cannot be recovered.

Parameters: n_components (int >= 1 or float > 0.0 or None, optional) – The number of components that are kept or else the amount (ratio) of variance that is kept. If None, self.n_active_components is used.

Notes

In case n_components is greater than the total number of components or greater than the amount of variance currently kept, this method does not perform any action.
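The two interpretations of n_components (an integer count or a float variance ratio) can be sketched as follows. The helper `n_components_to_keep` is hypothetical, and choosing the smallest count whose cumulative variance ratio reaches the requested value is an assumption about how a ratio might be resolved, not a statement of menpo's exact rule.

```python
import numpy as np

def n_components_to_keep(n_components, eigenvalues):
    """Resolve an int count or float variance ratio to a component count."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    total = eigenvalues.shape[0]
    if isinstance(n_components, float):
        # ratio: fewest components whose cumulative ratio reaches it
        cumulative_ratio = np.cumsum(eigenvalues) / eigenvalues.sum()
        k = int(np.searchsorted(cumulative_ratio, n_components)) + 1
    else:
        k = int(n_components)
    # asking for more than is available changes nothing
    return min(k, total)

eigenvalues = np.array([4.0, 3.0, 2.0, 1.0])
by_count = n_components_to_keep(2, eigenvalues)
by_ratio = n_components_to_keep(0.8, eigenvalues)
too_many = n_components_to_keep(10, eigenvalues)
```

With cumulative ratios of 0.4, 0.7, 0.9 and 1.0, a requested ratio of 0.8 resolves to three components, while an integer request larger than the available count leaves all four in place.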

variance()¶

Returns the total amount of variance retained by the active components.

Returns: variance (float) – Total variance captured by the active components.

variance_ratio()¶

Returns the ratio between the amount of variance retained by the active components and the total amount of variance present in the original samples.

Returns: variance_ratio (float) – Ratio of active components variance and total variance present in original samples.

whitened_components()¶

Returns the active components of the model, whitened.

Returns: whitened_components ((n_active_components, n_features) ndarray) – The whitened components.
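Whitening rescales each component so that projections onto it have unit variance. The sketch below uses the textbook form, dividing each component by the square root of its eigenvalue, on synthetic data. This is an assumption for illustration; menpo's exact scaling (for example, whether a noise term is included) may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3)) * np.array([5.0, 2.0, 0.5])
mean = X.mean(axis=0)

# PCA by eigendecomposition of the sample covariance matrix.
covariance = np.cov(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
components = eigenvectors[:, order].T

# Scale every row by 1 / sqrt(eigenvalue) to whiten it.
whitened = components / np.sqrt(eigenvalues)[:, None]

# Projections onto whitened components are uncorrelated, unit variance.
Z = (X - mean) @ whitened.T
whitened_covariance = np.cov(Z, rowvar=False)
```

The covariance of the whitened projections is the identity matrix up to floating point error.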

property components¶

Returns the active components of the model.

Type: (n_active_components, n_features) ndarray

property eigenvalues¶

Returns the eigenvalues associated with the active components of the model, i.e. the amount of variance captured by each active component, sorted from largest to smallest.

Type: (n_active_components,) ndarray

property mean_vector¶

Return the mean of the model as a 1D vector.

Type: ndarray

property n_active_components¶

The number of components currently in use on this model.

Type: int

property n_components¶

The number of bases of the model.

Type: int

property n_features¶

The number of elements in each linear component.

Type: int