PCAVectorModel¶
-
class
menpo.model.
PCAVectorModel
(samples, centre=True, n_samples=None, max_n_components=None, inplace=True)[source]¶ Bases:
MeanLinearVectorModel
A MeanLinearModel where components are Principal Components. Principal Component Analysis (PCA) is performed by eigenvalue decomposition of the data’s scatter matrix. For details of the implementation of PCA, see pca.
- Parameters
samples (ndarray or list or iterable of ndarray) – List or iterable of numpy arrays to build the model from, or an existing data matrix.
centre (bool, optional) – When True (default), PCA is performed after mean centering the data. If False, the data is assumed to be centred, and the mean will be 0.
n_samples (int, optional) – If provided then samples must be an iterator that yields n_samples samples. If not provided then samples has to be a list (so we know how large the data matrix needs to be).
max_n_components (int, optional) – The maximum number of components to keep in the model. Any components above and beyond this one are discarded.
inplace (bool, optional) – If True, the data matrix is modified in place. Otherwise, the data matrix is copied.
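As a rough illustration of what eigenvalue decomposition of the scatter matrix involves, the NumPy sketch below derives components, eigenvalues and a mean from a data matrix. It is not menpo's implementation: the helper name `pca_by_eig`, the choice of normaliser (`n_samples - 1` when centring, `n_samples` otherwise) and the use of `np.linalg.eigh` are assumptions made for this example.

```python
import numpy as np

def pca_by_eig(samples, centre=True, max_n_components=None):
    """Hypothetical helper: PCA via eigendecomposition of the scatter matrix."""
    X = np.asarray(samples, dtype=float)
    n_samples, n_features = X.shape
    mean = X.mean(axis=0) if centre else np.zeros(n_features)
    Xc = X - mean
    # Assumed normalisation: n_samples - 1 when centring, n_samples otherwise.
    denom = n_samples - 1 if centre else n_samples
    C = Xc.T @ Xc / denom
    eigenvalues, eigenvectors = np.linalg.eigh(C)  # returned in ascending order
    order = np.argsort(eigenvalues)[::-1]          # largest variance first
    eigenvalues = eigenvalues[order]
    components = eigenvectors[:, order].T          # (n_components, n_features)
    if max_n_components is not None:
        eigenvalues = eigenvalues[:max_n_components]
        components = components[:max_n_components]
    return components, eigenvalues, mean

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4))
components, eigenvalues, mean = pca_by_eig(data, max_n_components=3)
```

The rows of `components` are orthonormal and the eigenvalues are sorted from largest to smallest, which matches the conventions described for the `components` and `eigenvalues` properties.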
-
component
(index, with_mean=True, scale=1.0)[source]¶ A particular component of the model, in vectorized form.
- Parameters
index (int) – The component that is to be returned
with_mean (bool, optional) – If True, the component will be blended with the mean vector before being returned. If not, the component is returned on its own.
scale (float, optional) – A scale factor that should be applied to the component. Only valid in the case where with_mean is True. The scale is applied in units of standard deviations (so a scale of 1.0 with with_mean enabled visualizes the mean plus 1 standard deviation of the component in question).
- Returns
component_vector (
(n_features,)
ndarray) – The component vector of the given index.
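The interaction between with_mean and scale can be sketched in plain NumPy. The function below is a hypothetical stand-in, not menpo code; it assumes that one standard deviation along a component equals the square root of its eigenvalue.

```python
import numpy as np

def component_vector(components, eigenvalues, mean, index,
                     with_mean=True, scale=1.0):
    """Hypothetical stand-in for the documented `component` behaviour."""
    direction = components[index]
    if not with_mean:
        # The bare component; `scale` is ignored in this case.
        return direction.copy()
    # Assumption: one standard deviation along a component is sqrt(eigenvalue).
    return mean + scale * np.sqrt(eigenvalues[index]) * direction

components = np.eye(2)               # two orthonormal toy components
eigenvalues = np.array([4.0, 1.0])   # variances, so std devs are 2 and 1
mean = np.array([10.0, 20.0])

plus_one_std = component_vector(components, eigenvalues, mean, index=0)
minus_two_std = component_vector(components, eigenvalues, mean, index=0, scale=-2.0)
bare = component_vector(components, eigenvalues, mean, index=1, with_mean=False)
```

Here `plus_one_std` moves the mean by two units along the first axis, because the first eigenvalue is 4.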
-
copy
()¶ Generate an efficient copy of this object.
Note that NumPy arrays and other Copyable objects on self will be deeply copied. Dictionaries and sets will be shallow copied, and everything else will be assigned (no copy will be made).
Classes that store state other than NumPy arrays and immutable types should override this method to ensure all state is copied.
- Returns
type(self)
– A copy of this object
-
eigenvalues_cumulative_ratio
()[source]¶ Returns the cumulative ratio between the variance captured by the active components and the total amount of variance present on the original samples.
- Returns
eigenvalues_cumulative_ratio (
(n_active_components,)
ndarray) – Array of cumulative eigenvalues.
-
eigenvalues_ratio
()[source]¶ Returns the ratio between the variance captured by each active component and the total amount of variance present on the original samples.
- Returns
eigenvalues_ratio (
(n_active_components,)
ndarray) – The active eigenvalues array scaled by the original variance.
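A small worked example may help relate eigenvalues_ratio to eigenvalues_cumulative_ratio. The numbers are made up, and the sketch assumes that the original variance is the sum of all eigenvalues, active or not.

```python
import numpy as np

all_eigenvalues = np.array([4.0, 3.0, 2.0, 1.0])  # toy values, largest first
n_active = 2                                      # only the first two are active

# Assumption: total variance is the sum over every eigenvalue.
original_variance = all_eigenvalues.sum()
eigenvalues_ratio = all_eigenvalues[:n_active] / original_variance
eigenvalues_cumulative_ratio = np.cumsum(eigenvalues_ratio)
```

With these values the two active components capture 40% and 30% of the variance, so the cumulative ratio ends at 70%.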
-
increment
(data, n_samples=None, forgetting_factor=1.0, verbose=False)[source]¶ Update the eigenvectors, eigenvalues and mean vector of this model by performing incremental PCA on the given samples.
- Parameters
samples (list of Vectorizable) – List of new samples to update the model from.
n_samples (int, optional) – If provided then samples must be an iterator that yields n_samples samples. If not provided then samples has to be a list (so we know how large the data matrix needs to be).
forgetting_factor ([0.0, 1.0] float, optional) – Forgetting factor that weights the relative contribution of new samples vs old samples. If 1.0, all samples are weighted equally and, hence, the result is exactly the same as performing batch PCA on the concatenated list of old and new samples. If <1.0, more emphasis is put on the new samples. See [1] for details.
References
- 1
David Ross, Jongwoo Lim, Ruei-Sung Lin, Ming-Hsuan Yang. “Incremental Learning for Robust Visual Tracking”. IJCV, 2007.
-
classmethod
init_from_components
(components, eigenvalues, mean, n_samples, centred, max_n_components=None)[source]¶ Build the Principal Component Analysis (PCA) using the provided components (eigenvectors) and eigenvalues.
- Parameters
components ((n_components, n_features) ndarray) – The eigenvectors to be used.
eigenvalues ((n_components, ) ndarray) – The corresponding eigenvalues.
mean ((n_features, ) ndarray) – The mean vector.
n_samples (int) – The number of samples used to generate the eigenvectors.
centred (bool) – When True we assume that the data were centered before computing the eigenvectors.
max_n_components (int, optional) – The maximum number of components to keep in the model. Any components above and beyond this one are discarded.
-
classmethod
init_from_covariance_matrix
(C, mean, n_samples, centred=True, is_inverse=False, max_n_components=None)[source]¶ Build the Principal Component Analysis (PCA) by eigenvalue decomposition of the provided covariance/scatter matrix. For details of the implementation of PCA, see
pcacov
.
- Parameters
C ((n_features, n_features) ndarray or scipy.sparse) – The Covariance/Scatter matrix. If it is a precision matrix (inverse covariance), then set is_inverse=True.
mean ((n_features, ) ndarray) – The mean vector.
n_samples (int) – The number of samples used to generate the covariance matrix.
centred (bool, optional) – When True we assume that the data were centered before computing the covariance matrix.
is_inverse (bool, optional) – If True, then it is assumed that C is a precision matrix (inverse covariance). Thus, the eigenvalues will be inverted. If False, then it is assumed that C is a covariance matrix.
max_n_components (int, optional) – The maximum number of components to keep in the model. Any components above and beyond this one are discarded.
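The effect of is_inverse can be sketched as follows. This is an illustration under stated assumptions rather than the `pcacov` routine itself: `pca_from_covariance` is a hypothetical helper, and it relies on the fact that a precision matrix has the same eigenvectors as the covariance matrix but reciprocal eigenvalues.

```python
import numpy as np

def pca_from_covariance(C, is_inverse=False, max_n_components=None):
    """Hypothetical helper: eigendecompose a covariance or precision matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(np.asarray(C, dtype=float))
    if is_inverse:
        # Same eigenvectors as the covariance matrix, reciprocal eigenvalues.
        eigenvalues = 1.0 / eigenvalues
    order = np.argsort(eigenvalues)[::-1]   # largest variance first
    eigenvalues = eigenvalues[order]
    components = eigenvectors[:, order].T   # (n_components, n_features)
    if max_n_components is not None:
        eigenvalues = eigenvalues[:max_n_components]
        components = components[:max_n_components]
    return components, eigenvalues

# Toy covariance with variances 4 and 1 along rotated axes.
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
covariance = R @ np.diag([4.0, 1.0]) @ R.T
precision = np.linalg.inv(covariance)

_, from_cov = pca_from_covariance(covariance)
_, from_prec = pca_from_covariance(precision, is_inverse=True)
```

Both calls recover the same eigenvalues, which is the behaviour the is_inverse flag is described as providing.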
-
instance
(weights, normalized_weights=False)[source]¶ Creates a new vector instance of the model by weighting together the components.
- Parameters
weights ((n_weights,) ndarray or list) – The weightings for the first n_weights components that should be used. weights[j] is the linear contribution of the j’th principal component to the instance vector.
normalized_weights (bool, optional) – If True, the weights are assumed to be normalized w.r.t the eigenvalues. This can make it easier to create unique instances by making the weights more interpretable.
- Returns
vector ((n_features,) ndarray) – The instance vector for the weighting provided.
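The weighting described above amounts to adding a linear combination of the leading components to the mean. The sketch below is a hypothetical stand-in written in NumPy; in particular, treating normalized weights as multiples of the square root of each eigenvalue is an assumption of this example.

```python
import numpy as np

def instance(components, eigenvalues, mean, weights, normalized_weights=False):
    """Hypothetical stand-in: mean plus a weighted sum of leading components."""
    weights = np.asarray(weights, dtype=float)
    n_weights = weights.shape[0]
    if normalized_weights:
        # Assumption: a normalized weight counts standard deviations.
        weights = weights * np.sqrt(eigenvalues[:n_weights])
    return mean + weights @ components[:n_weights]

components = np.eye(3)[:2]            # two toy components in three dimensions
eigenvalues = np.array([9.0, 4.0])    # std devs are 3 and 2
mean = np.array([1.0, 1.0, 1.0])

raw = instance(components, eigenvalues, mean, [3.0, 2.0])
normalized = instance(components, eigenvalues, mean, [1.0, 1.0],
                      normalized_weights=True)
```

Under that assumption, normalized weights of `[1.0, 1.0]` produce the same vector as raw weights of `[3.0, 2.0]`.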
-
instance_vectors
(weights, normalized_weights=False)[source]¶ Creates new vectorized instances of the model using the first components in a particular weighting.
- Parameters
weights ((n_vectors, n_weights) ndarray or list of lists) – The weightings for the first n_weights components that should be used per instance that is to be produced. weights[i, j] is the linear contribution of the j’th principal component to the i’th instance vector produced. Note that if n_weights < n_components, only the first n_weights components are used in the reconstruction (i.e. unspecified weights are implicitly 0).
normalized_weights (bool, optional) – If True, the weights are assumed to be normalized w.r.t the eigenvalues. This can make it easier to create unique instances by making the weights more interpretable.
- Returns
vectors ((n_vectors, n_features) ndarray) – The instance vectors for the weighting provided.
- Raises
ValueError – If n_weights > n_components
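The implicit zero weights and the ValueError condition can be illustrated with a batch version of the same idea. The function below is a hypothetical NumPy sketch; the error message and the explicit zero padding are choices made for this example.

```python
import numpy as np

def instance_vectors(components, mean, weights):
    """Hypothetical batch version: one output row per row of `weights`."""
    weights = np.atleast_2d(np.asarray(weights, dtype=float))
    n_vectors, n_weights = weights.shape
    n_components = components.shape[0]
    if n_weights > n_components:
        raise ValueError(
            f"Got {n_weights} weights but only {n_components} components")
    # Pad with zeros so that unspecified weights contribute nothing.
    full = np.zeros((n_vectors, n_components))
    full[:, :n_weights] = weights
    return mean + full @ components

components = np.eye(3)
mean = np.zeros(3)
vectors = instance_vectors(components, mean, [[1.0, 2.0], [0.5, 0.0]])
```

Passing four weights per row to this three-component toy model would raise ValueError.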
-
inverse_noise_variance
()[source]¶ Returns the inverse of the noise variance.
- Returns
inverse_noise_variance (float) – Inverse of the noise variance.
- Raises
ValueError – If
noise_variance() == 0
-
mean
()¶ Return the mean of the model.
- Type
ndarray
-
noise_variance
()[source]¶ Returns the average variance captured by the inactive components, i.e. the sample noise assumed in a Probabilistic PCA formulation.
If all components are active, then noise_variance == 0.0.
- Returns
noise_variance (float) – The mean variance of the inactive components.
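The relationship between noise_variance and inverse_noise_variance can be sketched with toy eigenvalues. Both functions below are hypothetical helpers that follow the descriptions given here; they are not menpo's code.

```python
import numpy as np

def noise_variance(all_eigenvalues, n_active):
    """Hypothetical helper: mean eigenvalue of the inactive components."""
    inactive = np.asarray(all_eigenvalues, dtype=float)[n_active:]
    return float(inactive.mean()) if inactive.size else 0.0

def inverse_noise_variance(all_eigenvalues, n_active):
    """Hypothetical helper: reciprocal of the noise variance."""
    nv = noise_variance(all_eigenvalues, n_active)
    if nv == 0.0:
        raise ValueError("noise variance is 0; its inverse is undefined")
    return 1.0 / nv

eigs = [4.0, 3.0, 2.0, 1.0]
partial = noise_variance(eigs, n_active=2)   # mean of [2.0, 1.0]
full = noise_variance(eigs, n_active=4)      # nothing is inactive
inverse = inverse_noise_variance(eigs, n_active=2)
```

When every component is active the noise variance is zero, which is why the inverse raises ValueError in that case.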
-
noise_variance_ratio
()[source]¶ Returns the ratio between the noise variance and the total amount of variance present on the original samples.
- Returns
noise_variance_ratio (float) – The ratio between the noise variance and the variance present in the original samples.
-
original_variance
()[source]¶ Returns the total amount of variance captured by the original model, i.e. the amount of variance present on the original samples.
- Returns
original_variance (float) – The variance captured by the model.
-
orthonormalize_against_inplace
(linear_model)[source]¶ Enforces that the union of this model’s components and another are both mutually orthonormal.
Note that the model passed in is guaranteed not to have its number of available components changed. This model, however, may lose some dimensionality due to reaching a degenerate state.
The removed components will always be trimmed from the end of components (i.e. the components which capture the least variance). If trimming is performed, n_components and n_available_components would be altered; see trim_components() for details.
- Parameters
linear_model (LinearModel) – A second linear model to orthonormalize this against.
-
orthonormalize_inplace
()¶ Enforces that this model’s components are orthonormalized, s.t.
component_vector(i).dot(component_vector(j)) = dirac_delta
.
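The orthonormality condition says that the dot product of two components is 1 when they are the same component and 0 otherwise (a Kronecker delta). One common way to enforce this is a QR factorisation, sketched below; the choice of QR and the helper name `orthonormalize` are assumptions of this example, and the source does not say which procedure menpo uses.

```python
import numpy as np

def orthonormalize(components):
    """Hypothetical helper: orthonormalize the rows of `components` via QR."""
    # QR works on columns, so factorise the transpose and transpose back.
    Q, _ = np.linalg.qr(np.asarray(components, dtype=float).T)
    return Q.T

raw = np.array([[1.0, 1.0, 0.0],
                [1.0, 0.0, 1.0]])
ortho = orthonormalize(raw)
gram = ortho @ ortho.T   # entry (i, j) is the dot product of rows i and j
```

After the call, `gram` is the identity matrix up to floating-point error.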
-
plot_eigenvalues
(figure_id=None, new_figure=False, render_lines=True, line_colour='b', line_style='-', line_width=2, render_markers=True, marker_style='o', marker_size=6, marker_face_colour='b', marker_edge_colour='k', marker_edge_width=1.0, render_axes=True, axes_font_name='sans-serif', axes_font_size=10, axes_font_style='normal', axes_font_weight='normal', figure_size=(10, 6), render_grid=True, grid_line_style='--', grid_line_width=0.5)[source]¶ Plot of the eigenvalues.
- Parameters
figure_id (object, optional) – The id of the figure to be used.
new_figure (bool, optional) – If True, a new figure is created.
render_lines (bool, optional) – If True, the line will be rendered.
line_colour (See Below, optional) – The colour of the lines. Example options
{'r', 'g', 'b', 'c', 'm', 'k', 'w'} or (3, ) ndarray or list of length 3
line_style ({'-', '--', '-.', ':'}, optional) – The style of the lines.
line_width (float, optional) – The width of the lines.
render_markers (bool, optional) – If True, the markers will be rendered.
marker_style (See Below, optional) – The style of the markers. Example options
{'.', ',', 'o', 'v', '^', '<', '>', '+', 'x', 'D', 'd', 's', 'p', '*', 'h', 'H', '1', '2', '3', '4', '8'}
marker_size (int, optional) – The size of the markers in points.
marker_face_colour (See Below, optional) – The face (filling) colour of the markers. Example options
{'r', 'g', 'b', 'c', 'm', 'k', 'w'} or (3, ) ndarray or list of length 3
marker_edge_colour (See Below, optional) – The edge colour of the markers. Example options
{'r', 'g', 'b', 'c', 'm', 'k', 'w'} or (3, ) ndarray or list of length 3
marker_edge_width (float, optional) – The width of the markers’ edge.
render_axes (bool, optional) – If True, the axes will be rendered.
axes_font_name (See Below, optional) – The font of the axes. Example options
{'serif', 'sans-serif', 'cursive', 'fantasy', 'monospace'}
axes_font_size (int, optional) – The font size of the axes.
axes_font_style ({'normal', 'italic', 'oblique'}, optional) – The font style of the axes.
axes_font_weight (See Below, optional) – The font weight of the axes. Example options
{'ultralight', 'light', 'normal', 'regular', 'book', 'medium', 'roman', 'semibold', 'demibold', 'demi', 'bold', 'heavy', 'extra bold', 'black'}
figure_size ((float, float) or None, optional) – The size of the figure in inches.
render_grid (bool, optional) – If True, the grid will be rendered.
grid_line_style ({'-', '--', '-.', ':'}, optional) – The style of the grid lines.
grid_line_width (float, optional) – The width of the grid lines.
- Returns
viewer (MatplotlibRenderer) – The viewer object.
-
plot_eigenvalues_cumulative_ratio
(figure_id=None, new_figure=False, render_lines=True, line_colour='b', line_style='-', line_width=2, render_markers=True, marker_style='o', marker_size=6, marker_face_colour='b', marker_edge_colour='k', marker_edge_width=1.0, render_axes=True, axes_font_name='sans-serif', axes_font_size=10, axes_font_style='normal', axes_font_weight='normal', figure_size=(10, 6), render_grid=True, grid_line_style='--', grid_line_width=0.5)[source]¶ Plot of the cumulative variance ratio captured by the eigenvalues.
- Parameters
figure_id (object, optional) – The id of the figure to be used.
new_figure (bool, optional) – If True, a new figure is created.
render_lines (bool, optional) – If True, the line will be rendered.
line_colour (See Below, optional) – The colour of the lines. Example options
{'r', 'g', 'b', 'c', 'm', 'k', 'w'} or (3, ) ndarray or list of length 3
line_style ({'-', '--', '-.', ':'}, optional) – The style of the lines.
line_width (float, optional) – The width of the lines.
render_markers (bool, optional) – If True, the markers will be rendered.
marker_style (See Below, optional) – The style of the markers. Example options
{'.', ',', 'o', 'v', '^', '<', '>', '+', 'x', 'D', 'd', 's', 'p', '*', 'h', 'H', '1', '2', '3', '4', '8'}
marker_size (int, optional) – The size of the markers in points.
marker_face_colour (See Below, optional) – The face (filling) colour of the markers. Example options
{'r', 'g', 'b', 'c', 'm', 'k', 'w'} or (3, ) ndarray or list of length 3
marker_edge_colour (See Below, optional) – The edge colour of the markers. Example options
{'r', 'g', 'b', 'c', 'm', 'k', 'w'} or (3, ) ndarray or list of length 3
marker_edge_width (float, optional) – The width of the markers’ edge.
render_axes (bool, optional) – If True, the axes will be rendered.
axes_font_name (See Below, optional) – The font of the axes. Example options
{'serif', 'sans-serif', 'cursive', 'fantasy', 'monospace'}
axes_font_size (int, optional) – The font size of the axes.
axes_font_style ({'normal', 'italic', 'oblique'}, optional) – The font style of the axes.
axes_font_weight (See Below, optional) – The font weight of the axes. Example options
{'ultralight', 'light', 'normal', 'regular', 'book', 'medium', 'roman', 'semibold', 'demibold', 'demi', 'bold', 'heavy', 'extra bold', 'black'}
figure_size ((float, float) or None, optional) – The size of the figure in inches.
render_grid (bool, optional) – If True, the grid will be rendered.
grid_line_style ({'-', '--', '-.', ':'}, optional) – The style of the grid lines.
grid_line_width (float, optional) – The width of the grid lines.
- Returns
viewer (MatplotlibRenderer) – The viewer object.
-
plot_eigenvalues_ratio
(figure_id=None, new_figure=False, render_lines=True, line_colour='b', line_style='-', line_width=2, render_markers=True, marker_style='o', marker_size=6, marker_face_colour='b', marker_edge_colour='k', marker_edge_width=1.0, render_axes=True, axes_font_name='sans-serif', axes_font_size=10, axes_font_style='normal', axes_font_weight='normal', figure_size=(10, 6), render_grid=True, grid_line_style='--', grid_line_width=0.5)[source]¶ Plot of the variance ratio captured by the eigenvalues.
- Parameters
figure_id (object, optional) – The id of the figure to be used.
new_figure (bool, optional) – If True, a new figure is created.
render_lines (bool, optional) – If True, the line will be rendered.
line_colour (See Below, optional) – The colour of the lines. Example options
{'r', 'g', 'b', 'c', 'm', 'k', 'w'} or (3, ) ndarray or list of length 3
line_style ({'-', '--', '-.', ':'}, optional) – The style of the lines.
line_width (float, optional) – The width of the lines.
render_markers (bool, optional) – If True, the markers will be rendered.
marker_style (See Below, optional) – The style of the markers. Example options
{'.', ',', 'o', 'v', '^', '<', '>', '+', 'x', 'D', 'd', 's', 'p', '*', 'h', 'H', '1', '2', '3', '4', '8'}
marker_size (int, optional) – The size of the markers in points.
marker_face_colour (See Below, optional) – The face (filling) colour of the markers. Example options
{'r', 'g', 'b', 'c', 'm', 'k', 'w'} or (3, ) ndarray or list of length 3
marker_edge_colour (See Below, optional) – The edge colour of the markers. Example options
{'r', 'g', 'b', 'c', 'm', 'k', 'w'} or (3, ) ndarray or list of length 3
marker_edge_width (float, optional) – The width of the markers’ edge.
render_axes (bool, optional) – If True, the axes will be rendered.
axes_font_name (See Below, optional) – The font of the axes. Example options
{'serif', 'sans-serif', 'cursive', 'fantasy', 'monospace'}
axes_font_size (int, optional) – The font size of the axes.
axes_font_style ({'normal', 'italic', 'oblique'}, optional) – The font style of the axes.
axes_font_weight (See Below, optional) – The font weight of the axes. Example options
{'ultralight', 'light', 'normal', 'regular', 'book', 'medium', 'roman', 'semibold', 'demibold', 'demi', 'bold', 'heavy', 'extra bold', 'black'}
figure_size ((float, float) or None, optional) – The size of the figure in inches.
render_grid (bool, optional) – If True, the grid will be rendered.
grid_line_style ({'-', '--', '-.', ':'}, optional) – The style of the grid lines.
grid_line_width (float, optional) – The width of the grid lines.
- Returns
viewer (MatplotlibRenderer) – The viewer object.
-
project
(vector)¶ Projects the vector onto the model, retrieving the optimal linear reconstruction weights.
- Parameters
vector ((n_features,) ndarray) – A vectorized novel instance.
- Returns
weights ((n_components,) ndarray) – A vector of optimal linear weights.
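For orthonormal components, the optimal (least-squares) weights are simply dot products between each component and the mean-centred vector. The sketch below illustrates this with a hypothetical `project` function in NumPy; subtracting the mean first is an assumption based on this being a mean-centred linear model.

```python
import numpy as np

def project(components, mean, vector):
    """Hypothetical stand-in: weights of `vector` in the component basis."""
    # Assumption: the mean is removed before taking dot products.
    return components @ (np.asarray(vector, dtype=float) - mean)

components = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
mean = np.array([1.0, 1.0, 1.0])

weights = project(components, mean, [3.0, 0.0, 5.0])
rebuilt = mean + weights @ components   # rebuild from the weights alone
```

The rebuilt vector matches the input in the first two coordinates only, because the third direction is not spanned by the toy components.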
-
project_out
(vector)¶ Returns a version of vector where all the bases of the model have been projected out.
- Parameters
vector ((n_features,) ndarray) – A novel vector.
- Returns
projected_out ((n_features,) ndarray) – A copy of vector with all bases of the model projected out.
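Projecting out the bases leaves only the part of a vector that is orthogonal to every component. The NumPy sketch below shows the idea with a hypothetical `project_out` function; it assumes orthonormal components and deliberately ignores the model mean, since the text above does not say how the mean is handled.

```python
import numpy as np

def project_out(components, vector):
    """Hypothetical stand-in: remove the part of `vector` spanned by `components`."""
    vector = np.asarray(vector, dtype=float)
    weights = components @ vector           # contribution of each component
    return vector - weights @ components    # subtract the spanned part

components = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
residual = project_out(components, [3.0, 4.0, 5.0])
```

The residual has zero dot product with every component, so projecting it onto the model again would give all-zero weights.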
-
project_out_vectors
(vectors)¶ Returns a version of vectors where all the bases of the model have been projected out.
- Parameters
vectors ((n_vectors, n_features) ndarray) – A matrix of novel vectors.
- Returns
projected_out ((n_vectors, n_features) ndarray) – A copy of vectors with all bases of the model projected out.
-
project_vectors
(vectors)¶ Projects each of the vectors onto the model, retrieving the optimal linear reconstruction weights for each instance.
- Parameters
vectors ((n_samples, n_features) ndarray) – Array of vectorized novel instances.
- Returns
projected ((n_samples, n_components) ndarray) – The matrix of optimal linear weights.
-
project_whitened
(vector_instance)[source]¶ Projects the vector_instance onto the whitened components, retrieving the whitened linear weightings.
- Parameters
vector_instance ((n_features,) ndarray) – A novel vector.
- Returns
projected ((n_features,) ndarray) – A vector of whitened linear weightings.
-
reconstruct
(vector)¶ Project a vector onto the linear space and rebuild from the weights found.
- Parameters
vector ((n_features, ) ndarray) – A vectorized novel instance to project.
- Returns
reconstructed ((n_features,) ndarray) – The reconstructed vector.
-
reconstruct_vectors
(vectors)¶ Projects the vectors onto the linear space and rebuilds vectors from the weights found.
- Parameters
vectors ((n_vectors, n_features) ndarray) – A set of vectors to project.
- Returns
reconstructed ((n_vectors, n_features) ndarray) – The reconstructed vectors.
-
trim_components
(n_components=None)[source]¶ Permanently trims the components down to a certain amount. The number of active components will be automatically reset to this particular value.
This will reduce self.n_components down to n_components (if None, self.n_active_components will be used), freeing up memory in the process.
Once the model is trimmed, the trimmed components cannot be recovered.
- Parameters
n_components (int >= 1 or float > 0.0 or None, optional) – The number of components that are kept or else the amount (ratio) of variance that is kept. If None, self.n_active_components is used.
Notes
In case n_components is greater than the total number of components or greater than the amount of variance currently kept, this method does not perform any action.
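The two ways of specifying n_components (an integer count or a float variance ratio) can be sketched as below. The helper `n_components_to_keep` is hypothetical, and the rule used for the float case (keep the smallest number of components whose cumulative variance ratio reaches the requested value) is an assumption of this example rather than a statement about menpo's exact rule.

```python
import numpy as np

def n_components_to_keep(eigenvalues, n_components):
    """Hypothetical helper: resolve an int count or a float variance ratio."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    if isinstance(n_components, float):
        cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
        # Smallest count whose cumulative ratio reaches the requested one.
        count = int(np.searchsorted(cumulative, n_components)) + 1
    else:
        count = int(n_components)
    # Asking for more than is available leaves everything in place.
    return min(count, eigenvalues.size)

eigs = np.array([4.0, 3.0, 2.0, 1.0])
by_count = n_components_to_keep(eigs, 2)
by_ratio = n_components_to_keep(eigs, 0.75)
too_many = n_components_to_keep(eigs, 10)
trimmed = eigs[:by_ratio]   # what would survive a ratio-based trim
```

With these toy eigenvalues, a ratio of 0.75 keeps three components, because the first two capture only 70% of the variance.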
-
variance
()[source]¶ Returns the total amount of variance retained by the active components.
- Returns
variance (float) – Total variance captured by the active components.
-
variance_ratio
()[source]¶ Returns the ratio between the amount of variance retained by the active components and the total amount of variance present on the original samples.
- Returns
variance_ratio (float) – Ratio of active components variance and total variance present in original samples.
-
whitened_components
()[source]¶ Returns the active components of the model, whitened.
- Returns
whitened_components (
(n_active_components, n_features)
ndarray) – The whitened components.
-
property
components
¶ Returns the active components of the model.
- Type
(n_active_components, n_features)
ndarray
-
property
eigenvalues
¶ Returns the eigenvalues associated with the active components of the model, i.e. the amount of variance captured by each active component, sorted from largest to smallest.
- Type
(n_active_components,)
ndarray
-
property
n_active_components
¶ The number of components currently in use on this model.
- Type
int
-
property
n_components
¶ The number of bases of the model.
- Type
int
-
property
n_features
¶ The number of elements in each linear component.
- Type
int