PCA variance ratio#

bullkpy.pl.pca_variance_ratio(adata, *, key='pca', n_comps=None, cumulative=True, figsize=(6.5, 4.5), save=None, show=True)

Scree plot of the PCA explained variance ratio, with an optional cumulative variance curve. Requires adata.uns[key]["variance_ratio"].

This is the standard diagnostic plot used to decide how many principal components to retain.

Figure: Example PCA variance ratio

What it does#

Reads per-PC explained variance ratios from adata.uns[key]["variance_ratio"] (as written by bk.tl.pca).

Plots a scree plot:

  • x-axis: principal component index (PC1, PC2, …)

  • y-axis: explained variance ratio.

Optionally overlays a cumulative variance curve on a secondary y-axis.
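The layout described above can be sketched with plain matplotlib. This is a minimal illustration, not the function's actual implementation: the variance ratios are made-up values standing in for adata.uns[key]["variance_ratio"], and the markers, colors, and axis labels are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Made-up values standing in for adata.uns["pca"]["variance_ratio"].
variance_ratio = np.array([0.40, 0.20, 0.15, 0.10, 0.05])
pcs = np.arange(1, len(variance_ratio) + 1)  # PC1, PC2, ...

# Primary axes: explained variance ratio per principal component.
fig, ax = plt.subplots(figsize=(6.5, 4.5))
ax.plot(pcs, variance_ratio, marker="o")
ax.set_xlabel("Principal component")
ax.set_ylabel("Explained variance ratio")
ax.set_xticks(pcs)

# Secondary y-axis: running total of the per-PC ratios.
cumulative = np.cumsum(variance_ratio)
ax2 = ax.twinx()
ax2.plot(pcs, cumulative, marker="s", linestyle="--", color="tab:red")
ax2.set_ylabel("Cumulative explained variance")
ax2.set_ylim(0, 1.05)
```

The two curves use separate y-axes because per-PC ratios are usually much smaller than the running total, so a shared scale would flatten the scree curve.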

Requirements#

PCA must have been run beforehand:

bk.tl.pca(adata)

The AnnData object must contain:

adata.uns[key]["variance_ratio"]
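One way to check this requirement before plotting is sketched below. The helper name has_variance_ratio is hypothetical (it is not part of bullkpy), and a SimpleNamespace with a uns dictionary stands in for a real AnnData object so the snippet runs without extra dependencies.

```python
from collections.abc import Mapping
from types import SimpleNamespace


def has_variance_ratio(adata, key="pca"):
    """Return True if adata.uns[key]["variance_ratio"] is present.

    Hypothetical helper for illustration; not part of bullkpy.
    """
    entry = adata.uns.get(key)
    return isinstance(entry, Mapping) and "variance_ratio" in entry


# Stand-ins for AnnData objects: only the .uns mapping matters here.
ready = SimpleNamespace(uns={"pca": {"variance_ratio": [0.5, 0.3]}})
not_ready = SimpleNamespace(uns={})

print(has_variance_ratio(ready))      # PCA metadata is present
print(has_variance_ratio(not_ready))  # PCA has not been run yet
```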

Parameters#

Data source#

key (str, default "pca").
Key in adata.uns where PCA metadata is stored.
The function expects adata.uns[key]["variance_ratio"].

n_comps (int | None, default None). If provided, plot only the first n_comps principal components.

Plot options#

cumulative (bool, default True).
If True, adds a cumulative explained variance curve on a secondary y-axis.

figsize (tuple[float, float], default (6.5, 4.5)).
Figure size in inches.

Output#

save (str | Path | None, default None). If provided, saves the figure to this path via _savefig.

show (bool, default True).
If True, displays the plot with plt.show().

Returns#

  • fig (matplotlib.figure.Figure)

  • ax (matplotlib.axes.Axes): the primary axes (explained variance ratio).

If cumulative=True, the cumulative curve is drawn on a secondary y-axis created via ax.twinx().

Interpretation#

  • The scree plot (variance ratio per PC) shows how much variance each PC explains.

  • The cumulative curve helps choose a cutoff:
    e.g. retain PCs until cumulative variance ≥ 0.8 or 0.9.

  • A sharp “elbow” in the scree plot often indicates a natural dimensionality.
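The cumulative cutoff rule above can also be applied numerically. The sketch below is a hypothetical helper (n_pcs_for_threshold is not part of bullkpy) that returns the smallest number of leading PCs whose cumulative variance ratio reaches a threshold. The input values are invented for illustration.

```python
from itertools import accumulate


def n_pcs_for_threshold(variance_ratio, threshold=0.9):
    """Smallest number of leading PCs whose cumulative ratio reaches threshold.

    Returns None if the threshold is never reached.
    Hypothetical helper for illustration; not part of bullkpy.
    """
    for n, total in enumerate(accumulate(variance_ratio), start=1):
        if total >= threshold:
            return n
    return None


# Invented ratios; the running totals are 0.5, 0.75, 0.875, 0.9375, 0.96875.
variance_ratio = [0.5, 0.25, 0.125, 0.0625, 0.03125]

print(n_pcs_for_threshold(variance_ratio, threshold=0.8))   # 3
print(n_pcs_for_threshold(variance_ratio, threshold=0.9))   # 4
print(n_pcs_for_threshold(variance_ratio, threshold=0.99))  # None
```

With real data, the same call could take the values stored in adata.uns[key]["variance_ratio"]. A fixed threshold is only a rule of thumb, so it is worth comparing the result against the elbow visible in the plot.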

Examples#

  1. Default scree plot with cumulative variance

bk.pl.pca_variance_ratio(adata)

  2. Show only the first 20 PCs

bk.pl.pca_variance_ratio(adata, n_comps=20)

  3. Scree plot without cumulative curve

bk.pl.pca_variance_ratio(adata, cumulative=False)

  4. Save to file

bk.pl.pca_variance_ratio(adata, save="pca_variance_ratio.pdf")