PCA loadings heatmap#

bullkpy.pl.pca_loadings_heatmap(adata, *, pcs=(1, 2, 3), n_top=15, loadings_key='PCs', use_abs=False, show_negative=True, gene_symbol_key=None, z_score=False, cluster_genes=True, cluster_pcs=False, cmap='vlag', center=0.0, figsize=None, title=None, save=None, show=True)[source]#

Heatmap of PCA loadings for the union of top-loading genes across one or more principal components.

Selects the top positive and (optionally) top negative genes for each PC.
Builds a [genes × PCs] matrix of loadings.
Optionally clusters and z-scores the matrix.

This plot helps interpret multiple PCs at once by showing which genes load strongly on each selected PC, optionally separating positive/negative contributors and clustering genes (and/or PCs).

What it does#

Reads PCA loadings from adata.varm[loadings_key] (shape: n_genes × n_pcs).
For each PC in pcs, selects top genes:

If use_abs=True: top n_top genes by |loading|.
Else: top n_top positive genes, plus (optionally) top n_top negative genes.

Forms the union of selected genes across PCs (preserving first-seen order).
Builds a matrix: genes × PCs of loadings for those genes.
Optionally z-scores each gene across PCs.
Plots as:

seaborn.clustermap if clustering is enabled (Scanpy-like dendrograms), else
seaborn.heatmap.
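
The union and matrix-building steps above can be sketched with plain NumPy. This is a minimal illustration, not the library's implementation: the random `loadings` array stands in for adata.varm[loadings_key], the gene names are made up, and magnitude-only selection is used for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
genes = np.array([f"gene{i}" for i in range(50)])  # hypothetical gene names
loadings = rng.normal(size=(50, 5))                # stand-in for adata.varm["PCs"]

pcs = (1, 2, 3)  # 1-based, as in the function signature
n_top = 4

selected = []    # union of gene indices, in first-seen order
seen = set()
for pc in pcs:
    col = loadings[:, pc - 1]                  # 1-based PC -> 0-based column
    top = np.argsort(-np.abs(col))[:n_top]     # magnitude-only selection
    for idx in top:
        idx = int(idx)
        if idx not in seen:                    # keep only the first occurrence
            seen.add(idx)
            selected.append(idx)

cols = [pc - 1 for pc in pcs]
matrix = loadings[np.ix_(selected, cols)]      # genes x PCs block of loadings
row_labels = genes[selected]
col_labels = [f"PC{pc}" for pc in pcs]
```

Because a gene can rank highly on several PCs, the union usually has fewer than n_top × len(pcs) rows.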

Parameters#

Core inputs#

adata: AnnData.
Must contain PCA loadings at adata.varm[loadings_key] (typically from bk.tl.pca()).

pcs: Sequence[int], default (1, 2, 3).
PCs to include (1-based indexing, e.g. (1,2,3) means PC1–PC3).

n_top: int, default 15.
Number of genes selected per PC (per sign if signed mode).

loadings_key: str, default "PCs".
Key in adata.varm where loadings are stored.

Gene selection mode#

use_abs: bool, default False.

False: signed selection (positive + optional negative).
True: select genes by absolute loading magnitude only.

show_negative: bool, default True. Only used when use_abs=False. If True, includes top negative loadings per PC.
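
The two selection modes can be compared on a single PC's loading vector. The helper below is hypothetical (`select_top_genes` is not part of bullkpy), and it assumes that signed mode only considers genes whose loading has the matching sign.

```python
import numpy as np

def select_top_genes(column, n_top, use_abs=False, show_negative=True):
    """Hypothetical helper: indices of the top genes for one PC."""
    column = np.asarray(column, dtype=float)
    if use_abs:
        # Rank by |loading| regardless of sign.
        return np.argsort(-np.abs(column))[:n_top].tolist()
    order = np.argsort(-column)  # largest loading first
    # Assumption: only strictly positive loadings count as "positive" genes.
    picked = [int(i) for i in order if column[i] > 0][:n_top]
    if show_negative:
        # Walk from the most negative loading upward.
        picked += [int(i) for i in order[::-1] if column[i] < 0][:n_top]
    return picked

col = [0.5, -0.9, 0.1, 0.8, -0.2, 0.0]
signed = select_top_genes(col, n_top=2)                          # [3, 0, 1, 4]
positive_only = select_top_genes(col, 2, show_negative=False)    # [3, 0]
by_magnitude = select_top_genes(col, 2, use_abs=True)            # [1, 3]
```

In signed mode up to 2 × n_top genes are selected per PC, whereas absolute mode selects at most n_top.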

Gene labels#

gene_symbol_key: str | None, default None.
If provided and present in adata.var, uses this column for gene labels instead of adata.var_names.
Note: If symbols are duplicated, the function uses the first occurrence of each symbol when mapping back to indices (a pragmatic choice for plotting).
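
To illustrate the first-occurrence rule, here is a small sketch with made-up identifiers. It shows one way such a mapping could be built and is not the function's actual code.

```python
from collections import Counter

# Hypothetical data: two variables share the symbol "TP53".
var_names = ["ENSG01", "ENSG02", "ENSG03", "ENSG04"]
symbols = ["TP53", "ACTB", "TP53", "GAPDH"]

# Map each symbol to the index of its first occurrence.
first_index = {}
for i, sym in enumerate(symbols):
    first_index.setdefault(sym, i)  # later duplicates are ignored

# List symbols that appear more than once, so they can be reviewed.
duplicated = sorted(s for s, n in Counter(symbols).items() if n > 1)
```

Here the label "TP53" resolves to ENSG01, and ENSG03 is never reached through its symbol.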

Transformations#

z_score: bool, default False. If True, z-scores each gene across PCs:
\[
z_{\text{gene},\text{pc}} = \frac{\text{loading}_{\text{gene},\text{pc}} - \mu_{\text{gene}}}{\sigma_{\text{gene}}}
\]
Useful to emphasize relative PC preference per gene rather than absolute magnitude.
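
A row-wise z-score of this kind can be sketched in NumPy as follows. The population standard deviation (ddof=0) and the handling of constant rows are assumptions made for this example; the function may treat either differently.

```python
import numpy as np

# Hypothetical genes x PCs loading matrix; the last gene is constant across PCs.
matrix = np.array([
    [0.30, 0.10, -0.10],
    [-0.20, -0.20, 0.40],
    [0.50, 0.50, 0.50],
])

mu = matrix.mean(axis=1, keepdims=True)     # per-gene mean across PCs
sigma = matrix.std(axis=1, keepdims=True)   # per-gene std (ddof=0, an assumption)

# Assumption: constant rows are mapped to zeros instead of dividing by zero.
safe_sigma = np.where(np.isclose(sigma, 0.0), 1.0, sigma)
z = (matrix - mu) / safe_sigma
```

After scaling, every non-constant row has mean 0 and unit spread, so colors reflect which PC a gene favors rather than how large its loadings are overall.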

Clustering / plotting#

cluster_genes: bool, default True.
Cluster rows (genes) with hierarchical clustering (uses seaborn.clustermap).

cluster_pcs: bool, default False.
Cluster columns (PCs). Off by default because the order of PCs is usually meaningful.

cmap: str, default "vlag".
Diverging colormap suited for signed loadings.

center: float, default 0.0.
Center value for diverging colormap normalization (0 is typical for loadings).

figsize: tuple[float, float] | None, default None.
If None, the figure is auto-sized based on the number of genes and PCs.

title: str | None, default None.
If None, the title "PCA loadings heatmap" is used.

Output controls#

save: str | Path | None, default None.
If provided, saves the figure via the project’s _savefig() utility.

show: bool, default True.
Whether to display the plot with plt.show().

Returns#

fig: matplotlib.figure.Figure
If clustering is enabled, this is the clustermap figure (cg.fig); otherwise the standard heatmap figure.

Raises#

ImportError: If seaborn is not installed.
KeyError: If adata.varm[loadings_key] is missing.
ValueError: If any requested PC is out of range, or if no genes are selected (e.g. all loadings are NaN).
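
Since pcs uses 1-based indexing, an out-of-range request is easy to make (for example, passing 0 for the first PC). The check below is a hypothetical sketch of that kind of validation; `validate_pcs` is not a bullkpy function, and the library's actual error message may differ.

```python
def validate_pcs(pcs, n_pcs):
    """Hypothetical helper: check 1-based PC numbers, return 0-based columns."""
    bad = [pc for pc in pcs if not 1 <= pc <= n_pcs]
    if bad:
        raise ValueError(f"PCs out of range (valid: 1..{n_pcs}): {bad}")
    return [pc - 1 for pc in pcs]

columns = validate_pcs((1, 2, 3), n_pcs=50)  # [0, 1, 2]
```

Running a check like this before plotting can give a clearer message when adata.varm[loadings_key] holds fewer components than expected.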

Examples#

Signed loadings for PC1–PC3 (positive + negative)

bk.pl.pca_loadings_heatmap(
    adata,
    pcs=(1, 2, 3),
    n_top=15,
    use_abs=False,
    show_negative=True,
)

Magnitude-only drivers across PCs

bk.pl.pca_loadings_heatmap(
    adata,
    pcs=(1, 2, 3, 4),
    n_top=25,
    use_abs=True,
)

Emphasize per-gene PC preference (z-score across PCs)

bk.pl.pca_loadings_heatmap(
    adata,
    pcs=(1, 2, 3),
    n_top=20,
    z_score=True,
)

No clustering (simple heatmap, preserves order)

bk.pl.pca_loadings_heatmap(
    adata,
    pcs=(1, 2, 3),
    cluster_genes=False,
    cluster_pcs=False,
)

Use gene symbols (if available in adata.var)

bk.pl.pca_loadings_heatmap(
    adata,
    pcs=(1, 2),
    gene_symbol_key="gene_symbol",
)

Notes & tips#

Signed mode (use_abs=False) is best when you want to interpret opposing programs on a PC (genes loading positive vs negative).
Absolute mode (use_abs=True) is best to identify the strongest overall contributors.
If you see repeated labels with gene_symbol_key, consider deduplicating upstream or switching to adata.var_names to avoid ambiguity.
For exporting per-PC gene sets (pos/neg/abs) for enrichment, use bk.tl.pca_loadings() instead.
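
One way to deduplicate labels upstream is to append the variable name to any symbol that occurs more than once. The helper and data below are hypothetical and only illustrate the idea; the result would then be stored in a column of adata.var and passed via gene_symbol_key.

```python
from collections import Counter

def make_unique_labels(symbols, var_names):
    """Hypothetical helper: disambiguate repeated symbols with their var_name."""
    counts = Counter(symbols)
    return [
        f"{sym} ({name})" if counts[sym] > 1 else sym
        for sym, name in zip(symbols, var_names)
    ]

labels = make_unique_labels(
    ["TP53", "ACTB", "TP53", "GAPDH"],
    ["ENSG01", "ENSG02", "ENSG03", "ENSG04"],
)
```

Unique symbols are left untouched, so only the ambiguous rows get longer labels on the heatmap.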