scSemiProfiler.utils.inspect_data

scSemiProfiler.utils.inspect_data(bulk, logged=False, normed=True, geneselection=True, save=None)

To accommodate users who prefer all-in-one-batch profiling, we have developed this new function. It lets users assess data heterogeneity (using Silhouette scores and bulk data visualization) and estimate the approximate number of representative clusters their dataset needs. Once the number of clusters is established, users can sequence all representatives in one batch to minimize potential batch effects.
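The idea of scanning Silhouette scores to pick a cluster count can be sketched as follows. This is a minimal illustration, not the code inside inspect_data: it assumes scikit-learn's KMeans and silhouette_score as stand-ins, and the synthetic matrix X (samples by genes, with three planted groups) and the helper silhouette_scan are hypothetical names introduced only for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic "bulk" matrix (assumption): 90 samples x 50 genes,
# drawn around three well-separated group centers.
rng = np.random.default_rng(0)
centers = rng.normal(0.0, 10.0, size=(3, 50))
X = np.vstack([c + rng.normal(0.0, 1.0, size=(30, 50)) for c in centers])


def silhouette_scan(X, k_values):
    """Cluster X with k-means for each k and return {k: mean silhouette score}."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return scores


scores = silhouette_scan(X, range(2, 7))

# The k with the highest mean silhouette is a reasonable first guess
# for the number of representative clusters.
best_k = max(scores, key=scores.get)

for k, s in scores.items():
    print(f"k={k}: silhouette={s:.3f}")
print("suggested number of representative clusters:", best_k)
```

On real cohorts the scores are rarely this clear-cut, so the highest-scoring k is best treated as a starting point to check against the bulk data visualization.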

Parameters

name – Project name.
bulk (str) – Path to the bulk data as an h5ad file. Sample IDs should be stored in adata.obs['sample_ids'] and gene names in adata.var.index.
logged (bool) – Whether the data has already been log-transformed.
normed (bool) – Whether the library size has already been normalized.
geneselection (typing.Union[bool, int]) – Either a boolean indicating whether to perform gene selection using the bulk data, or an integer specifying the number of highly variable genes to select.
save – Folder in which to save figures. Defaults to None.
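To make the meaning of the logged and normed flags concrete, here is a rough sketch of the preprocessing such flags usually describe. It is an assumption about intent, not the library's implementation: the helper preprocess_bulk, the target_sum value, and the use of log1p are all hypothetical choices made for illustration.

```python
import numpy as np


def preprocess_bulk(counts, logged=False, normed=True, target_sum=1e4):
    """Hypothetical helper: bring a samples x genes matrix to a
    library-size-normalized, log1p-transformed scale."""
    x = np.asarray(counts, dtype=float)
    if logged:
        # Assumed: data flagged as logged is already on the final scale.
        return x
    if not normed:
        # Scale each sample (row) so its total equals target_sum.
        totals = x.sum(axis=1, keepdims=True)
        totals[totals == 0] = 1.0  # avoid dividing by zero for empty samples
        x = x / totals * target_sum
    return np.log1p(x)


# Two samples with the same composition but different sequencing depth.
raw = np.array([[1.0, 3.0], [10.0, 30.0]])
out = preprocess_bulk(raw, logged=False, normed=False, target_sum=100)
print(out)
```

Passing flags that do not match the actual state of the data would transform it twice or not at all, which is why both should be set explicitly for each dataset.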

Return type

None

Returns

None

Example

>>> from scSemiProfiler.utils import inspect_data
>>> bulk = 'example_data/bulkdata.h5ad'
>>> logged = False
>>> normed = True
>>> geneselection = False
>>> inspect_data(bulk, logged=logged, normed=normed, geneselection=geneselection)