scSemiProfiler.utils.inspect_data
scSemiProfiler.utils.inspect_data(bulk, logged=False, normed=True, geneselection=True, save=None)
To accommodate users who prefer all-in-one-batch profiling, we have developed this new function. It allows users to assess data heterogeneity (using Silhouette scores and bulk data visualization) and to determine an approximate number of representative clusters needed for their dataset. Once the number of clusters is established, users can sequence all representatives in one batch to minimize potential batch effects.
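To illustrate what a Silhouette score measures when judging heterogeneity, here is a minimal pure-Python sketch. It is not scSemiProfiler's implementation, which this page does not show. The helper names (`euclidean`, `silhouette_score`) and the toy points are assumptions made up for illustration. A score near 1 means samples sit close to their own cluster and far from the others, so comparing scores across candidate cluster counts is one plausible way to choose the number of representatives.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (illustrative sketch)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy "samples" in a 2-D embedding (hypothetical values, not real bulk data).
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_score = silhouette_score(samples, [0, 0, 0, 1, 1, 1])  # matches the two groups
poor_score = silhouette_score(samples, [0, 1, 0, 1, 0, 1])  # mixes the two groups
```

In this toy case the labeling that follows the two visible groups scores higher than the mixed labeling, which is the behavior one would rely on when comparing candidate numbers of clusters.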
- Parameters
name – Project name.
bulk (str) – Path to bulk data as an h5ad file. Sample IDs should be stored in adata.obs['sample_ids'] and gene names should be stored in adata.var.index.
logged (bool) – Whether the data has been log-transformed.
normed (bool) – Whether the library size has been normalized.
geneselection (typing.Union[bool, int]) – Either a boolean indicating whether to perform gene selection using the bulk data, or an integer specifying the number of highly variable genes to select.
save – Folder in which to save figures.
- Return type
None
- Returns
None
Example
>>> import scSemiProfiler as semi
>>> from scSemiProfiler.utils import *
>>> bulk = 'example_data/bulkdata.h5ad'
>>> logged = False
>>> normed = True
>>> geneselection = False
>>> inspect_data(bulk, logged, normed, geneselection)
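The example above leaves `save` at its default and passes a boolean `geneselection`. The sketch below shows the other documented options: an integer number of highly variable genes and a folder for figures. `build_inspect_kwargs` and `run_inspection` are hypothetical helpers, not part of scSemiProfiler, and the values `2000` and `'figures'` are arbitrary. Whether the folder must already exist is not stated on this page, so check that before relying on it.

```python
def build_inspect_kwargs(bulk, logged=False, normed=True, geneselection=True, save=None):
    """Hypothetical helper: check arguments against the documented types."""
    if not isinstance(bulk, str):
        raise TypeError("bulk must be a path string to an h5ad file")
    if not isinstance(geneselection, (bool, int)):
        raise TypeError("geneselection must be a bool or an int")
    return {
        "bulk": bulk,
        "logged": logged,
        "normed": normed,
        "geneselection": geneselection,
        "save": save,
    }


def run_inspection(**kwargs):
    """Hypothetical wrapper: import lazily so this sketch loads without the package."""
    from scSemiProfiler.utils import inspect_data
    return inspect_data(**kwargs)


# Select 2000 highly variable genes and write figures to ./figures (assumed values).
kwargs = build_inspect_kwargs(
    "example_data/bulkdata.h5ad",
    logged=False,
    normed=True,
    geneselection=2000,
    save="figures",
)
# run_inspection(**kwargs)  # uncomment once scSemiProfiler and the data file are available
```

The keyword names follow the signature shown at the top of this page. Passing them by keyword avoids mixing up the two adjacent boolean flags `logged` and `normed`.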