scSemiProfiler.initial_setup.initsetup
scSemiProfiler.initial_setup.initsetup(name, bulk, logged=False, normed=True, geneselection=True, batch=4)
Initial setup of the semi-profiling pipeline: processes the bulk data and performs clustering to find the initial representatives. Bulk data should be provided as an 'h5ad' file. Sample IDs should be stored in adata.obs['sample_ids'] and gene names should be stored in adata.var.index. If you are not using active learning for iterative representative selection, set the batch size directly to the total number of representatives desired.
- Parameters
name (str) – Project name.
bulk (str) – Path to the bulk data as an h5ad file. Sample IDs should be stored in adata.obs['sample_ids'] and gene names should be stored in adata.var.index.
logged (bool) – Whether the data has already been log-transformed.
normed (bool) – Whether the library size has already been normalized.
geneselection (typing.Union[bool, int]) – Either a boolean indicating whether to perform gene selection using the bulk data, or an integer specifying the number of highly variable genes to select.
batch (int) – Representative selection batch size.
- Return type
- Returns
None
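Because geneselection accepts either a bool or an int, any code that interprets it has to check for bool first: in Python, bool is a subclass of int, so isinstance(True, int) is True. The helper below is a hypothetical sketch of that dispatch and is not taken from scSemiProfiler. The function name, the default_n_genes parameter, and its value of 2000 are assumptions made purely for illustration.

```python
from typing import Optional, Union


def resolve_gene_selection(
    geneselection: Union[bool, int],
    default_n_genes: int = 2000,  # hypothetical default, not the library's
) -> Optional[int]:
    """Return the number of genes to select, or None to skip selection."""
    # Check bool before int: bool is a subclass of int in Python.
    if isinstance(geneselection, bool):
        return default_n_genes if geneselection else None
    if isinstance(geneselection, int):
        if geneselection <= 0:
            raise ValueError("number of genes must be a positive integer")
        return geneselection
    raise TypeError("geneselection must be a bool or an int")
```

With this ordering, True and False are handled as switches, while a value such as 500 is treated as a gene count.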
Example
>>> import scSemiProfiler
>>> name = 'runexample'
>>> bulk = 'example_data/bulkdata.h5ad'
>>> logged = False
>>> normed = True
>>> geneselection = False
>>> batch = 2
>>> scSemiProfiler.initsetup(name, bulk, logged, normed, geneselection, batch)