scSemiProfiler.inference.scinfer

scSemiProfiler.inference.scinfer(name, representatives, cluster, bulktype='pseudobulk', lambdad=4.0, pretrain1batch=128, pretrain1lr=0.001, pretrain1vae=100, pretrain1gan=100, lambdabulkr=1, pretrain2lr=0.0001, pretrain2vae=50, pretrain2gan=50, inferepochs=150, lambdabulkt=8.0, inferlr=0.0002, pseudocount=0.1, ministages=5, k=15, device='cuda:0')

Computationally infer the single-cell data of all non-representative samples (target samples) from the cohort’s bulk data and the representatives’ single-cell data.

Parameters
  • name (str) – The project name.

  • representatives (str) – Path to a “txt” file containing the representative sample IDs (numbers).

  • cluster (str) – Path to a “txt” file containing the cluster label information.

  • bulktype (str) – Type of bulk data, either pseudobulk or real bulk; defaults to ‘pseudobulk’.

  • lambdad (float) – Scaling factor for the discriminator loss.

  • pretrain1batch (int) – The mini-batch size during the first pretrain stage.

  • pretrain1lr (float) – The learning rate used in the first pretrain stage.

  • pretrain1vae (int) – The number of epochs for training the VAE during the first pretrain stage.

  • pretrain1gan (int) – The number of iterations for training the GAN during the first pretrain stage.

  • lambdabulkr (float) – Scaling factor for the representative bulk loss during the second pretrain stage.

  • pretrain2lr (float) – The learning rate used in the second pretrain stage.

  • pretrain2vae (int) – The number of epochs for training the VAE during the second pretrain stage.

  • pretrain2gan (int) – The number of iterations for training the GAN during the second pretrain stage.

  • inferepochs (int) – The number of epochs used for each mini-stage during inference.

  • lambdabulkt (float) – Scaling factor for the initial target bulk loss.

  • inferlr (float) – The learning rate used in the inference stage.

  • ministages (int) – The number of mini-stages during inference.

  • k (int) – The number of nearest neighbors used in the cell graph.

  • device (str) – Which device to use, e.g. ‘cpu’, ‘cuda:0’.

  • pseudocount (float) – Pseudocount used when converting data from real bulk space to pseudobulk space.
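
The documented defaults can be collected in a plain dictionary and overridden selectively before the call. The following is a minimal sketch, not part of scSemiProfiler: `SCINFER_DEFAULTS`, `pick_device`, and `build_scinfer_kwargs` are hypothetical names, the default values are copied from the signature above, and the device check assumes PyTorch is the backend behind the `device` string. Going by the parameter descriptions, the inference stage would run about `inferepochs * ministages` epochs in total.

```python
# Hypothetical helpers; only the parameter names and defaults come from the signature.
SCINFER_DEFAULTS = dict(
    bulktype="pseudobulk", lambdad=4.0,
    pretrain1batch=128, pretrain1lr=0.001, pretrain1vae=100, pretrain1gan=100,
    lambdabulkr=1, pretrain2lr=0.0001, pretrain2vae=50, pretrain2gan=50,
    inferepochs=150, lambdabulkt=8.0, inferlr=0.0002,
    pseudocount=0.1, ministages=5, k=15, device="cuda:0",
)


def pick_device(preferred="cuda:0"):
    """Return `preferred` if CUDA is usable, otherwise fall back to 'cpu'."""
    try:
        import torch  # assumption: PyTorch is installed alongside scSemiProfiler
    except ImportError:
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


def build_scinfer_kwargs(name, representatives, cluster, **overrides):
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = set(overrides) - set(SCINFER_DEFAULTS)
    if unknown:
        raise TypeError("unknown scinfer parameter(s): " + ", ".join(sorted(unknown)))
    kwargs = dict(SCINFER_DEFAULTS)
    kwargs.update(overrides)
    kwargs.update(name=name, representatives=representatives, cluster=cluster)
    return kwargs


kwargs = build_scinfer_kwargs(
    "project_name",
    "project_name/status/init_representatives.txt",
    "project_name/status/init_cluster_labels.txt",
    inferepochs=100,
    device=pick_device(),
)
# Total inference epochs implied by the two settings (100 * 5 here).
total_infer_epochs = kwargs["inferepochs"] * kwargs["ministages"]
```

The resulting dictionary could then be passed as `scSemiProfiler.scinfer(**kwargs)`. Rejecting unknown names early is a design choice of this sketch, so that a misspelled parameter fails before a long training run starts.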

Return type

None

Returns

None

Example

>>> import scSemiProfiler
>>> name = 'project_name'
>>> representatives = name + '/status/init_representatives.txt'
>>> cluster = name + '/status/init_cluster_labels.txt'
>>> scSemiProfiler.scinfer(name=name, representatives=representatives, cluster=cluster, bulktype='pseudobulk')
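
Because scinfer takes file paths and returns None, it can help to confirm that both input files exist before starting a long run. Below is a minimal sketch using only the standard library; `check_scinfer_inputs` is a hypothetical helper, not part of scSemiProfiler, and the file locations are simply the ones used in the example above.

```python
from pathlib import Path


def check_scinfer_inputs(name):
    """Return the representatives and cluster file paths, raising if either is missing.

    Assumes the layout from the example: <name>/status/init_representatives.txt
    and <name>/status/init_cluster_labels.txt.
    """
    status_dir = Path(name) / "status"
    representatives = status_dir / "init_representatives.txt"
    cluster = status_dir / "init_cluster_labels.txt"
    missing = [str(p) for p in (representatives, cluster) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing scinfer input file(s): " + ", ".join(missing))
    return str(representatives), str(cluster)
```

The two returned strings can be passed directly as the `representatives` and `cluster` arguments.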