`Utils_downstream-checkpoint`

Module Contents

Functions

`rank_features_by_arch`(X, inference_result, var_names)
`correlation_by_archetype`(matrix, inference_result[, ...])	Compute correlations between archetype cell scores and variables in a matrix.
`compute_feature_importance`(model, input_matrix[, ...])	Compute feature importance in a MIDAA model by measuring the change in the latent space reconstruction when each feature is held out, using the Frobenius norm.

Utils_downstream-checkpoint.rank_features_by_arch(X, inference_result, var_names, scale=True, plot=True)

Utils_downstream-checkpoint.correlation_by_archetype(matrix, inference_result, correlation_type='spearman', mt_correction_method='fdr_bh', variable_names=None)

Compute correlations between archetype cell scores and variables in a matrix.

This function calculates the correlation coefficient between each archetype’s cell scores and each variable (column) in the provided matrix. It supports Pearson, Spearman, and Kendall correlation and applies multiple testing correction to the resulting p-values.

Parameters:

matrix (array-like, shape (n_cells, n_variables)) – A 2D array or list of lists representing the input matrix used for fitting the model. Each row corresponds to a cell, and each column corresponds to a variable.
inference_result (dict) – A dictionary of inference results. Its “inferred_quantities” key maps to another dictionary that contains “A”, a 2D array-like structure with shape (n_cells, n_scores) in which each column holds the scores for one archetype.
correlation_type (str, optional (default='spearman')) – The type of correlation to compute. Supported options are:
- ‘pearson’ : Pearson correlation coefficient
- ‘spearman’ : Spearman rank correlation
- ‘kendall’ : Kendall tau correlation
mt_correction_method (str, optional (default='fdr_bh')) – The method used for multiple testing correction of the p-values. The default is Benjamini-Hochberg (‘fdr_bh’). Any other method supported by statsmodels.stats.multitest.multipletests can be used.
variable_names (list of str, optional (default=None)) – A list of names for the variables (columns) in the matrix. If not provided, the variables are named ‘Variable_1’, ‘Variable_2’, etc.
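
The input layout described by these parameters can be sketched as follows. This is only an illustration of the documented shapes: the numbers are made-up toy values, and the fallback naming simply reproduces the ‘Variable_1’, ‘Variable_2’ pattern described above rather than the function's actual code.

```python
# Hypothetical toy data: 4 cells, 3 variables, 2 archetypes.
matrix = [
    [0.1, 2.0, 5.0],
    [0.4, 1.5, 4.0],
    [0.9, 0.7, 3.5],
    [1.3, 0.2, 1.0],
]

# Archetype scores "A" live under the "inferred_quantities" key,
# with one row per cell and one column per archetype.
inference_result = {
    "inferred_quantities": {
        "A": [
            [0.9, 0.1],
            [0.7, 0.3],
            [0.4, 0.6],
            [0.1, 0.9],
        ]
    }
}

n_cells = len(matrix)
n_variables = len(matrix[0])
scores = inference_result["inferred_quantities"]["A"]

# The matrix and the score table must describe the same cells.
assert len(scores) == n_cells

# Fallback names of the kind used when variable_names is None.
variable_names = [f"Variable_{j + 1}" for j in range(n_variables)]
print(variable_names)  # ['Variable_1', 'Variable_2', 'Variable_3']
```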

Returns:

A DataFrame containing the correlation results with the following columns:

‘Variable’ : Name of the variable.
‘Archetype’ : Identifier for the archetype score.
‘Correlation’ : Correlation coefficient between the archetype score and the variable.
‘P-value’ : P-value for the correlation.
‘Corrected P-value’ : P-value after multiple testing correction.

Return type:

pandas.DataFrame
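
To make the relationship between the ‘P-value’ and ‘Corrected P-value’ columns concrete, here is a minimal plain-Python sketch of the Benjamini-Hochberg adjustment that the default ‘fdr_bh’ setting refers to. It is not the function's own code, which is documented as delegating to statsmodels; the helper name `benjamini_hochberg` and the p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Illustrative helper (hypothetical name), not part of the documented module.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, e.g. one per (variable, archetype) pair.
raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
print([round(p, 4) for p in corrected])  # [0.02, 0.04, 0.04, 0.02]
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the corrected values from decreasing as the raw values increase.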

Utils_downstream-checkpoint.compute_feature_importance(model, input_matrix, feature_names=None, device='cpu')

Compute feature importance in a MIDAA model by measuring the change in the latent space reconstruction when each feature is held out, using the Frobenius norm.
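
The hold-out idea can be sketched in plain Python as below. Several things here are assumptions rather than documented behavior: "held out" is interpreted as setting a feature's column to zero, the model is replaced by a hypothetical `toy_encode` function with fixed weights, and the `Feature_1`, `Feature_2` fallback names are invented for the example. The real function takes a MIDAA model and a `device` argument, neither of which is modeled here.

```python
import math


def frobenius_distance(a, b):
    """Frobenius norm of the difference between two equally shaped matrices."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def holdout_importance(encode, input_matrix, feature_names=None):
    """Score each feature by how far the latent output moves when it is zeroed.

    Assumption: holding a feature out means replacing its column with 0.0.
    """
    n_features = len(input_matrix[0])
    if feature_names is None:
        # Hypothetical fallback names, used only in this sketch.
        feature_names = [f"Feature_{j + 1}" for j in range(n_features)]
    baseline = encode(input_matrix)
    scores = {}
    for j, name in enumerate(feature_names):
        # Copy every row with column j replaced by zero.
        held_out = [row[:j] + [0.0] + row[j + 1:] for row in input_matrix]
        scores[name] = frobenius_distance(baseline, encode(held_out))
    return scores


def toy_encode(matrix):
    """Hypothetical stand-in for a model: a fixed linear map, 3 features to 2 latent dims."""
    weights = [[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
    return [
        [sum(x * w[k] for x, w in zip(row, weights)) for k in range(2)]
        for row in matrix
    ]


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
importance = holdout_importance(toy_encode, X)
for name, score in importance.items():
    print(name, round(score, 3))
```

In this toy setup the third feature has zero weight in the encoder, so removing it leaves the latent output unchanged and its score is 0, while the second feature, which carries the largest weight, receives the largest score.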