
This function calculates gene set scores from brain data and supports several null models. If null_model is 'none', the function calculates the raw (empirical) scores. Otherwise, it calculates null scores under the chosen null model.

Usage

brainscore(
  brain_data,
  gene_data,
  annoData,
  cor_method = c("pearson", "spearman", "pls1c", "pls1w"),
  aggre_method = c("mean", "median", "meanabs", "meansqr", "maxmean", "ks_orig",
    "ks_weighted", "ks_pos_neg_sum", "sign_test", "rank_sum"),
  null_model = c("none", "spin_brain", "resample_gene", "coexp_matched"),
  minGSSize = 10,
  maxGSSize = 200,
  n_cores = 0,
  n_perm = NULL,
  perm_id = NULL,
  coord.l = NULL,
  coord.r = NULL,
  seed = NULL,
  matchcoexp_tol = 0.05,
  matchcoexp_max_iter = 1e+06,
  verbose = TRUE
)

Arguments

brain_data

A data frame of brain data with regions as rows and subjects as columns. The row names (i.e., region names) must match those in gene_data.

gene_data

A data frame of gene expression data with regions as rows and genes as columns. The row names (i.e., region names) must match those in brain_data.
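The row-name requirement shared by brain_data and gene_data can be checked before calling the function. The following is a minimal sketch on made-up toy data, assuming that "match" means the same region names in the same order; the object names and values are illustrative only.

```r
# Toy data: 4 regions, 3 subjects, 5 genes (all values are made up).
set.seed(1)
regions <- paste0("region_", 1:4)
brain_data <- as.data.frame(matrix(
  rnorm(12), nrow = 4,
  dimnames = list(regions, paste0("sub_", 1:3))
))
# gene_data deliberately lists the same regions in reverse order.
gene_data <- as.data.frame(matrix(
  rnorm(20), nrow = 4,
  dimnames = list(rev(regions), paste0("gene_", 1:5))
))

# Same set of region names, but not in the same order.
same_set <- setequal(rownames(brain_data), rownames(gene_data))
same_order <- identical(rownames(brain_data), rownames(gene_data))

# Reorder gene_data so that its rows line up with brain_data.
gene_data <- gene_data[rownames(brain_data), , drop = FALSE]
aligned <- identical(rownames(brain_data), rownames(gene_data))
```

After the reordering step, both data frames list the regions in the same order.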

annoData

An environment containing annotation data. See get_annoData for more details.

cor_method

A character string specifying the correlation method. Default is 'pearson'. Other options include 'spearman', 'pls1c', and 'pls1w'. If a custom function that takes (gene_data, brain_data) as input is provided, the function uses the custom correlation method and sets cor_method to 'custom'.
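As an illustration of the custom option, the sketch below defines a function with the (gene_data, brain_data) signature described above. The name kendall_cor is hypothetical, and the return shape (one row per gene, one column per subject) is an assumption about what a custom method should produce, not something this page specifies.

```r
# Hypothetical custom correlation: Kendall's tau between every gene and
# every subject's brain map, computed across regions.
kendall_cor <- function(gene_data, brain_data) {
  stopifnot(identical(rownames(gene_data), rownames(brain_data)))
  # stats::cor() correlates the columns of x with the columns of y,
  # so the result has one row per gene and one column per subject.
  stats::cor(as.matrix(gene_data), as.matrix(brain_data), method = "kendall")
}

# Toy data: 6 regions, 4 genes, 2 subjects (all values are made up).
set.seed(1)
regions <- paste0("region_", 1:6)
gene_data <- as.data.frame(matrix(
  rnorm(24), nrow = 6,
  dimnames = list(regions, paste0("gene_", 1:4))
))
brain_data <- as.data.frame(matrix(
  rnorm(12), nrow = 6,
  dimnames = list(regions, paste0("sub_", 1:2))
))

gene_by_subject <- kendall_cor(gene_data, brain_data)
dim(gene_by_subject)  # 4 genes by 2 subjects
```

Such a function would then be passed in place of a character string, for example cor_method = kendall_cor.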

aggre_method

A character string specifying the aggregation method. Default is 'mean'. Other options include 'median', 'meanabs', 'meansqr', 'maxmean', 'ks_orig', 'ks_weighted', 'ks_pos_neg_sum', 'sign_test', and 'rank_sum'. If a custom function that takes (geneList, geneSet) as input is provided, the function uses the custom aggregation method and sets aggre_method to 'custom'.
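A custom aggregation function with the (geneList, geneSet) signature might look like the sketch below. It assumes that geneList is a named numeric vector of gene-level values and that geneSet is a character vector of gene names; both the input shapes and the function name trimmed_mean_score are assumptions made for illustration.

```r
# Hypothetical custom aggregation: 20% trimmed mean of the gene-level
# values that belong to the gene set.
# Assumption: geneList is a named numeric vector (names are gene names)
# and geneSet is a character vector of gene names.
trimmed_mean_score <- function(geneList, geneSet) {
  in_set <- geneList[names(geneList) %in% geneSet]
  mean(in_set, trim = 0.2)
}

# Toy inputs (made up). "Z" is in the set but absent from geneList.
geneList <- c(A = 0.9, B = 0.1, C = 0.2, D = 0.3, E = -0.8, G = 0.5)
geneSet <- c("A", "B", "C", "D", "E", "Z")

# Values in the set: 0.9, 0.1, 0.2, 0.3, -0.8. Trimming drops the
# smallest and the largest, leaving 0.1, 0.2, 0.3.
score <- trimmed_mean_score(geneList, geneSet)
```

Compared with 'mean', a trimmed mean is less sensitive to a few genes with extreme values.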

null_model

A character string specifying the null model. Default is 'none', which calculates raw (empirical) scores. Other options include 'spin_brain', 'resample_gene', and 'coexp_matched'.

minGSSize

An integer specifying the minimum gene set size after intersecting with the genes in gene_data. Default is 10.

maxGSSize

An integer specifying the maximum gene set size after intersecting with the genes in gene_data. Default is 200.
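The effect of minGSSize and maxGSSize can be sketched in base R as follows. The helper filter_gene_sets and the toy gene sets are hypothetical, and treating both bounds as inclusive is an assumption; the sketch shows only the idea of intersecting first and filtering by size afterwards.

```r
# Stand-in for colnames(gene_data): the genes with expression data.
gene_names <- paste0("gene_", 1:30)

# Toy gene sets (made up). "not_measured_1" has no expression data.
gene_sets <- list(
  small  = c("gene_1", "gene_2", "gene_3"),
  medium = c(paste0("gene_", 1:12), "not_measured_1"),
  large  = paste0("gene_", 1:30)
)

# Hypothetical helper: keep only genes present in gene_names, then keep
# only sets whose remaining size lies within [minGSSize, maxGSSize].
filter_gene_sets <- function(gene_sets, gene_names,
                             minGSSize = 10, maxGSSize = 200) {
  kept <- lapply(gene_sets, intersect, gene_names)
  sizes <- lengths(kept)
  kept[sizes >= minGSSize & sizes <= maxGSSize]
}

kept <- filter_gene_sets(gene_sets, gene_names,
                         minGSSize = 10, maxGSSize = 20)
names(kept)  # only "medium" (12 genes after intersection) remains
```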

n_cores

An integer specifying the number of cores to use for parallel processing during permutation. Default is 0 (uses all available cores minus one).
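The "all available cores minus one" rule can be reproduced with the parallel package, as in this sketch. The helper resolve_n_cores is hypothetical, and the fallback to a single core when the core count is unknown or equal to one is an assumption, not documented behavior.

```r
# Hypothetical helper that mirrors the documented meaning of n_cores.
resolve_n_cores <- function(n_cores = 0) {
  # A positive value is used as given.
  if (n_cores > 0) {
    return(as.integer(n_cores))
  }
  # detectCores() can return NA when the count cannot be determined.
  available <- parallel::detectCores()
  if (is.na(available)) {
    return(1L)
  }
  # Leave one core free, but never go below one worker (assumption).
  max(1L, as.integer(available) - 1L)
}

requested <- resolve_n_cores(2)   # explicit request
automatic <- resolve_n_cores(0)   # all available cores minus one
```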

n_perm

An integer specifying the number of permutations for null models. Default is NULL, which applies when null_model is 'none'.

perm_id

A matrix of permutation indices for the 'spin_brain' null model. Default is NULL. Either perm_id or coord.l/coord.r must be provided when using the 'spin_brain' model.

coord.l

A matrix of coordinates for the left hemisphere, used in the 'spin_brain' null model. Default is NULL. It can be NULL if coord.r or perm_id is provided.

coord.r

A matrix of coordinates for the right hemisphere, used in the 'spin_brain' null model. Default is NULL. It can be NULL if coord.l or perm_id is provided.

seed

An integer specifying the seed for reproducibility when using the 'spin_brain' model. Default is NULL.

matchcoexp_tol

A numeric value specifying the tolerance for matched co-expression. Lower values result in better matching but require more iterations. Default is 0.05. See resample_geneSetList_matching_coexp for more details.

matchcoexp_max_iter

An integer specifying the maximum number of iterations for matched co-expression. Default is 1,000,000. See resample_geneSetList_matching_coexp for more details.

verbose

A logical indicating whether to print messages during processing. Default is TRUE.

Value

A data frame containing the gene set scores with regions as rows and gene sets as columns.