
This function performs a linear model test on brain score data with the option to use various null models for comparison. It calculates gene set scores, performs linear modeling, calculates p-values, and identifies core genes.

Usage

brainscore.lm_test(
  pred_df,
  cov_df,
  brain_data,
  gene_data,
  annoData,
  gsScoreList.null = NULL,
  cor_method = c("pearson", "spearman", "pls1c", "pls1w", "custom"),
  aggre_method = c("mean", "median", "ks_pos_neg_sum"),
  null_model = c("spin_brain", "resample_gene", "coexp_matched", "none"),
  minGSSize = 10,
  maxGSSize = 200,
  n_cores = 0,
  n_perm = 5000,
  perm_id = NULL,
  coord.l = NULL,
  coord.r = NULL,
  seed = NULL,
  threshold_type = c("sd", "percentile", "none"),
  threshold_value = 1,
  pvalueCutoff = 0.05,
  pAdjustMethod = c("fdr", "holm", "hochberg", "hommel", "bonferroni", "BH", "BY",
    "none"),
  padjCutoff = NULL,
  matchcoexp_tol = 0.05,
  matchcoexp_max_iter = 1e+06,
  gsea_obj = TRUE,
  normality_check = TRUE,
  normality_method = c("ks", "shapiro", "both"),
  normality_alpha = 0.05,
  normality_p_adjust = "fdr",
  normality_shapiro_max_n = 5000,
  normality_seed = NULL
)
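Several arguments in the signature above list all of their allowed values as a character vector. By the usual R convention, the first element is the default, which agrees with the defaults stated under Arguments. The sketch below shows that convention with base R's match.arg(). The helper pick_null_model() is hypothetical and not part of the package, and whether brainscore.lm_test() uses match.arg() internally is an assumption.

```r
# Hypothetical helper, not part of the package: mirrors the vector-default idiom
pick_null_model <- function(null_model = c("spin_brain", "resample_gene",
                                           "coexp_matched", "none")) {
  # match.arg() returns the first choice when the argument is left at its default
  match.arg(null_model)
}

default_choice <- pick_null_model()        # argument left at its default
explicit_choice <- pick_null_model("none") # one choice selected explicitly
print(default_choice)
print(explicit_choice)
```

In practice this means an argument such as null_model only needs to be passed when a non-default option is wanted.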

Arguments

pred_df

Data frame of predictor variables.

cov_df

Data frame of covariate variables.
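To illustrate how a predictor and covariates can enter a linear model of a gene set score, here is a minimal sketch using stats::lm() on simulated data. The column names (group, age, sex), the simulated values, and the choice of the score as the outcome variable are all assumptions made for illustration. This page does not specify the exact model formula that brainscore.lm_test() fits.

```r
set.seed(42)
n <- 100

# Hypothetical stand-ins for pred_df, cov_df and one gene set score per subject
pred_df <- data.frame(group = rbinom(n, 1, 0.5))
cov_df <- data.frame(age = rnorm(n, 40, 10), sex = rbinom(n, 1, 0.5))
gs_score <- 0.5 * pred_df$group + 0.01 * cov_df$age + rnorm(n)

# Combine everything row-wise; rows are assumed to be the same subjects in the same order
dat <- cbind(score = gs_score, pred_df, cov_df)

# Assumed model: score explained by the predictor, adjusted for the covariates
fit <- stats::lm(score ~ group + age + sex, data = dat)

# t statistic of the predictor of interest
t_pred <- summary(fit)$coefficients["group", "t value"]
print(t_pred)
```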

brain_data

Data frame of brain imaging data.

gene_data

Data frame of gene expression data.

annoData

Environment containing annotation data.

gsScoreList.null

Precomputed list of null-model gene set scores, as produced by brainscore() or brainscore.hpc(). Default is NULL.

cor_method

Character string specifying the correlation method. Default is 'pearson'. Other options include 'spearman', 'pls1c', 'pls1w', 'custom'.

aggre_method

Character string or function specifying the aggregation method for the built-in linear regression workflow. Default is 'mean'. Supported character options are 'mean', 'median', and 'ks_pos_neg_sum'. A custom aggregation function can also be provided. Other aggregation methods remain available through brainscore() for customized downstream analyses.
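As a rough illustration of what aggregation means here, the sketch below collapses gene-level values for one gene set into a single score using the mean, the median, and a user-written function. The gene names and values are invented, and abs_mean() is a hypothetical custom aggregator. The arguments that brainscore.lm_test() actually passes to a custom function are not documented on this page, so the single numeric vector argument is an assumption.

```r
# Hypothetical gene-level association values for one gene set
gene_values <- c(GENE_A = 0.30, GENE_B = -0.10, GENE_C = 0.25, GENE_D = 0.05)

# Built-in style aggregations
score_mean <- mean(gene_values)
score_median <- median(gene_values)

# Hypothetical custom aggregator: mean of the absolute values
abs_mean <- function(x) mean(abs(x))
score_custom <- abs_mean(gene_values)

print(c(mean = score_mean, median = score_median, custom = score_custom))
```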

null_model

Character string specifying the null model method. Default is 'spin_brain'. Other options include 'resample_gene', 'coexp_matched', 'none'.

minGSSize

Integer specifying the minimum gene set size. Default is 10.

maxGSSize

Integer specifying the maximum gene set size. Default is 200.

n_cores

Integer specifying the number of cores to use for parallel processing. Default is 0.

n_perm

Integer specifying the number of permutations. Default is 5000.
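The number of permutations bounds the smallest p-value a permutation test can report. The sketch below computes a two-sided empirical p-value with the common "+1" correction, using simulated null statistics. Both the observed statistic and the null distribution are invented, and this formula is a standard choice rather than a confirmed description of what the package computes.

```r
set.seed(1)
n_perm <- 5000

# Stand-ins: an observed test statistic and statistics from n_perm null permutations
obs_stat <- 2.8
null_stats <- rnorm(n_perm)

# Two-sided empirical p-value; the +1 terms keep the estimate strictly above zero
p_emp <- (sum(abs(null_stats) >= abs(obs_stat)) + 1) / (n_perm + 1)

# Smallest p-value attainable with this number of permutations
p_min <- 1 / (n_perm + 1)
print(c(p_emp = p_emp, p_min = p_min))
```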

perm_id

Optional permutation ID.

coord.l

Optional left hemisphere coordinates.

coord.r

Optional right hemisphere coordinates.

seed

Optional random seed for generating perm_id.

threshold_type

Character string specifying the threshold type for core genes. Default is 'sd'. Other options are 'percentile' and 'none'.

threshold_value

Numeric value specifying the threshold level. Default is 1.
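One plausible reading of the 'sd' and 'percentile' threshold types is sketched below: genes are kept as core genes when their value exceeds either the mean plus threshold_value standard deviations, or a chosen percentile. The per-gene values, the direction of the comparison, and the use of 80 as a percentile are assumptions for illustration only. This page does not define how the package applies the threshold.

```r
# Hypothetical per-gene contributions within one gene set
contrib <- c(G1 = 0.1, G2 = 0.2, G3 = 0.3, G4 = 0.4, G5 = 2.0)

# "sd"-style rule (assumed): keep genes more than threshold_value SDs above the mean
threshold_value <- 1
cut_sd <- mean(contrib) + threshold_value * stats::sd(contrib)
core_sd <- names(contrib)[contrib > cut_sd]

# "percentile"-style rule (assumed): keep genes above the 80th percentile
cut_pct <- stats::quantile(contrib, probs = 0.80)
core_pct <- names(contrib)[contrib > cut_pct]

print(core_sd)
print(core_pct)
```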

pvalueCutoff

Numeric value specifying the p-value cutoff for significant results. Default is 0.05.

pAdjustMethod

Character string specifying the p-value adjustment method: one of "fdr", "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", or "none". Default is 'fdr'. See stats::p.adjust() for details.
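The adjustment methods listed here are those of stats::p.adjust(), where "fdr" is an alias for "BH" (Benjamini-Hochberg). The short sketch below applies a few of them to an invented vector of raw p-values and then applies a 0.05 cutoff to the adjusted values.

```r
# Invented raw p-values for five gene sets
p_raw <- c(0.001, 0.012, 0.03, 0.04, 0.25)

# "fdr" and "BH" name the same Benjamini-Hochberg procedure
p_fdr <- stats::p.adjust(p_raw, method = "fdr")
p_bh <- stats::p.adjust(p_raw, method = "BH")

# Bonferroni is more conservative: each p-value is multiplied by the number of tests
p_bonf <- stats::p.adjust(p_raw, method = "bonferroni")

# Flag gene sets whose adjusted p-value falls below a 0.05 cutoff
significant <- p_fdr < 0.05
print(data.frame(p_raw, p_fdr, p_bonf, significant))
```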

padjCutoff

Numeric value specifying the adjusted p-value cutoff for significant results. Default is NULL.

matchcoexp_tol

Numeric value specifying the tolerance for matched coexpression. Default is 0.05.

matchcoexp_max_iter

Integer specifying the maximum number of iterations for matched coexpression. Default is 1000000.

gsea_obj

Logical specifying whether to return a GSEA object. If FALSE, only a results table is returned. Default is TRUE.

normality_check

Logical indicating whether to report normality diagnostics for empirical individual-level gene set scores. Default is TRUE.

normality_method

Character string specifying the normality diagnostic method. Default is "ks". Other options are "shapiro" and "both".

normality_alpha

Numeric significance threshold used to flag non-normal score distributions after p-value adjustment. Default is 0.05.

normality_p_adjust

Character string specifying the method for normality p-value adjustment. Default is "fdr". See stats::p.adjust() for details.

normality_shapiro_max_n

Maximum sample size used for Shapiro-Wilk tests. Default is 5000.

normality_seed

Optional random seed used when subsampling observations for Shapiro-Wilk tests.
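For orientation, the sketch below runs the two kinds of normality diagnostics named by normality_method on simulated scores, using base R only. stats::shapiro.test() accepts between 3 and 5000 values, which is consistent with the default of normality_shapiro_max_n, so larger samples are subsampled first. The simulated data, the standardisation step before the Kolmogorov-Smirnov test, and the subsampling scheme are assumptions, not a description of the package's internals.

```r
set.seed(123)

# Stand-in for one gene set's individual-level scores
scores <- rnorm(6000)
max_n <- 5000  # mirrors the default of normality_shapiro_max_n

# shapiro.test() only accepts 3 to 5000 values, so subsample when the input is larger
sw_input <- if (length(scores) > max_n) sample(scores, max_n) else scores
p_shapiro <- stats::shapiro.test(sw_input)$p.value

# Kolmogorov-Smirnov test against a standard normal after standardising the scores
p_ks <- stats::ks.test(as.numeric(scale(scores)), "pnorm")$p.value

print(c(shapiro = p_shapiro, ks = p_ks))
```

When many gene sets are checked, the resulting p-values would typically be adjusted (for example with stats::p.adjust()) before being compared against a threshold such as normality_alpha.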

Value

The results of the linear model test, including p-values, adjusted p-values, q-values, descriptions, and core genes. These are returned as a GSEA object when gsea_obj is TRUE, and as a data frame otherwise.