source("custom_functions.R")Geography-independent mucosal microbiota alterations in primary sclerosing cholangitis persist after liver transplantation
2. Hypothesis: PSC recurrence is associated with a specific composition of the gut microbiome
1 Introduction
1.1 About
rPSC vs non-rPSC vs Healthy analysis on merged data
Alpha diversity –> group effect, country effect, interaction effect
Beta diversity – PERMANOVA, PCA -> group effect, country effect, interaction effect
DAA ->
Group effect – linDA + MaAsLin2 intersection
Country effect – taxa with significant interaction effect were excluded based on individual post-hoc analysis. Taxa had to have the same direction in both countries
ML models, trained and validated using out-of-sample bootstrap (500 resamplings):
o ENET
o kNN
o GBoost
o RF
Pairwise comparisons:
rPSC vs non-rPSC –by this comparison, we will find the effect of recurrence (whether the microbial composition is associated with it)
non-rPSC vs Healthy – by this comparison, we will find if the non-rPSC samples are „closer“ to the healthy samples then rPSC samples in terms of the microbial composition. In other words, does non-rPSC samples have „healthy microbiome“?
rPSC vs Healthy – by this comparison, we will find how much are the rPSC samples different from healthy samples.
However, the rPSC and non-rPSC samples are not different at all (see results). Therefore, we can do EXCLUSIVE LEFT JOIN of differentially abundant taxa from the analyses (rPSC vs Healthy, non-rPSC vs Healthy)
Importing libraries and custom functions built for this analysis
1.2 Data Import
Importing ASV, taxa and metadata tables for both Czech and Norway samples.
Czech
path = "../../../../data/analysis_ready_data/ikem/"
asv_tab_ikem <- as.data.frame(fread(file.path(path,"asv_table_ikem.csv"),
check.names = FALSE))
taxa_tab_ikem <- as.data.frame(fread(file.path(path,"taxa_table_ikem.csv"),
check.names = FALSE))
metadata_ikem <- as.data.frame(fread(file.path(path,"metadata_ikem.csv"),
check.names = FALSE))Norway
path = "../../../../data/analysis_ready_data/norway/"
asv_tab_norway <- as.data.frame(fread(file.path(path,"asv_table_norway.csv"),
check.names = FALSE))
taxa_tab_norway <- as.data.frame(fread(file.path(path,"taxa_table_norway.csv"),
check.names = FALSE))
metadata_norway <- as.data.frame(fread(file.path(path,"metadata_norway.csv"),
check.names = FALSE))1.3 Merging data
Merging two countries to create whole dataset
asv_tab <- merge(asv_tab_ikem,asv_tab_norway,by="SeqID",all=TRUE)
taxa_tab <- merging_taxa_tables(taxa_tab_ikem,taxa_tab_norway)Merging at ASV level
Finding inconsistencies in taxonomy, trying to keep the ones that have better taxonomy assignment
Merging two countries based on the different matrices - Ileum, Colon.
Terminal ileum
ileum_data <- merging_data(asv_tab_1=asv_tab_ikem,
asv_tab_2=asv_tab_norway,
taxa_tab_1=taxa_tab_ikem,
taxa_tab_2=taxa_tab_norway,
metadata_1=metadata_ikem,
metadata_2=metadata_norway,
segment="TI",Q="Q2")Removing 1498 ASV(s)
Removing 1834 ASV(s)
Merging at ASV level
Finding inconsistencies in taxonomy, trying to keep the ones that have better taxonomy assignment
Removing 641 ASV(s)
ileum_asv_tab <- ileum_data[[1]]
ileum_taxa_tab <- ileum_data[[2]]
ileum_metadata <- ileum_data[[3]]Colon
colon_data <- merging_data(asv_tab_1=asv_tab_ikem,
asv_tab_2=asv_tab_norway,
taxa_tab_1=taxa_tab_ikem,
taxa_tab_2=taxa_tab_norway,
metadata_1=metadata_ikem,
metadata_2=metadata_norway,
segment="colon",Q="Q2")Removing 739 ASV(s)
Removing 266 ASV(s)
Merging at ASV level
Finding inconsistencies in taxonomy, trying to keep the ones that have better taxonomy assignment
Removing 1096 ASV(s)
colon_asv_tab <- colon_data[[1]]
colon_taxa_tab <- colon_data[[2]]
colon_metadata <- colon_data[[3]]2 Data Analysis - Terminal ileum
segment="terminal_ileum"2.1 Filtering
Rules:
sequencing depth > 10000
nearZeroVar() with default settings
Rarefaction Curve
path="../../../intermediate_files/rarecurves"
seq_depth_threshold <- 10000ps <- construct_phyloseq(ileum_asv_tab,ileum_taxa_tab,ileum_metadata)
rareres <- get_rarecurve(obj=ps, chunks=500)
save(rareres,file = file.path(path,"rarefaction_ileum.Rdata"))load(file.path(path,"rarefaction_ileum.Rdata"))
prare <- ggrarecurve(obj=rareres,
factorNames="Country",
indexNames=c("Observe")) +
theme_bw() +
theme(axis.text=element_text(size=8), panel.grid=element_blank(),
strip.background = element_rect(colour=NA,fill="grey"),
strip.text.x = element_text(face="bold")) +
geom_vline(xintercept = seq_depth_threshold,
linetype="dashed",
color = "red") +
xlim(0, 20000)The color has been set automatically, you can reset it manually by adding scale_color_manual(values=yourcolors)
prareLibrary size
read_counts(ileum_asv_tab, line = c(5000,10000))Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
2.1.1 Sequencing depth
data_filt <- seq_depth_filtering(ileum_asv_tab,
ileum_taxa_tab,
ileum_metadata,
seq_depth_threshold = 10000)Removing 104 ASV(s)
filt_ileum_asv_tab <- data_filt[[1]]; alpha_ileum_asv_tab <- filt_ileum_asv_tab
filt_ileum_taxa_tab <- data_filt[[2]]; alpha_ileum_taxa_tab <- filt_ileum_taxa_tab
filt_ileum_metadata <- data_filt[[3]]; alpha_ileum_metadata <- filt_ileum_metadata
seq_step <- dim(filt_ileum_asv_tab)[1]Library size
read_counts(filt_ileum_asv_tab,line = c(5000,10000))Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
2.1.2 NearZeroVar
data_filt <- nearzerovar_filtering(filt_ileum_asv_tab,
filt_ileum_taxa_tab,
filt_ileum_metadata)
filt_ileum_asv_tab <- data_filt[[1]]
filt_ileum_taxa_tab <- data_filt[[2]]
nearzero_step <- dim(filt_ileum_asv_tab)[1]Library size
read_counts(filt_ileum_asv_tab,line = c(5000,10000))Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
2.1.3 Final Counts
final_counts_filtering(ileum_asv_tab,
filt_ileum_asv_tab,
filt_ileum_metadata,
seq_step, 0, nearzero_step) %>% `colnames<-`("Count") Count
Raw data: ASVs 4491
Raw data: Samples 220
Sequencing depth filt: ASVs 4387
Prevalence filt: ASVs 0
NearZeroVar filt: ASVs 575
Filt data: ASVs 575
Filt data: Samples 211
Filt data: Patients 211
Filt data: Patients.1 0
Filtered samples 9
Matrices TI
healthy 73
non-rPSC 105
rPSC 33
pre_ltx 0
post_ltx 0
ETOH 0
2.2 Alpha diversity
path = "../../../results/merged_sites/main_analysis/Q2/alpha_diversity"Calculation
# Construct MPSE object
alpha_ileum_metadata$Sample <- alpha_ileum_metadata$SampleID
ileum_mpse <- as.MPSE(construct_phyloseq(alpha_ileum_asv_tab,
alpha_ileum_taxa_tab,
alpha_ileum_metadata))
ileum_mpse %<>% mp_rrarefy(raresize = 10000,seed = 123)
# Calculate alpha diversity - rarefied counts
ileum_mpse %<>% mp_cal_alpha(.abundance=RareAbundance, force=TRUE)alpha_div_plots <- list()
# preparing data frame
alpha_data <- data.frame(SampleID=ileum_mpse$Sample.x,
Observe=ileum_mpse$Observe,
Shannon=ileum_mpse$Shannon,
Simpson=ileum_mpse$Simpson,
Pielou=ileum_mpse$Pielou,
Group=ileum_mpse$Group,
Country=ileum_mpse$Country,
Patient=ileum_mpse$Patient)
write.csv(alpha_data,file.path(path,paste0("alpha_indices_",segment,".csv")),
row.names = FALSE)2.2.1 Plots
p_boxplot_alpha <- alpha_diversity_countries(alpha_data)Using SampleID, Group, Country, Patient as id variables
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Using SampleID, Group, Country, Patient as id variables
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
# save the results
alpha_div_plots[[paste(segment,"Country")]] <- p_boxplot_alpha
# see the results
p_boxplot_alpha2.2.2 Linear models
path = "../../../results/merged_sites/main_analysis/Q2/alpha_diversity"
alpha_data <- read.csv(file.path(path,paste0("alpha_indices_",segment,".csv")))Richness
results_model <- pairwise.lm(formula = "Observe ~ Group * Country",
factors=alpha_data$Group,
data=alpha_data)
# check interaction
if (!is.data.frame(results_model)){
results_model_observe <- results_model[[1]]
results_model_observe_emeans <- results_model[[2]]
} else {
results_model_observe <- results_model
results_model_observe_emeans <- NA
}
# save the results
pc_observed <- list();
pc_observed[[segment]] <- results_model_observe# see the results
knitr::kable(results_model_observe,digits = 3,
caption = "Raw results of linear model of richness estimation.")| Estimate | Std..Error | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -12.199 | 11.972 | -1.019 | 0.310 | 0.349 | |
| non-rPSC , rPSC - CZ vs NO | -21.486 | 11.452 | -1.876 | 0.063 | 0.141 | |
| non-rPSC vs GrouprPSC:CountryNO | 9.347 | 23.088 | 0.405 | 0.686 | 0.686 | |
| healthy vs GrouprPSC | -37.741 | 12.284 | -3.072 | 0.003 | 0.025 | * |
| healthy , rPSC - CZ vs NO | 11.203 | 10.973 | 1.021 | 0.310 | 0.349 | |
| healthy vs GrouprPSC:CountryNO | -23.342 | 21.355 | -1.093 | 0.277 | 0.349 | |
| healthy vs Groupnon-rPSC | -25.542 | 9.322 | -2.740 | 0.007 | 0.031 | * |
| healthy , non-rPSC - CZ vs NO | 11.203 | 10.933 | 1.025 | 0.307 | 0.349 | |
| healthy vs Groupnon-rPSC:CountryNO | -32.689 | 15.108 | -2.164 | 0.032 | 0.096 |
knitr::kable(results_model_observe_emeans,digits = 3,
caption = "Raw results of independent country analysis")| contrast | Country | estimate | SE | df | t.ratio | p.value |
|---|---|---|---|---|---|---|
| (non-rPSC) - healthy | CZ | -25.542 | 9.322 | 174 | -2.740 | 0.007 |
| (non-rPSC) - healthy | NO | -58.231 | 11.889 | 174 | -4.898 | 0.000 |
Shannon
results_model <- pairwise.lm(formula = "Shannon ~ Group * Country",
factors=alpha_data$Group,
data=alpha_data)
# check interaction
if (!is.data.frame(results_model)){
results_model_shannon <- results_model[[1]]
results_model_shannon_emeans <- results_model[[2]]
} else {
results_model_shannon <- results_model
results_model_shannon_emeans <- NA
}
# save the results
pc_shannon <- list();
pc_shannon[[segment]] <- as.data.frame(results_model_shannon)# see the results
knitr::kable(results_model_shannon,digits = 3,
caption = "Raw results of linear model of Shannon estimation.")| Estimate | Std..Error | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.102 | 0.146 | -0.696 | 0.487 | 0.731 | |
| non-rPSC , rPSC - CZ vs NO | -0.349 | 0.140 | -2.500 | 0.014 | 0.123 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.006 | 0.281 | -0.021 | 0.983 | 0.983 | |
| healthy vs GrouprPSC | -0.210 | 0.140 | -1.501 | 0.137 | 0.321 | |
| healthy , rPSC - CZ vs NO | 0.005 | 0.125 | 0.043 | 0.966 | 0.983 | |
| healthy vs GrouprPSC:CountryNO | -0.360 | 0.244 | -1.478 | 0.142 | 0.321 | |
| healthy vs Groupnon-rPSC | -0.109 | 0.115 | -0.948 | 0.344 | 0.620 | |
| healthy , non-rPSC - CZ vs NO | 0.005 | 0.135 | 0.040 | 0.968 | 0.983 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.354 | 0.186 | -1.905 | 0.058 | 0.263 |
knitr::kable(results_model_shannon_emeans,digits = 3,
caption = "Raw results of independent country analysis")| contrast | Country | estimate | SE | df | t.ratio | p.value |
|---|---|---|---|---|---|---|
| (non-rPSC) - healthy | CZ | -0.109 | 0.115 | 174 | -0.948 | 0.344 |
| (non-rPSC) - healthy | NO | -0.463 | 0.146 | 174 | -3.164 | 0.002 |
Simpson
results_model <- pairwise.lm(formula = "Simpson ~ Group * Country",
factors=alpha_data$Group,
data=alpha_data)
# check interaction
if (!is.data.frame(results_model)){
results_model_simpson <- results_model[[1]]
results_model_simpson_emeans <- results_model[[2]]
} else {
results_model_simpson <- results_model
results_model_simpson_emeans <- NA
}
# save the results
pc_simpson <- list();
pc_simpson[[segment]] <- as.data.frame(results_model_simpson)# see the results
knitr::kable(results_model_simpson,digits = 3,
caption = "Raw results of linear model of Simpson estimation.")| Estimate | Std..Error | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | 0.000 | 0.023 | 0.019 | 0.985 | 0.985 | |
| non-rPSC , rPSC - CZ vs NO | -0.034 | 0.022 | -1.573 | 0.118 | 0.531 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.029 | 0.043 | -0.661 | 0.510 | 0.767 | |
| healthy vs GrouprPSC | -0.009 | 0.018 | -0.500 | 0.618 | 0.767 | |
| healthy , rPSC - CZ vs NO | -0.008 | 0.016 | -0.501 | 0.617 | 0.767 | |
| healthy vs GrouprPSC:CountryNO | -0.055 | 0.031 | -1.779 | 0.078 | 0.531 | |
| healthy vs Groupnon-rPSC | -0.009 | 0.016 | -0.564 | 0.574 | 0.767 | |
| healthy , non-rPSC - CZ vs NO | -0.008 | 0.019 | -0.410 | 0.682 | 0.767 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.026 | 0.027 | -0.975 | 0.331 | 0.767 |
knitr::kable(results_model_simpson_emeans,digits = 3,
caption = "Raw results of independent country analysis")| contrast | Country | estimate | SE | df | t.ratio | p.value |
|---|---|---|---|---|---|---|
| rPSC - healthy | CZ | -0.009 | 0.018 | 102 | -0.500 | 0.618 |
| rPSC - healthy | NO | -0.064 | 0.025 | 102 | -2.526 | 0.013 |
Pielou
results_model <- pairwise.lm(formula = "Pielou ~ Group * Country",
factors=alpha_data$Group,
data=alpha_data)
# check interaction
if (!is.data.frame(results_model)){
results_model_pielou <- results_model[[1]]
results_model_pielou_emeans <- results_model[[2]]
} else {
results_model_pielou <- results_model
results_model_pielou_emeans <- NA
}
# save the results
pc_pielou <- list();
pc_pielou[[segment]] <- as.data.frame(results_model_pielou)# see the results
knitr::kable(results_model_pielou,digits = 3,
caption = "Raw results of linear model of Pielou estimation.")| Estimate | Std..Error | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.004 | 0.022 | -0.184 | 0.854 | 0.946 | |
| non-rPSC , rPSC - CZ vs NO | -0.047 | 0.021 | -2.260 | 0.025 | 0.229 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.005 | 0.042 | -0.108 | 0.914 | 0.946 | |
| healthy vs GrouprPSC | -0.001 | 0.021 | -0.067 | 0.946 | 0.946 | |
| healthy , rPSC - CZ vs NO | -0.009 | 0.018 | -0.463 | 0.645 | 0.946 | |
| healthy vs GrouprPSC:CountryNO | -0.043 | 0.036 | -1.195 | 0.235 | 0.705 | |
| healthy vs Groupnon-rPSC | 0.003 | 0.018 | 0.146 | 0.884 | 0.946 | |
| healthy , non-rPSC - CZ vs NO | -0.009 | 0.021 | -0.409 | 0.683 | 0.946 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.038 | 0.029 | -1.329 | 0.185 | 0.705 |
knitr::kable(results_model_pielou_emeans,digits = 3,
caption = "Raw results of independent country analysis")Table: Raw results of independent country analysis
2.2.3 Saving results
alpha_list <- list(
Richness=pc_observed[[segment]] %>% rownames_to_column("Comparison"),
Shannon=pc_shannon[[segment]] %>% rownames_to_column("Comparison"),
Simpson=pc_simpson[[segment]] %>% rownames_to_column("Comparison"),
Pielou=pc_pielou[[segment]] %>% rownames_to_column("Comparison"))
write.xlsx(alpha_list,
file = file.path(path,paste0("alpha_diversity_results_",segment,".xlsx")))2.3 Beta diversity
Calculating Aitchison distance (euclidean distance on clr-transformed data), both at ASV and genus level.
2.3.1 Main analysis
Genus level, Aitchison distance
level="genus"
path = "../../../results/merged_sites/main_analysis/Q2/beta_diversity"
pairwise_aitchison_raw <- list()
pca_plots_list <- list()Aggregation, filtering
# Aggregation
genus_data <- aggregate_taxa(ileum_asv_tab,
ileum_taxa_tab,
taxonomic_level=level,
names=TRUE)
# Filtration
filt_data <- filtering_steps(genus_data[[1]],
genus_data[[2]],
ileum_metadata,
seq_depth_threshold=10000)Removing 2 ASV(s)
filt_ileum_genus_tab <- filt_data[[1]]
filt_ileum_genus_taxa <- filt_data[[2]]
filt_ileum_metadata <- filt_data[[3]]2.3.1.0.1 PERMANOVA
pairwise_df <- filt_ileum_genus_tab %>% column_to_rownames("SeqID") %>% t()
# main effect
pp_main <- pairwise.adonis(pairwise_df,
filt_ileum_metadata$Group,
covariate = filt_ileum_metadata$Country, sim.method = "robust.aitchison", p.adjust.m="BH")
# interaction
pp_int <- pairwise.adonis(pairwise_df,filt_ileum_metadata$Group,covariate = filt_ileum_metadata$Country, interaction = TRUE, sim.method = "robust.aitchison", p.adjust.m="BH")
# tidy the results
pp_factor <- pp_main[[1]]
pp_cov <- pp_main[[2]]
pp_fac.cov <- pp_int[[3]]
cols <- c("pairs","Df","SumsOfSqs", "F.Model","R2","p.value", "p.adjusted", "sig")
colnames(pp_factor) <- cols; colnames(pp_cov) <- cols; colnames(pp_fac.cov) <- cols;
# save raw results
pairwise_aitchison_raw[[paste(level, segment)]] <- rbind(pp_factor,pp_cov,pp_fac.cov)# see the results
knitr::kable(pp_factor,digits = 3,caption = "PERMANOVA, GROUP separation")| pairs | Df | SumsOfSqs | F.Model | R2 | p.value | p.adjusted | sig |
|---|---|---|---|---|---|---|---|
| rPSC vs non-rPSC | 1 | 215.579 | 1.208 | 0.009 | 0.107 | 0.107 | |
| rPSC vs healthy | 1 | 561.696 | 3.345 | 0.030 | 0.001 | 0.002 | ** |
| non-rPSC vs healthy | 1 | 760.242 | 4.431 | 0.024 | 0.001 | 0.002 | ** |
knitr::kable(pp_cov,digits = 3,caption = "PERMANOVA, COUNTRY separation")| pairs | Df | SumsOfSqs | F.Model | R2 | p.value | p.adjusted | sig |
|---|---|---|---|---|---|---|---|
| rPSC vs non-rPSC , Country | 1 | 601.929 | 3.372 | 0.024 | 0.001 | 0.001 | *** |
| rPSC vs healthy , Country | 1 | 496.841 | 2.959 | 0.027 | 0.001 | 0.001 | *** |
| non-rPSC vs healthy , Country | 1 | 624.474 | 3.640 | 0.020 | 0.001 | 0.001 | *** |
knitr::kable(pp_fac.cov,digits = 3,caption = "PERMANOVA, INTERACTION GROUP:Country")| pairs | Df | SumsOfSqs | F.Model | R2 | p.value | p.adjusted | sig |
|---|---|---|---|---|---|---|---|
| rPSC vs non-rPSC : Country | 1 | 156.176 | 0.874 | 0.006 | 0.788 | 0.788 | |
| rPSC vs healthy : Country | 1 | 164.743 | 0.981 | 0.009 | 0.464 | 0.696 | |
| non-rPSC vs healthy : Country | 1 | 209.954 | 1.225 | 0.007 | 0.089 | 0.267 |
Interaction check
interaction_sig <- pp_fac.cov$pairs[pp_fac.cov$p.adjusted < 0.05]
if (length(interaction_sig)>0){
for (i in 1:length(interaction_sig)){
group1 <- unlist(strsplit(interaction_sig[i],split = " vs "))[1]
group2 <- unlist(strsplit(interaction_sig[i],split = " vs "))[2]
group2 <- unlist(strsplit(group2,split = " : "))[1]
result_list <- adonis_postanalysis(x=pairwise_df,
factors = filt_ileum_metadata$Group,
covariate = filt_ileum_metadata$Country,
group1 = group1,
group2 = group2)
print(result_list)
}
}2.3.1.0.2 Plots
p <- pca_plot_custom(filt_ileum_genus_tab,
filt_ileum_genus_taxa,
filt_ileum_metadata,
show_boxplots = TRUE,
variable = "Group", size=3, show_legend=FALSE)
# save the results
pca_plots_list[[paste(segment,level,"custom")]] <- p
# see the results
p2.3.1.1 Saving results
write.xlsx(pairwise_aitchison_raw[[paste(level, segment)]],
file = file.path(path,
paste0("beta_diversity_results_", segment,".xlsx")))2.4 Univariate Analysis
2.4.1 Main analysis
level="genus"
# needed paths
path = "../../../results/merged_sites/main_analysis/Q2/univariate_analysis"
path_maaslin=file.path("../../../intermediate_files/maaslin/Q2",level)
# variables
raw_linda_results_genus <- list();
raw_linda_results_genus[[segment]] <- list()
linda_results_genus <- list();
linda_results_genus[[segment]] <- list()
# country and interaction problems
list_country_union <- list()
list_intersections <- list()
list_venns <- list()
uni_statistics <- list()
# workbook for final df
wb <- createWorkbook()
# rPSC effect
rpsc_effect <- list()Aggregate taxa
genus_data <- aggregate_taxa(ileum_asv_tab,
ileum_taxa_tab,
taxonomic_level = level)
ileum_genus_tab <- genus_data[[1]]
ileum_genus_taxa_tab <- genus_data[[2]]
ileum_genus_asv_taxa_tab <- create_asv_taxa_table(ileum_genus_tab,
ileum_genus_taxa_tab)2.4.1.1 rPSC vs non-rPSC
group <- c("non-rPSC","rPSC")
comparison_name <- paste0(group[1], " vs ",group[2])2.4.1.1.1 linDA
taxa_to_test <- read.xlsx("../../../results/merged_sites/main_analysis/Q1/univariate_analysis/significant_taxa_terminal_ileum.xlsx", sheet=" genus healthy vs pre_ltx")$SeqID
ileum_genus_tab_2 <- ileum_genus_tab %>% dplyr::filter(SeqID %in% taxa_to_test)
ileum_genus_taxa_tab_2 <- ileum_genus_taxa_tab %>% dplyr::filter(SeqID %in% taxa_to_test)
# prepare the data
linda_data <- binomial_prep(ileum_genus_tab_2,
ileum_genus_taxa_tab_2,
ileum_metadata,
group, usage="linDA", filtering = FALSE)
filt_ileum_uni_data <- linda_data[[1]]
filt_ileum_uni_taxa <- linda_data[[2]]
filt_ileum_uni_metadata <- linda_data[[3]]
# fit the model
linda.obj <- linda(filt_ileum_uni_data,
filt_ileum_uni_metadata,
formula = '~ Group * Country')0 features are filtered!
The filtered data has 56 samples and 25 features will be tested!
Imputation approach is used.
Fit linear models ...
Completed.
linda.output <- linda.obj$output
linda.output <- linda_renaming(linda.output, group)
# save the results
group1 <- paste0(group[1], " vs ","Group",group[2])
group2 <- paste0(group[1], " , ",group[2], " - CZ vs NO")
group3 <- paste0(group[1], " vs ","Group",group[2], ":CountryNO")
for (grp in c(group1,group2,group3)){
raw_linda_results_genus[[segment]][[grp]] <-
rawlinda.df(linda.output,
grp,
filt_ileum_uni_data,
filt_ileum_uni_taxa)
linda_results_genus[[segment]][[grp]] <-
linda.df(linda.output,
grp,
filt_ileum_uni_data,
filt_ileum_uni_taxa)
}# volcano plot
volcano_1 <- volcano_plot_linda(linda.output,
group1,
taxa_table = filt_ileum_uni_taxa) +
ggtitle(comparison_title)
volcano_2 <- volcano_plot_linda(linda.output, group2,
taxa_table = filt_ileum_uni_taxa) +
ggtitle("NO vs CZ")
volcano_3 <- volcano_plot_linda(linda.output, group3,
taxa_table = filt_ileum_uni_taxa) +
ggtitle("Interaction effect")
volcano <- ggarrange(volcano_1,volcano_2,volcano_3, ncol=3)
volcano2.4.1.1.2 MaAsLin2
volcano1 <- volcano_plot_maaslin(fit_data,filt_ileum_uni_taxa) +
ggtitle(comparison_title)
volcano2 <- volcano_plot_maaslin(fit_data,filt_ileum_uni_taxa,variable="Country") +
ggtitle("NO vs CZ")
volcano <- ggarrange(volcano1,volcano2, ncol=2)
volcano2.4.1.1.3 Group - Intersection
intersection_results <- group_intersection(group, list_intersections, list_venns,
linda.output, fit_data,
raw_linda_results_genus,
segment = segment,
level=level)
list_intersections <- intersection_results[[1]]
list_venns <- intersection_results[[2]]
venn <- intersection_results[[3]]
# show the results
venn2.4.1.1.4 Interaction effect
list_interaction_significant <- country_interaction(group,
linda.output,
list_intersections,
filt_ileum_uni_data,
filt_ileum_uni_metadata,
segment=segment,
level=level)
# see the result
## significant interaction effect
list_interaction_significant[[1]]NULL
## results for czech cohort
list_interaction_significant[[2]][1] NA
## results for norwegian cohort
list_interaction_significant[[3]][1] NA
Removing problematic taxa
list_intersections <- removing_interaction_problems(group,
list_interaction_significant,
list_intersections,
segment=segment,
level=level)2.4.1.1.5 Basic statistics
uni_df <- merge(basic_univariate_statistics(linda_data,group),
raw_linda_results_genus[[segment]][[group1]],
by="SeqID",all=TRUE)[1] "non-rPSC"
[1] "rPSC"
[1] "non-rPSC"
[1] "rPSC"
uni_df[["final_sig"]] <- uni_df$SeqID %in% list_intersections[[paste(segment,level,comparison_name)]][["SeqID"]]
uni_statistics[[segment]][[paste(level,comparison_name)]] <- uni_df
# for comparison
new_name <- comparison_name
addWorksheet(wb, sheetName = new_name)
writeData(wb, sheet = new_name, uni_df, rowNames=FALSE)2.4.1.1.6 Prevalence testing
uni_df$prevalence_diff <- abs(uni_df$`PREVALENCE_ rPSC` - uni_df$`PREVALENCE_ non-rPSC`)
#uni_df %<>% dplyr::filter(prevalence_diff >=0.2)
#filt_colon_uni_data <- filt_colon_uni_data[uni_df$SeqID,]
filt_ileum_uni_data <- ifelse(filt_ileum_uni_data>0,"YES","NO")
filt_ileum_uni_data %<>% t() %>% as.data.frame()
#filt_colon_uni_taxa %<>% dplyr::filter(SeqID %in% uni_df$SeqID)
filt_ileum_uni_data <- merge(filt_ileum_uni_data %>% rownames_to_column("SampleID"),filt_ileum_metadata,by="SampleID",all.x=TRUE) %>% column_to_rownames("SampleID")
p_values <- c()
for (taxon in uni_df$SeqID){
data <- filt_ileum_uni_data %>%
group_by(!!sym(taxon), Group) %>%
count() %>%
group_by(!!sym(taxon), Group) %>%
summarise(n = sum(n), .groups = "drop") %>%
pivot_wider(names_from = Group, values_from = n) %>%
column_to_rownames(taxon) %>%
as.matrix()
if (nrow(data)>1){
data[is.na(data)] <- 0
chi_res <- chisq.test(data)
chi_res$expected
#p_values <- c(p_values,chi_res$p.value)
test <- fisher.test(data)
test$p.value
p_values <- c(p_values,test$p.value)
} else p_values <- c(p_values,NA)
}Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
p_values_adjusted <- p.adjust(p_values,method = "BH")
names(p_values_adjusted) <- uni_df$SeqID
names(p_values) <- uni_df$SeqID
p_values_adjusted[which(p_values_adjusted<0.05)]named numeric(0)
asv_tab <- as.data.frame(rrarefy(
linda_data[[1]] %>% t(),
sample=10000)) %>% t() %>% as.data.frame()
asv_tab <- ifelse(asv_tab>10,"YES","NO")
asv_tab %<>% t() %>% as.data.frame()
asv_tab <- merge(asv_tab %>% rownames_to_column("SampleID"),filt_ileum_metadata,by="SampleID",all.x=TRUE) %>% column_to_rownames("SampleID")
p_values <- c()
for (taxon in uni_df$SeqID){
data <- asv_tab %>%
group_by(!!sym(taxon), Group) %>%
count() %>%
group_by(!!sym(taxon), Group) %>%
summarise(n = sum(n), .groups = "drop") %>%
pivot_wider(names_from = Group, values_from = n) %>%
column_to_rownames(taxon) %>%
as.matrix()
if (nrow(data)>1){
data[is.na(data)] <- 0
chi_res <- chisq.test(data)
chi_res$expected
test <- fisher.test(data)
test$p.value
chi_res$p.value
p_values <- c(p_values,test$p.value)
} else p_values <- c(p_values,NA)
}Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
p_values_adjusted <- p.adjust(p_values,method = "BH")
names(p_values_adjusted) <- uni_df$SeqID
names(p_values) <- uni_df$SeqID
p_values_adjusted[which(p_values_adjusted<0.05)]named numeric(0)
df <- data.frame(SeqID=uni_df$SeqID,
p=p_values,
p_adj=p_values_adjusted)prevalence_df <- uni_df[,c("SeqID","PREVALENCE_ non-rPSC","PREVALENCE_ rPSC")]
prevalence_df <- merge(prevalence_df,df,by="SeqID",all=TRUE)
write.xlsx(prevalence_df,file.path(path,paste0("supplements_prevalence_",segment,".xlsx")))2.4.1.2 rPSC vs healthy
group <- c("healthy","rPSC")
comparison_name <- paste0(group[1], " vs ",group[2])2.4.1.2.1 linDA
# prepare the data
linda_data <- binomial_prep(ileum_genus_tab,
ileum_genus_taxa_tab,
ileum_metadata,
group, usage="linDA")Removing 67 ASV(s)
Removing 4 ASV(s)
filt_ileum_uni_data <- linda_data[[1]]
filt_ileum_uni_taxa <- linda_data[[2]]
filt_ileum_uni_metadata <- linda_data[[3]]
# fit the model
linda.obj <- linda(filt_ileum_uni_data,
filt_ileum_uni_metadata,
formula = '~ Group * Country')0 features are filtered!
The filtered data has 106 samples and 207 features will be tested!
Imputation approach is used.
Fit linear models ...
Completed.
linda.output <- linda.obj$output
linda.output <- linda_renaming(linda.output, group)
# save the results
group1 <- paste0(group[1], " vs ","Group",group[2])
group2 <- paste0(group[1], " , ",group[2], " - CZ vs NO")
group3 <- paste0(group[1], " vs ","Group",group[2], ":CountryNO")
for (grp in c(group1,group2,group3)){
raw_linda_results_genus[[segment]][[grp]] <-
rawlinda.df(linda.output,
grp,
filt_ileum_uni_data,
filt_ileum_uni_taxa)
linda_results_genus[[segment]][[grp]] <-
linda.df(linda.output,
grp,
filt_ileum_uni_data,
filt_ileum_uni_taxa)
}# volcano plot
volcano_1 <- volcano_plot_linda(linda.output,
group1,
taxa_table = filt_ileum_uni_taxa) +
ggtitle(comparison_title)
volcano_2 <- volcano_plot_linda(linda.output, group2,
taxa_table = filt_ileum_uni_taxa) +
ggtitle("NO vs CZ")
volcano_3 <- volcano_plot_linda(linda.output, group3,
taxa_table = filt_ileum_uni_taxa) +
ggtitle("Interaction effect")
volcano <- ggarrange(volcano_1,volcano_2,volcano_3, ncol=3)2.4.1.2.2 MaAsLin2
volcano1 <- volcano_plot_maaslin(fit_data,filt_ileum_uni_taxa) +
ggtitle(comparison_title)
volcano2 <- volcano_plot_maaslin(fit_data,filt_ileum_uni_taxa,variable="Country") +
ggtitle("NO vs CZ")
volcano <- ggarrange(volcano1,volcano2, ncol=2)
volcano2.4.1.2.3 Group - Intersection
intersection_results <- group_intersection(group, list_intersections, list_venns,
linda.output, fit_data,
raw_linda_results_genus,
segment = segment,
level=level)
list_intersections <- intersection_results[[1]]
list_venns <- intersection_results[[2]]
venn <- intersection_results[[3]]
# show the results
venn2.4.1.2.4 Interaction effect
list_interaction_significant <- country_interaction(group,
linda.output,
list_intersections,
filt_ileum_uni_data,
filt_ileum_uni_metadata,
segment=segment,
level=level)
# see the result
## significant interaction effect
list_interaction_significant[[1]]NULL
## results for czech cohort
list_interaction_significant[[2]][1] NA
## results for norwegian cohort
list_interaction_significant[[3]][1] NA
Removing problematic taxa
list_intersections <- removing_interaction_problems(group,
list_interaction_significant,
list_intersections,
segment=segment,
level=level)2.4.1.2.5 Basic statistics
uni_df <- merge(basic_univariate_statistics(linda_data,group),
raw_linda_results_genus[[segment]][[group1]],
by="SeqID",all=TRUE)[1] "healthy"
[1] "rPSC"
[1] "healthy"
[1] "rPSC"
uni_df[["final_sig"]] <- uni_df$SeqID %in% list_intersections[[paste(segment,level,comparison_name)]][["SeqID"]]
uni_statistics[[segment]][[paste(level,comparison_name)]] <- uni_df
# for comparison
new_name <- comparison_name
addWorksheet(wb, sheetName = new_name)
writeData(wb, sheet = new_name, uni_df, rowNames=FALSE)2.4.1.3 non-rPSC vs healthy
group <- c("healthy","non-rPSC")
comparison_name <- paste0(group[1], " vs ",group[2])2.4.1.3.1 linDA
# prepare the data
linda_data <- binomial_prep(ileum_genus_tab,
ileum_genus_taxa_tab,
ileum_metadata,
group, usage="linDA")Removing 24 ASV(s)
Removing 2 ASV(s)
filt_ileum_uni_data <- linda_data[[1]]
filt_ileum_uni_taxa <- linda_data[[2]]
filt_ileum_uni_metadata <- linda_data[[3]]
# fit the model
linda.obj <- linda(filt_ileum_uni_data,
filt_ileum_uni_metadata,
formula = '~ Group * Country')0 features are filtered!
The filtered data has 178 samples and 179 features will be tested!
Imputation approach is used.
Fit linear models ...
Completed.
linda.output <- linda.obj$output
linda.output <- linda_renaming(linda.output, group)
# save the results
group1 <- paste0(group[1], " vs ","Group",group[2])
group2 <- paste0(group[1], " , ",group[2], " - CZ vs NO")
group3 <- paste0(group[1], " vs ","Group",group[2], ":CountryNO")
for (grp in c(group1,group2,group3)){
raw_linda_results_genus[[segment]][[grp]] <-
rawlinda.df(linda.output,
grp,
filt_ileum_uni_data,
filt_ileum_uni_taxa)
linda_results_genus[[segment]][[grp]] <-
linda.df(linda.output,
grp,
filt_ileum_uni_data,
filt_ileum_uni_taxa)
}# volcano plot
volcano_1 <- volcano_plot_linda(linda.output,
group1,
taxa_table = filt_ileum_uni_taxa) +
ggtitle(comparison_title)
volcano_2 <- volcano_plot_linda(linda.output, group2,
taxa_table = filt_ileum_uni_taxa) +
ggtitle("NO vs CZ")
volcano_3 <- volcano_plot_linda(linda.output, group3,
taxa_table = filt_ileum_uni_taxa) +
ggtitle("Interaction effect")
volcano <- ggarrange(volcano_1,volcano_2,volcano_3, ncol=3)
volcano2.4.1.3.2 MaAsLin2
volcano1 <- volcano_plot_maaslin(fit_data,filt_ileum_uni_taxa) +
ggtitle(comparison_title)
volcano2 <- volcano_plot_maaslin(fit_data,filt_ileum_uni_taxa,variable="Country") +
ggtitle("NO vs CZ")
volcano <- ggarrange(volcano1,volcano2, ncol=2)
volcano2.4.1.3.3 Group - Intersection
intersection_results <- group_intersection(group, list_intersections, list_venns,
linda.output, fit_data,
raw_linda_results_genus,
segment = segment,
level=level)
list_intersections <- intersection_results[[1]]
list_venns <- intersection_results[[2]]
venn <- intersection_results[[3]]
# show the results
venn2.4.1.3.4 Interaction effect
list_interaction_significant <- country_interaction(group,
linda.output,
list_intersections,
filt_ileum_uni_data,
filt_ileum_uni_metadata,
segment=segment,
level=level)
# see the result
## significant interaction effect
list_interaction_significant[[1]]NULL
## results for czech cohort
list_interaction_significant[[2]][1] NA
## results for norwegian cohort
list_interaction_significant[[3]][1] NA
Removing problematic taxa
list_intersections <- removing_interaction_problems(group,
list_interaction_significant,
list_intersections,
segment=segment,
level=level)2.4.1.3.5 Basic statistics
uni_df <- merge(basic_univariate_statistics(linda_data,group),
raw_linda_results_genus[[segment]][[group1]],
by="SeqID",all=TRUE)[1] "healthy"
[1] "non-rPSC"
[1] "healthy"
[1] "non-rPSC"
uni_df[["final_sig"]] <- uni_df$SeqID %in% list_intersections[[paste(segment,level,comparison_name)]][["SeqID"]]
uni_statistics[[segment]][[paste(level,comparison_name)]] <- uni_df
# for comparison
new_name <- comparison_name
addWorksheet(wb, sheetName = new_name)
writeData(wb, sheet = new_name, uni_df, rowNames=FALSE)2.4.1.4 Visualization
Heatmap visualizing the linDA’s logFoldChange for taxa with p < 0.1.
list_heatmap <- list_intersections[grep(paste(segment,level),
names(list_intersections),value=TRUE)]
p_heatmap_linda <- heatmap_linda(list_heatmap,ileum_taxa_tab)
p_heatmap_lindaDot heatmap
dotheatmap_linda <- dot_heatmap_linda(
list_heatmap, uni_statistics$terminal_ileum[grepl(level,names(uni_statistics$terminal_ileum))],
ileum_taxa_tab) + xlab("") + ylab("")min_clr -1.727234
max_clr 7.029952
min_log -4.215042
max_log 3.331439
dotheatmap_lindaHorizontal bar plot
p_prevalence <- horizontal_barplot(wb,taxa=levels(dotheatmap_linda$data$SeqID))Using SeqID as id variables
p_prevalence_final <- ggarrange(p_prevalence,
ggplot() + theme_minimal(),
nrow = 2,heights = c(1,0.085))
p <- ggarrange(dotheatmap_linda + theme(legend.position = "none"),p_prevalence_final,ncol=2,widths = c(1,0.3))
pdot_heatmap_ileum <- p2.4.1.5 rPSC effect
pre_LTx vs Healthy and Post_LTx vs Healthy intersection
A <- list_intersections[[paste(segment,level,"healthy vs rPSC")]]
B <- list_intersections[[paste(segment,level,"healthy vs non-rPSC")]]
df <- A[!(A$SeqID %in% B$SeqID),]
rpsc_effect[[paste(segment,level)]] <- df
# see the results
rpsc_effect[[paste(segment,level)]] SeqID
6 Collinsella
8 Adlercreutzia
15 Bacteroides
16 Barnesiella
29 Parabacteroides
32 Bilophila
41 Holdemania
75 Lachnospiraceae_ND3007_group
76 Lachnospiraceae_NK4A136_group
84 Sellimonas
99 Butyricicoccus
134 Family_XIII_AD3011_group
155 Pseudomonas
169 Lacticaseibacillus
188 Veillonellaceae
199 Klebsiella
Taxonomy
6 k__Bacteria;p__Actinobacteriota;c__Coriobacteriia;o__Coriobacteriales;f__Coriobacteriaceae;g__Collinsella
8 k__Bacteria;p__Actinobacteriota;c__Coriobacteriia;o__Coriobacteriales;f__Eggerthellaceae;g__Adlercreutzia
15 k__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides
16 k__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Barnesiellaceae;g__Barnesiella
29 k__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Tannerellaceae;g__Parabacteroides
32 k__Bacteria;p__Desulfobacterota;c__Desulfovibrionia;o__Desulfovibrionales;f__Desulfovibrionaceae;g__Bilophila
41 k__Bacteria;p__Firmicutes;c__Bacilli;o__Erysipelotrichales;f__Erysipelotrichaceae;g__Holdemania
75 k__Bacteria;p__Firmicutes;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Lachnospiraceae_ND3007_group
76 k__Bacteria;p__Firmicutes;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Lachnospiraceae_NK4A136_group
84 k__Bacteria;p__Firmicutes;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Sellimonas
99 k__Bacteria;p__Firmicutes;c__Clostridia;o__Oscillospirales;f__Butyricicoccaceae;g__Butyricicoccus
134 k__Bacteria;p__Firmicutes;c__Clostridia;o__Peptostreptococcales-Tissierellales;f__Anaerovoracaceae;g__Family_XIII_AD3011_group
155 k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas
169 k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lacticaseibacillus
188 k__Bacteria;p__Firmicutes;c__Negativicutes;o__Veillonellales-Selenomonadales;f__Veillonellaceae;g__Veillonellaceae
199 k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Klebsiella
log2FoldChange p_value padj
6 -2.349468 1.092155e-02 0.0443286425
8 -1.638315 7.387504e-03 0.0332437679
15 -2.119974 3.932211e-04 0.0040698386
16 -4.215042 1.576685e-05 0.0004079671
29 -3.049283 2.871549e-05 0.0005403732
32 -2.837220 4.305534e-05 0.0007427046
41 -1.686545 1.351704e-03 0.0099929559
75 -2.216009 4.785012e-03 0.0260657213
76 -2.596492 1.644208e-04 0.0020100837
84 1.773614 1.044921e-02 0.0441425798
99 -1.751985 3.166657e-03 0.0204843095
134 -2.011614 1.298386e-04 0.0019186055
155 1.025212 8.916234e-03 0.0384512596
169 1.281555 1.747899e-04 0.0020100837
188 1.244317 1.665793e-04 0.0020100837
199 2.461632 4.570063e-03 0.0260223513
2.4.1.6 Saving results
# ALL DATA
saveWorkbook(wb,file.path(path,paste0("uni_analysis_wb_",segment,".xlsx")),
overwrite = TRUE)
# PSC effect
write.xlsx(rpsc_effect[[paste(segment,level)]],file.path(path,paste0("rpsc_effect_",segment,".xlsx")))
# SIGNIFICANT taxa
write.xlsx(list_intersections[grepl(segment,names(list_intersections))] %>%
`names<-`(gsub(segment, "", names(
list_intersections[grepl(segment,names(list_intersections))]))),
file.path(path,paste0("significant_taxa_",segment,".xlsx")))2.5 Machine learning
path = "../../../results/merged_sites/main_analysis/Q2/models"2.5.1 ElasticNet
model="enet"2.5.1.1 Genus level
level="genus"Aggregate taxa
genus_data <- aggregate_taxa(ileum_asv_tab,
ileum_taxa_tab,
taxonomic_level = level)
ileum_genus_tab <- genus_data[[1]]
ileum_genus_taxa_tab <- genus_data[[2]]2.5.1.1.1 rPSC vs non-rPSC - all
group <- c("rPSC","non-rPSC")
comparison_name <- paste0(group[1], " vs ",group[2])model_name <- paste(comparison_name,level,segment)
# prepare the data
filt_ileum_uni_data <- binomial_prep(ileum_genus_tab,
ileum_genus_taxa_tab,
ileum_metadata,
group,
usage="ml_clr")Removing 55 ASV(s)
Removing 2 ASV(s)
# fit the model
enet_model <- glmnet_binomial(filt_ileum_uni_data,
sample_method = "atypboot",
outcome="Group",
N=500,
reuse=TRUE,
file=model_name,
Q="Q2")
# ROC
roc_c <- roc_curve(enet_model, group)
# save the results
models_summ <- list()
models_cm <- list()
roc_cs <- list()
betas <- list()
models_summ[[model_name]] <- enet_model$model_summary
models_cm[[model_name]] <- enet_model$conf_matrices$original
roc_cs[[model_name]] <- enet_model$kfold_rocobjs
betas[[model_name]] <- as.matrix(enet_model$betas)
# see the results
enet_model$model_summary %>% t() [,1]
alpha 0.4000000
lambda 0.3119645
auc 0.6432900
auc_czech 0.6567842
auc_no 0.6069959
auc_optimism_corrected 0.5736934
auc_optimism_corrected_CIL 0.3905857
auc_optimism_corrected_CIU 0.7322617
accuracy 0.7608696
accuracy_czech NaN
accuracy_no 0.7500000
accuracy_optimism_corrected 0.7353858
accuracy_optimism_corrected_CIL 0.6288683
accuracy_optimism_corrected_CIU 0.8277572
enet_model$conf_matrices$original
0
0 105 0
1 33 0
$czech
0
0 78 0
1 24 0
$no
0
0 27 0
1 9 0
enet_model$plot`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
roc_c2.5.1.1.2 rPSC vs non-rPSC - subset
group <- c("rPSC","non-rPSC")
comparison_name <- paste0(group[1], " vs ",group[2]," subset")taxa_to_test <- read.xlsx("../../../results/merged_sites/main_analysis/Q1/univariate_analysis/significant_taxa_terminal_ileum.xlsx", sheet=" genus healthy vs pre_ltx")
model_name <- paste(comparison_name,level,segment)
# prepare the data
filt_ileum_uni_data <- binomial_prep_psc_effect(ileum_genus_tab,
ileum_genus_taxa_tab,
ileum_metadata,
taxa_to_test,
usage="ml_clr")Removing 55 ASV(s)
Removing 2 ASV(s)
# fit the model
enet_model <- glmnet_binomial(filt_ileum_uni_data,
sample_method = "atypboot",
outcome="Group",
N=500,
reuse=TRUE,
file=model_name,
Q="Q2")
# ROC
roc_c <- roc_curve(enet_model, group)
# save the results
models_summ[[model_name]] <- enet_model$model_summary
models_cm[[model_name]] <- enet_model$conf_matrices$original
roc_cs[[model_name]] <- enet_model$kfold_rocobjs
betas[[model_name]] <- as.matrix(enet_model$betas)
# see the results
enet_model$model_summary %>% t() [,1]
alpha 0.2000000
lambda 0.5017248
auc 0.5160173
auc_czech 0.5224359
auc_no 0.5000000
auc_optimism_corrected 0.5599297
auc_optimism_corrected_CIL 0.3863553
auc_optimism_corrected_CIU 0.7232201
accuracy 0.7608696
accuracy_czech NaN
accuracy_no 0.7500000
accuracy_optimism_corrected 0.7539422
accuracy_optimism_corrected_CIL 0.6525954
accuracy_optimism_corrected_CIU 0.8442308
enet_model$conf_matrices$original
0
0 105 0
1 33 0
$czech
0
0 78 0
1 24 0
$no
0
0 27 0
1 9 0
enet_model$plot`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
roc_c2.5.1.2 Saving results
models_summ_df_ileum <- do.call(rbind,
models_summ[grep(segment,names(models_summ),value = TRUE)])
write.csv(models_summ_df_ileum,file.path(path,paste0("elastic_net_",segment,".csv")))2.6 Results overview
2.6.0.1 Alpha diversity
Models
knitr::kable(pc_observed[[segment]],
digits = 3,
caption = "Results of linear model testing ASV Richness")| Estimate | Std..Error | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -12.199 | 11.972 | -1.019 | 0.310 | 0.349 | |
| non-rPSC , rPSC - CZ vs NO | -21.486 | 11.452 | -1.876 | 0.063 | 0.141 | |
| non-rPSC vs GrouprPSC:CountryNO | 9.347 | 23.088 | 0.405 | 0.686 | 0.686 | |
| healthy vs GrouprPSC | -37.741 | 12.284 | -3.072 | 0.003 | 0.025 | * |
| healthy , rPSC - CZ vs NO | 11.203 | 10.973 | 1.021 | 0.310 | 0.349 | |
| healthy vs GrouprPSC:CountryNO | -23.342 | 21.355 | -1.093 | 0.277 | 0.349 | |
| healthy vs Groupnon-rPSC | -25.542 | 9.322 | -2.740 | 0.007 | 0.031 | * |
| healthy , non-rPSC - CZ vs NO | 11.203 | 10.933 | 1.025 | 0.307 | 0.349 | |
| healthy vs Groupnon-rPSC:CountryNO | -32.689 | 15.108 | -2.164 | 0.032 | 0.096 |
knitr::kable(pc_shannon[[segment]],
digits = 3,
caption = "Results of linear model testing Shannon index")| Estimate | Std..Error | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.102 | 0.146 | -0.696 | 0.487 | 0.731 | |
| non-rPSC , rPSC - CZ vs NO | -0.349 | 0.140 | -2.500 | 0.014 | 0.123 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.006 | 0.281 | -0.021 | 0.983 | 0.983 | |
| healthy vs GrouprPSC | -0.210 | 0.140 | -1.501 | 0.137 | 0.321 | |
| healthy , rPSC - CZ vs NO | 0.005 | 0.125 | 0.043 | 0.966 | 0.983 | |
| healthy vs GrouprPSC:CountryNO | -0.360 | 0.244 | -1.478 | 0.142 | 0.321 | |
| healthy vs Groupnon-rPSC | -0.109 | 0.115 | -0.948 | 0.344 | 0.620 | |
| healthy , non-rPSC - CZ vs NO | 0.005 | 0.135 | 0.040 | 0.968 | 0.983 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.354 | 0.186 | -1.905 | 0.058 | 0.263 |
knitr::kable(pc_simpson[[segment]],
digits = 3,
caption = "Results of linear model testing Simpson index")| Estimate | Std..Error | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | 0.000 | 0.023 | 0.019 | 0.985 | 0.985 | |
| non-rPSC , rPSC - CZ vs NO | -0.034 | 0.022 | -1.573 | 0.118 | 0.531 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.029 | 0.043 | -0.661 | 0.510 | 0.767 | |
| healthy vs GrouprPSC | -0.009 | 0.018 | -0.500 | 0.618 | 0.767 | |
| healthy , rPSC - CZ vs NO | -0.008 | 0.016 | -0.501 | 0.617 | 0.767 | |
| healthy vs GrouprPSC:CountryNO | -0.055 | 0.031 | -1.779 | 0.078 | 0.531 | |
| healthy vs Groupnon-rPSC | -0.009 | 0.016 | -0.564 | 0.574 | 0.767 | |
| healthy , non-rPSC - CZ vs NO | -0.008 | 0.019 | -0.410 | 0.682 | 0.767 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.026 | 0.027 | -0.975 | 0.331 | 0.767 |
knitr::kable(pc_pielou[[segment]],
digits = 3,
caption = "Results of linear model testing Pielou index")| Estimate | Std..Error | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.004 | 0.022 | -0.184 | 0.854 | 0.946 | |
| non-rPSC , rPSC - CZ vs NO | -0.047 | 0.021 | -2.260 | 0.025 | 0.229 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.005 | 0.042 | -0.108 | 0.914 | 0.946 | |
| healthy vs GrouprPSC | -0.001 | 0.021 | -0.067 | 0.946 | 0.946 | |
| healthy , rPSC - CZ vs NO | -0.009 | 0.018 | -0.463 | 0.645 | 0.946 | |
| healthy vs GrouprPSC:CountryNO | -0.043 | 0.036 | -1.195 | 0.235 | 0.705 | |
| healthy vs Groupnon-rPSC | 0.003 | 0.018 | 0.146 | 0.884 | 0.946 | |
| healthy , non-rPSC - CZ vs NO | -0.009 | 0.021 | -0.409 | 0.683 | 0.946 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.038 | 0.029 | -1.329 | 0.185 | 0.705 |
Plots
alpha_div_plots[[paste(segment,"Country")]]2.6.0.2 Beta diversity
Main results
knitr::kable(pairwise_aitchison_raw[[paste("genus", segment)]],
digits = 3,
caption = "Results of PERMANOVA - robust aitchison distance")| pairs | Df | SumsOfSqs | F.Model | R2 | p.value | p.adjusted | sig |
|---|---|---|---|---|---|---|---|
| rPSC vs non-rPSC | 1 | 215.579 | 1.208 | 0.009 | 0.107 | 0.107 | |
| rPSC vs healthy | 1 | 561.696 | 3.345 | 0.030 | 0.001 | 0.002 | ** |
| non-rPSC vs healthy | 1 | 760.242 | 4.431 | 0.024 | 0.001 | 0.002 | ** |
| rPSC vs non-rPSC , Country | 1 | 601.929 | 3.372 | 0.024 | 0.001 | 0.001 | *** |
| rPSC vs healthy , Country | 1 | 496.841 | 2.959 | 0.027 | 0.001 | 0.001 | *** |
| non-rPSC vs healthy , Country | 1 | 624.474 | 3.640 | 0.020 | 0.001 | 0.001 | *** |
| rPSC vs non-rPSC : Country | 1 | 156.176 | 0.874 | 0.006 | 0.788 | 0.788 | |
| rPSC vs healthy : Country | 1 | 164.743 | 0.981 | 0.009 | 0.464 | 0.696 | |
| non-rPSC vs healthy : Country | 1 | 209.954 | 1.225 | 0.007 | 0.089 | 0.267 |
PCA
pca_plots_list[[paste(segment,"genus custom")]]2.6.0.3 Univariate analysis
Number of significant taxa
knitr::kable(cbind(as.data.frame(lapply(list_intersections,nrow)),
as.data.frame(lapply(rpsc_effect,nrow))) %>% t() %>%
`colnames<-`("Count") %>%
`rownames<-`(c(names(list_intersections),"PSC effect Genus")),caption="Number of significant taxa")| Count | |
|---|---|
| terminal_ileum genus non-rPSC vs rPSC | 0 |
| terminal_ileum genus healthy vs rPSC | 34 |
| terminal_ileum genus healthy vs non-rPSC | 25 |
| PSC effect Genus | 16 |
2.6.0.4 Machine learning
Main models
knitr::kable(models_summ_df_ileum %>% dplyr::select(
"alpha","lambda",
"auc_optimism_corrected",
"auc_optimism_corrected_CIL",
"auc_optimism_corrected_CIU"),
digits=2,caption="Elastic net results")| alpha | lambda | auc_optimism_corrected | auc_optimism_corrected_CIL | auc_optimism_corrected_CIU | |
|---|---|---|---|---|---|
| rPSC vs non-rPSC genus terminal_ileum | 0.4 | 0.31 | 0.57 | 0.39 | 0.73 |
| rPSC vs non-rPSC subset genus terminal_ileum | 0.2 | 0.50 | 0.56 | 0.39 | 0.72 |
ROC - Genus level
p <- roc_curve_all_custom(roc_cs[c(1:2)],Q="Q2",
model_name="enet_model",legend = FALSE)
p3 Data Analysis - Colon
segment="colon"3.1 Filtering
Rules: - sequencing depth > 10000
- nearZeroVar() with default settings
Rarefaction Curve
path="../../../intermediate_files/rarecurves"
seq_depth_threshold <- 10000ps <- construct_phyloseq(colon_asv_tab,colon_taxa_tab,colon_metadata)
rareres <- get_rarecurve(obj=ps, chunks=500)
save(rareres,file = file.path(path,"rarefaction_colon.Rdata"))load(file.path(path,"rarefaction_colon.Rdata"))
seq_depth_threshold <- 10000
prare <- ggrarecurve(obj=rareres,
factorNames="Country",
indexNames=c("Observe")) +
theme_bw()+
theme(axis.text=element_text(size=8),
panel.grid=element_blank(),
strip.background = element_rect(colour=NA,fill="grey"),
strip.text.x = element_text(face="bold")) +
geom_vline(xintercept = seq_depth_threshold,
linetype="dashed", color = "red") +
xlim(0, 20000)The color has been set automatically, you can reset it manually by adding scale_color_manual(values=yourcolors)
prareLibrary size
read_counts(colon_asv_tab, line = c(5000,10000))Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
3.1.1 Sequencing depth
data_filt <- seq_depth_filtering(colon_asv_tab,
colon_taxa_tab,
colon_metadata,
seq_depth_threshold = 10000)Removing 50 ASV(s)
filt_colon_asv_tab <- data_filt[[1]]; alpha_colon_asv_tab <- filt_colon_asv_tab
filt_colon_taxa_tab <- data_filt[[2]]; alpha_colon_taxa_tab <- filt_colon_taxa_tab
filt_colon_metadata <- data_filt[[3]]; alpha_colon_metadata <- filt_colon_metadata
seq_step <- dim(filt_colon_asv_tab)[1]Library size
read_counts(filt_colon_asv_tab,line = c(10000))Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
3.1.2 NearZeroVar
data_filt <- nearzerovar_filtering(filt_colon_asv_tab,
filt_colon_taxa_tab,
filt_colon_metadata)
filt_colon_asv_tab <- data_filt[[1]]
filt_colon_taxa_tab <- data_filt[[2]]
nearzero_step <- dim(filt_colon_asv_tab)[1]Library size
read_counts(filt_colon_asv_tab,line = c(5000,10000))Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Check zero depth
data_filt <- check_zero_depth(filt_colon_asv_tab,
filt_colon_taxa_tab,
filt_colon_metadata)
filt_colon_asv_tab <- data_filt[[1]];
filt_colon_taxa_tab <- data_filt[[2]];
filt_colon_metadata <- data_filt[[3]]; Library size
read_counts(filt_colon_asv_tab,line = c(5000,10000))Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
3.1.3 Final Counts
final_counts_filtering(colon_asv_tab,
filt_colon_asv_tab,
filt_colon_metadata,
seq_step, 0, nearzero_step) V1
Raw data: ASVs 5840
Raw data: Samples 527
Sequencing depth filt: ASVs 5790
Prevalence filt: ASVs 0
NearZeroVar filt: ASVs 444
Filt data: ASVs 444
Filt data: Samples 511
Filt data: Patients 250
Filt data: Patients.1 0
Filtered samples 16
Matrices Cecum;Rectum;CD;CA;SI
healthy 161
non-rPSC 263
rPSC 87
pre_ltx 0
post_ltx 0
ETOH 0
3.2 Alpha diversity
path = "../../../results/merged_sites/main_analysis/Q2/alpha_diversity"Calculation
# Construct MPSE object
alpha_colon_metadata$Sample <- alpha_colon_metadata$SampleID
colon_mpse <- as.MPSE(construct_phyloseq(alpha_colon_asv_tab,
alpha_colon_taxa_tab,
alpha_colon_metadata))
colon_mpse %<>% mp_rrarefy(raresize = 10000,seed = 123)
# Calculate alpha diversity - rarefied counts
colon_mpse %<>% mp_cal_alpha(.abundance=RareAbundance, force=TRUE)alpha_data <- data.frame(SampleID=colon_mpse$Sample.x,
Observe=colon_mpse$Observe,
Shannon=colon_mpse$Shannon,
Simpson=colon_mpse$Simpson,
Pielou=colon_mpse$Pielou,
Group=colon_mpse$Group,
Country=colon_mpse$Country,
Patient=colon_mpse$Patient)
write.csv(alpha_data,file.path(path,paste0("alpha_indices_",segment,".csv")),
row.names = FALSE)3.2.1 Plots
p_boxplot_alpha <- alpha_diversity_countries(alpha_data)Using SampleID, Group, Country, Patient as id variables
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Using SampleID, Group, Country, Patient as id variables
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
# save the results
alpha_div_plots[[paste(segment,"Country")]] <- p_boxplot_alpha
# see the results
p_boxplot_alpha3.2.2 Linear Models
path = "../../../results/merged_sites/main_analysis/Q2/alpha_diversity"
alpha_data <- read.csv(file.path(path,paste0("alpha_indices_",segment,".csv")))Richness
results_model <- pairwise.lmer(
formula = "Observe ~ Group * Country + (1|Patient)",
factors=alpha_data$Group,
data=alpha_data)
# check interaction
if (!is.data.frame(results_model)){
results_model_observe <- results_model[[1]]
results_model_observe_detailed <- results_model[[2]]
} else {
results_model_observe <- results_model
results_model_observe_detailed <- NA
}
# save the results
pc_observed[[segment]] <- results_model_observe# see the results
knitr::kable(results_model_observe,digits = 3,
caption = "Raw results of linear model of richness estimation.")| Estimate | Std..Error | df | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -17.252 | 12.469 | 156.883 | -1.384 | 0.168 | 0.192 | |
| non-rPSC vs rPSC - CZ vs NO | -17.941 | 10.644 | 153.303 | -1.686 | 0.094 | 0.192 | |
| non-rPSC vs GrouprPSC:CountryNO | 3.483 | 21.371 | 152.108 | 0.163 | 0.871 | 0.871 | |
| healthy vs GrouprPSC | -40.307 | 11.924 | 125.516 | -3.380 | 0.001 | 0.009 | ** |
| healthy vs rPSC - CZ vs NO | 14.783 | 10.544 | 126.613 | 1.402 | 0.163 | 0.192 | |
| healthy vs GrouprPSC:CountryNO | -29.232 | 19.769 | 123.028 | -1.479 | 0.142 | 0.192 | |
| healthy vs Groupnon-rPSC | -22.976 | 8.905 | 211.354 | -2.580 | 0.011 | 0.047 | * |
| healthy vs non-rPSC - CZ vs NO | 14.961 | 10.892 | 212.956 | 1.374 | 0.171 | 0.192 | |
| healthy vs Groupnon-rPSC:CountryNO | -32.911 | 14.701 | 208.353 | -2.239 | 0.026 | 0.079 |
knitr::kable(results_model_observe_detailed,digits = 3,
caption = "Raw results of independent country analysis")
|
Shannon
results_model <- pairwise.lmer(
formula = "Shannon ~ Group * Country + (1|Patient)",
factors=alpha_data$Group,
data=alpha_data)
# check interaction
if (!is.data.frame(results_model)){
results_model_shannon <- results_model[[1]]
results_model_shannon_detailed <- results_model[[2]]
} else {
results_model_shannon <- results_model
results_model_shannon_detailed <- NA
}
# save the results
pc_shannon[[segment]] <- as.data.frame(results_model_shannon)# see the results
knitr::kable(results_model_shannon,digits = 3,
caption = "Raw results of linear model of Shannon estimation.")| Estimate | Std..Error | df | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.146 | 0.161 | 154.485 | -0.907 | 0.366 | 0.549 | |
| non-rPSC vs rPSC - CZ vs NO | -0.355 | 0.138 | 151.788 | -2.581 | 0.011 | 0.051 | |
| non-rPSC vs GrouprPSC:CountryNO | 0.008 | 0.277 | 150.856 | 0.030 | 0.976 | 0.976 | |
| healthy vs GrouprPSC | -0.330 | 0.128 | 126.201 | -2.572 | 0.011 | 0.051 | |
| healthy vs rPSC - CZ vs NO | 0.017 | 0.113 | 127.217 | 0.146 | 0.884 | 0.976 | |
| healthy vs GrouprPSC:CountryNO | -0.364 | 0.213 | 123.907 | -1.709 | 0.090 | 0.162 | |
| healthy vs Groupnon-rPSC | -0.182 | 0.101 | 207.051 | -1.795 | 0.074 | 0.162 | |
| healthy vs non-rPSC - CZ vs NO | 0.018 | 0.124 | 208.617 | 0.145 | 0.884 | 0.976 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.375 | 0.168 | 204.129 | -2.240 | 0.026 | 0.078 |
knitr::kable(results_model_shannon_detailed,digits = 3,
caption = "Raw results of independent country analysis")
|
|
Simpson
results_model <- pairwise.lmer(
formula = "Simpson ~ Group * Country + (1|Patient)",
factors=alpha_data$Group,
data=alpha_data)
# check interaction
if (!is.data.frame(results_model)){
results_model_simpson <- results_model[[1]]
results_model_simpson_detailed <- results_model[[2]]
} else {
results_model_simpson <- results_model
results_model_simpson_detailed <- NA
}
# save the results
pc_simpson[[segment]] <- as.data.frame(results_model_simpson)# see the results
knitr::kable(results_model_simpson,digits = 3,
caption = "Raw results of linear model of Simpson estimation.")| Estimate | Std..Error | df | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.005 | 0.029 | 153.608 | -0.169 | 0.866 | 0.866 | |
| non-rPSC vs rPSC - CZ vs NO | -0.032 | 0.024 | 151.251 | -1.296 | 0.197 | 0.443 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.027 | 0.049 | 150.427 | -0.554 | 0.581 | 0.806 | |
| healthy vs GrouprPSC | -0.031 | 0.020 | 127.660 | -1.523 | 0.130 | 0.391 | |
| healthy vs rPSC - CZ vs NO | -0.007 | 0.018 | 128.411 | -0.402 | 0.688 | 0.806 | |
| healthy vs GrouprPSC:CountryNO | -0.052 | 0.034 | 125.982 | -1.540 | 0.126 | 0.391 | |
| healthy vs Groupnon-rPSC | -0.026 | 0.016 | 201.924 | -1.600 | 0.111 | 0.391 | |
| healthy vs non-rPSC - CZ vs NO | -0.007 | 0.020 | 203.469 | -0.363 | 0.717 | 0.806 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.025 | 0.026 | 199.046 | -0.952 | 0.342 | 0.616 |
knitr::kable(results_model_simpson_detailed,digits = 3,
caption = "Raw results of independent country analysis")Table: Raw results of independent country analysis
Pielou
results_model <- pairwise.lmer(
formula = "Pielou ~ Group * Country + (1|Patient)",
factors=alpha_data$Group,
data=alpha_data)
# check interaction
if (!is.data.frame(results_model)){
results_model_pielou <- results_model[[1]]
results_model_pielou_detailed <- results_model[[2]]
} else {
results_model_pielou <- results_model
results_model_pielou_detailed <- NA
}
# save the results
pc_pielou[[segment]] <- as.data.frame(results_model_pielou)# see the results
knitr::kable(results_model_pielou,digits = 3,
caption = "Raw results of linear model of Pielou estimation.")| Estimate | Std..Error | df | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.007 | 0.026 | 153.863 | -0.292 | 0.771 | 0.843 | |
| non-rPSC vs rPSC - CZ vs NO | -0.052 | 0.022 | 150.735 | -2.372 | 0.019 | 0.171 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.009 | 0.044 | 149.674 | -0.198 | 0.843 | 0.843 | |
| healthy vs GrouprPSC | -0.023 | 0.020 | 127.665 | -1.183 | 0.239 | 0.538 | |
| healthy vs rPSC - CZ vs NO | -0.008 | 0.017 | 129.013 | -0.457 | 0.648 | 0.843 | |
| healthy vs GrouprPSC:CountryNO | -0.052 | 0.033 | 124.565 | -1.605 | 0.111 | 0.333 | |
| healthy vs Groupnon-rPSC | -0.015 | 0.015 | 204.651 | -1.006 | 0.316 | 0.568 | |
| healthy vs non-rPSC - CZ vs NO | -0.008 | 0.019 | 206.473 | -0.416 | 0.678 | 0.843 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.044 | 0.025 | 201.171 | -1.745 | 0.083 | 0.333 |
knitr::kable(results_model_pielou_detailed,digits = 3,
caption = "Raw results of independent country analysis")
|
3.2.3 Saving results
alpha_list <- list(
Richness=pc_observed[[segment]] %>% rownames_to_column("Comparison"),
Shannon=pc_shannon[[segment]] %>% rownames_to_column("Comparison"),
Simpson=pc_simpson[[segment]] %>% rownames_to_column("Comparison"),
Pielou=pc_pielou[[segment]] %>% rownames_to_column("Comparison"))
write.xlsx(alpha_list,
file = file.path(path,paste0("alpha_diversity_results_",segment,".xlsx")))3.3 Beta diversity
Calculating Aitchison distance (euclidean distance on clr-transformed data), both at ASV and genus level.
3.3.1 Main analysis
Genus level, Aitchison distance
level="genus"
# needed path
path = "../../../results/merged_sites/main_analysis/Q2/beta_diversity"Aggregation, filtering
genus_data <- aggregate_taxa(colon_asv_tab,
colon_taxa_tab,
taxonomic_level=level,
names=TRUE)
filt_data <- filtering_steps(genus_data[[1]],
genus_data[[2]],
colon_metadata,
seq_depth_threshold=10000)Removing 5 ASV(s)
filt_colon_genus_tab <- filt_data[[1]]
filt_colon_genus_taxa <- filt_data[[2]]
filt_colon_genus_metadata <- filt_data[[3]]3.3.1.0.1 PERMANOVA
pairwise_df <- filt_colon_genus_tab %>% column_to_rownames("SeqID") %>% t()
# main effect
pp_main <- pairwise.adonis(pairwise_df,filt_colon_genus_metadata$Group,
covariate = filt_colon_genus_metadata$Country,
patients = filt_colon_genus_metadata$Patient,
sim.method = "robust.aitchison", p.adjust.m="BH")
# interaction
pp_int <- pairwise.adonis(pairwise_df,filt_colon_genus_metadata$Group,
covariate = filt_colon_genus_metadata$Country,
interaction = TRUE,
patients = filt_colon_genus_metadata$Patient,
sim.method = "robust.aitchison", p.adjust.m="BH")
# tidy the results
pp_factor <- pp_main[[1]]
pp_cov <- pp_main[[2]]
pp_fac.cov <- pp_int[[3]]
cols <- c("pairs","Df","SumsOfSqs", "F.Model","R2","p.value", "p.adjusted", "sig")
colnames(pp_factor) <- cols; colnames(pp_cov) <- cols; colnames(pp_fac.cov) <- cols;
# save raw results
pairwise_aitchison_raw[[paste(level, segment)]] <-rbind(pp_factor,pp_cov,pp_fac.cov)# see the results
knitr::kable(pp_factor,digits = 3,caption = "PERMANOVA, GROUP separation")| pairs | Df | SumsOfSqs | F.Model | R2 | p.value | p.adjusted | sig |
|---|---|---|---|---|---|---|---|
| rPSC vs non-rPSC | 1 | 419.973 | 2.475 | 0.007 | 0.415 | 0.415 | |
| rPSC vs healthy | 1 | 1473.966 | 8.988 | 0.035 | 0.001 | 0.002 | ** |
| non-rPSC vs healthy | 1 | 1742.331 | 10.401 | 0.024 | 0.001 | 0.002 | ** |
knitr::kable(pp_cov,digits = 3,caption = "PERMANOVA, COUNTRY separation")| pairs | Df | SumsOfSqs | F.Model | R2 | p.value | p.adjusted | sig |
|---|---|---|---|---|---|---|---|
| rPSC vs non-rPSC , Country | 1 | 1556.933 | 9.177 | 0.026 | 0.001 | 0.001 | *** |
| rPSC vs healthy , Country | 1 | 883.284 | 5.386 | 0.021 | 0.001 | 0.001 | *** |
| non-rPSC vs healthy , Country | 1 | 1558.672 | 9.304 | 0.021 | 0.001 | 0.001 | *** |
knitr::kable(pp_fac.cov,digits = 3,caption = "PERMANOVA, INTERACTION GROUP:Country")| pairs | Df | SumsOfSqs | F.Model | R2 | p.value | p.adjusted | sig |
|---|---|---|---|---|---|---|---|
| rPSC vs non-rPSC : Country | 1 | 318.175 | 1.880 | 0.005 | 0.970 | 0.970 | |
| rPSC vs healthy : Country | 1 | 319.594 | 1.956 | 0.008 | 0.820 | 0.970 | |
| non-rPSC vs healthy : Country | 1 | 407.191 | 2.439 | 0.006 | 0.163 | 0.489 |
Interaction check
interaction_sig <- pp_fac.cov$pairs[pp_fac.cov$p.adjusted < 0.05]
if (length(interaction_sig)>0){
for (i in 1:length(interaction_sig)){
group1 <- unlist(strsplit(interaction_sig[i],split = " vs "))[1]
group2 <- unlist(strsplit(interaction_sig[i],split = " vs "))[2]
group2 <- unlist(strsplit(group2,split = " : "))[1]
result_list <- adonis_postanalysis(x=pairwise_df,
factors = filt_colon_genus_metadata$Group,
covariate = filt_colon_genus_metadata$Country,
group1 = group1,
group2 = group2,
patients = filt_colon_genus_metadata$Patient)
print(result_list)
}
}3.3.1.0.2 Plots
p <- pca_plot_custom(filt_colon_genus_tab,
filt_colon_genus_taxa,
filt_colon_genus_metadata,
show_boxplots = TRUE,
variable = "Group", size=2,
show_legend=FALSE)
# save the results
pca_plots_list[[paste(segment,level,"custom")]] <- p
# see the results
p3.3.1.1 Saving results
write.xlsx(pairwise_aitchison_raw[[paste(level, segment)]],
file = file.path(path,
paste0("beta_diversity_results_", segment,".xlsx")))3.4 Univariate Analysis
3.4.1 Main - Genus level
level="genus"
# needed paths
path = "../../../results/merged_sites/main_analysis/Q2/univariate_analysis"
path_maaslin=file.path("../../../intermediate_files/maaslin/Q2",level)
# variables
raw_linda_results_genus[[segment]] <- list()
linda_results_genus[[segment]] <- list()
# country and interaction problems
list_country_union <- list()
list_venns <- list()
# workbook for final df
wb <- createWorkbook()
# PSC effect
rpsc_effect <- list()3.4.1.1 Genus level
level="genus"Aggregate taxa
genus_data <- aggregate_taxa(colon_asv_tab,
colon_taxa_tab,
taxonomic_level = level)
colon_genus_tab <- genus_data[[1]]
colon_genus_taxa_tab <- genus_data[[2]]
colon_genus_asv_taxa_tab <- create_asv_taxa_table(colon_genus_tab,
colon_genus_taxa_tab)3.4.1.1.1 rPSC vs non-rPSC
3.4.1.1.1.1 linDA
group <- c("non-rPSC","rPSC")
comparison_name <- paste0(group[1], " vs ",group[2])taxa_to_test <- read.xlsx("../../../results/merged_sites/main_analysis/Q1/univariate_analysis/significant_taxa_colon.xlsx", sheet=" genus healthy vs pre_ltx")$SeqID
colon_genus_tab_2 <- colon_genus_tab %>% dplyr::filter(SeqID %in% taxa_to_test)
colon_genus_taxa_tab_2 <- colon_genus_taxa_tab %>% dplyr::filter(SeqID %in% taxa_to_test)
# prepare the data
linda_data <- binomial_prep(colon_genus_tab_2,
colon_genus_taxa_tab_2,
colon_metadata,
group, usage="linDA",filtering = FALSE)
filt_colon_uni_data <- linda_data[[1]]
filt_colon_uni_taxa <- linda_data[[2]]
filt_colon_uni_metadata <- linda_data[[3]]
# fit the model
linda.obj <- linda(filt_colon_uni_data,
filt_colon_uni_metadata,
formula = '~ Group * Country + (1|Patient)')0 features are filtered!
The filtered data has 231 samples and 33 features will be tested!
Imputation approach is used.
Fit linear mixed effects models ...
Completed.
linda.output <- linda.obj$output
linda.output <- linda_renaming(linda.output, group)
# save the results
group1 <- paste0(group[1], " vs ","Group",group[2])
group2 <- paste0(group[1], " , ",group[2], " - CZ vs NO")
group3 <- paste0(group[1], " vs ","Group",group[2], ":CountryNO")
for (grp in c(group1,group2,group3)){
raw_linda_results_genus[[segment]][[grp]] <-
rawlinda.df(linda.output,
grp,
filt_colon_uni_data,
filt_colon_uni_taxa)
linda_results_genus[[segment]][[grp]] <-
linda.df(linda.output,
grp,
filt_colon_uni_data,
filt_colon_uni_taxa)
}# volcano plot
volcano_1 <- volcano_plot_linda(linda.output, group1,
taxa_table = filt_colon_uni_taxa) +
ggtitle(paste(group,collapse=" vs "))
volcano_2 <- volcano_plot_linda(linda.output, group2,
taxa_table = filt_colon_uni_taxa) +
ggtitle("NO vs CZ")
volcano_3 <- volcano_plot_linda(linda.output, group3,
taxa_table = filt_colon_uni_taxa) +
ggtitle("Interaction effect")
volcano <- ggarrange(volcano_1,volcano_2,volcano_3, ncol=3)
# see the plot
volcano3.4.1.1.1.2 MaAsLin2
volcano1 <- volcano_plot_maaslin(fit_data,filt_colon_uni_taxa) +
ggtitle(paste(group[1], "vs", group[2]))
volcano2 <- volcano_plot_maaslin(fit_data,filt_colon_uni_taxa,variable="Country") +
ggtitle("NO vs CZ")
volcano <- ggarrange(volcano1,volcano2, ncol=2)
volcano3.4.1.1.1.3 Group - Intersection
intersection_results <- group_intersection(group, list_intersections, list_venns,
linda.output, fit_data,
raw_linda_results_genus,
segment = segment,
level=level)
list_intersections <- intersection_results[[1]]
list_venns <- intersection_results[[2]]
venn <- intersection_results[[3]]
# show the results
venn3.4.1.1.1.4 Interaction effect
list_interaction_significant <- country_interaction(group,
linda.output,
list_intersections,
filt_colon_uni_data,
filt_colon_uni_metadata,
segment=segment,
level=level)
# see the result
## significant interaction effect
list_interaction_significant[[1]]NULL
## results for czech cohort
list_interaction_significant[[2]][1] NA
## results for norwegian cohort
list_interaction_significant[[3]][1] NA
Removing problematic taxa
list_intersections <- removing_interaction_problems(group,
list_interaction_significant,
list_intersections,
segment=segment,
level=level)3.4.1.1.2 Basic statistics
uni_df <- merge(basic_univariate_statistics(linda_data,group),
raw_linda_results_genus[[segment]][[group1]],
by="SeqID",all=TRUE)[1] "non-rPSC"
[1] "rPSC"
[1] "non-rPSC"
[1] "rPSC"
uni_df[["final_sig"]] <- uni_df$SeqID %in% list_intersections[[paste(segment,level,comparison_name)]][["SeqID"]]
uni_statistics[[segment]][[paste(level,comparison_name)]] <- uni_df
# for comparison
new_name <- comparison_name
addWorksheet(wb, sheetName = new_name)
writeData(wb, sheet = new_name, uni_df, rowNames=FALSE)3.4.1.1.3 Prevalence testing
uni_df$prevalence_diff <- abs(uni_df$`PREVALENCE_ rPSC` - uni_df$`PREVALENCE_ non-rPSC`)
#uni_df %<>% dplyr::filter(prevalence_diff >=0.2)
#filt_colon_uni_data <- filt_colon_uni_data[uni_df$SeqID,]
filt_colon_uni_data <- ifelse(filt_colon_uni_data>0,"YES","NO")
filt_colon_uni_data %<>% t() %>% as.data.frame()
#filt_colon_uni_taxa %<>% dplyr::filter(SeqID %in% uni_df$SeqID)
filt_colon_uni_data <- merge(filt_colon_uni_data %>% rownames_to_column("SampleID"),filt_colon_metadata,by="SampleID",all.x=TRUE) %>% column_to_rownames("SampleID") %>% distinct(Patient, .keep_all = TRUE)
p_values <- c()
for (taxon in uni_df$SeqID){
data <- filt_colon_uni_data %>%
group_by(!!sym(taxon), Group) %>%
count() %>%
group_by(!!sym(taxon), Group) %>%
summarise(n = sum(n), .groups = "drop") %>%
pivot_wider(names_from = Group, values_from = n) %>%
column_to_rownames(taxon) %>%
as.matrix()
data
chi_res <- chisq.test(data)
chi_res$expected
#p_values <- c(p_values,chi_res$p.value)
test <- fisher.test(data)
test$p.value
p_values <- c(p_values,test$p.value)
}Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
p_values_adjusted <- p.adjust(p_values,method = "BH")
names(p_values_adjusted) <- uni_df$SeqID
names(p_values) <- uni_df$SeqID
p_values_adjusted[which(p_values_adjusted<0.05)] Bilophila
0.01992028
asv_tab <- as.data.frame(rrarefy(
linda_data[[1]] %>% t(),
sample=10000)) %>% t() %>% as.data.frame()
asv_tab <- ifelse(asv_tab>10,"YES","NO")
asv_tab %<>% t() %>% as.data.frame()
set.seed(123)
asv_tab <- merge(asv_tab %>% rownames_to_column("SampleID"),filt_colon_metadata,by="SampleID",all.x=TRUE) %>% column_to_rownames("SampleID") %>% distinct(Patient, .keep_all = TRUE)
p_values <- c()
for (taxon in uni_df$SeqID){
data <- asv_tab %>%
group_by(!!sym(taxon), Group) %>%
count() %>%
group_by(!!sym(taxon), Group) %>%
summarise(n = sum(n), .groups = "drop") %>%
pivot_wider(names_from = Group, values_from = n) %>%
column_to_rownames(taxon) %>%
as.matrix()
data2 <- asv_tab %>%
group_by(!!sym(taxon), Group,Country) %>%
count()
data[is.na(data)] <- 0
chi_res <- chisq.test(data)
chi_res$expected
test <- fisher.test(data)
test$p.value
chi_res$p.value
p_values <- c(p_values,test$p.value)
}Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
Warning in chisq.test(data): Chi-squared approximation may be incorrect
p_values_adjusted <- p.adjust(p_values,method = "BH")
names(p_values_adjusted) <- uni_df$SeqID
names(p_values) <- uni_df$SeqID
p_values_adjusted[which(p_values_adjusted<0.05)] Bilophila
0.04557538
df <- data.frame(SeqID=uni_df$SeqID,
p=p_values,
p_adj=p_values_adjusted)prevalence_df <- uni_df[,c("SeqID","PREVALENCE_ non-rPSC","PREVALENCE_ rPSC")]
prevalence_df <- merge(prevalence_df,df,by="SeqID",all=TRUE)
write.xlsx(prevalence_df,file.path(path,paste0("supplements_prevalence_",segment,".xlsx")))3.4.1.2 rPSC vs healthy
group <- c("healthy","rPSC")
comparison_name <- paste0(group[1], " vs ",group[2])3.4.1.2.1 linDA
# prepare the data
linda_data <- binomial_prep(colon_genus_tab,
colon_genus_taxa_tab,
colon_metadata,
group, usage="linDA")Removing 98 ASV(s)
filt_colon_uni_data <- linda_data[[1]]
filt_colon_uni_taxa <- linda_data[[2]]
filt_colon_uni_metadata <- linda_data[[3]]
# fit the model
linda.obj <- linda(filt_colon_uni_data,
filt_colon_uni_metadata,
formula = '~ Group * Country + (1|Patient)')0 features are filtered!
The filtered data has 248 samples and 178 features will be tested!
Imputation approach is used.
Fit linear mixed effects models ...
Completed.
linda.output <- linda.obj$output
linda.output <- linda_renaming(linda.output, group)
# save the results
group1 <- paste0(group[1], " vs ","Group",group[2])
group2 <- paste0(group[1], " , ",group[2], " - CZ vs NO")
group3 <- paste0(group[1], " vs ","Group",group[2], ":CountryNO")
for (grp in c(group1,group2,group3)){
raw_linda_results_genus[[segment]][[grp]] <-
rawlinda.df(linda.output,
grp,
filt_colon_uni_data,
filt_colon_uni_taxa)
linda_results_genus[[segment]][[grp]] <-
linda.df(linda.output,
grp,
filt_colon_uni_data,
filt_colon_uni_taxa)
}# volcano plot
volcano_1 <- volcano_plot_linda(linda.output, group1,
taxa_table = filt_colon_uni_taxa) +
ggtitle(paste(group,collapse=" vs "))
volcano_2 <- volcano_plot_linda(linda.output, group2,
taxa_table = filt_colon_uni_taxa) +
ggtitle("NO vs CZ")
volcano_3 <- volcano_plot_linda(linda.output, group3,
taxa_table = filt_colon_uni_taxa) +
ggtitle("Interaction effect")
volcano <- ggarrange(volcano_1,volcano_2,volcano_3, ncol=3)
# see the plot
volcano3.4.1.2.1.1 MaAsLin2
volcano1 <- volcano_plot_maaslin(fit_data,filt_colon_uni_taxa) +
ggtitle(paste(group[1], "vs", group[2]))
volcano2 <- volcano_plot_maaslin(fit_data,filt_colon_uni_taxa,variable="Country") +
ggtitle("NO vs CZ")
volcano <- ggarrange(volcano1,volcano2, ncol=2)
volcano3.4.1.2.1.2 Group - Intersection
intersection_results <- group_intersection(group,
list_intersections,
list_venns,
linda.output, fit_data,
raw_linda_results_genus,
segment = segment,
level=level)
list_intersections <- intersection_results[[1]]
list_venns <- intersection_results[[2]]
venn <- intersection_results[[3]]
# show the results
venn3.4.1.2.1.3 Interaction effect
list_interaction_significant <- country_interaction(group,
linda.output,
list_intersections,
filt_colon_uni_data,
filt_colon_uni_metadata,
segment=segment,
level=level)
# see the result
## significant interaction effect
list_interaction_significant[[1]]NULL
## results for czech cohort
list_interaction_significant[[2]][1] NA
## results for norwegian cohort
list_interaction_significant[[3]][1] NA
Removing problematic taxa
list_intersections <- removing_interaction_problems(group,
list_interaction_significant,
list_intersections,
segment=segment,
level=level)3.4.1.2.2 Basic statistics
uni_df <- merge(basic_univariate_statistics(linda_data,group),
raw_linda_results_genus[[segment]][[group1]],
by="SeqID",all=TRUE)[1] "healthy"
[1] "rPSC"
[1] "healthy"
[1] "rPSC"
uni_df[["final_sig"]] <- uni_df$SeqID %in% list_intersections[[paste(segment,level,comparison_name)]][["SeqID"]]
uni_statistics[[segment]][[paste(level,comparison_name)]] <- uni_df
# for comparison
new_name <- comparison_name
addWorksheet(wb, sheetName = new_name)
writeData(wb, sheet = new_name, uni_df, rowNames=FALSE)3.4.1.2.3 non-rPSC vs healthy
group <- c("healthy","non-rPSC")
comparison_name <- paste0(group[1], " vs ",group[2])3.4.1.2.3.1 linDA
# prepare the data
linda_data <- binomial_prep(colon_genus_tab,
colon_genus_taxa_tab,
colon_metadata,group,
usage="linDA")Removing 39 ASV(s)
Removing 6 ASV(s)
filt_colon_uni_data <- linda_data[[1]]
filt_colon_uni_taxa <- linda_data[[2]]
filt_colon_uni_metadata <- linda_data[[3]]
# fit the model
linda.obj <- linda(filt_colon_uni_data,
filt_colon_uni_metadata,
formula = '~ Group * Country + (1|Patient)')0 features are filtered!
The filtered data has 424 samples and 159 features will be tested!
Imputation approach is used.
Fit linear mixed effects models ...
Completed.
linda.output <- linda.obj$output
linda.output <- linda_renaming(linda.output, group)
# save the results
group1 <- paste0(group[1], " vs ","Group",group[2])
group2 <- paste0(group[1], " , ",group[2], " - CZ vs NO")
group3 <- paste0(group[1], " vs ","Group",group[2], ":CountryNO")
for (grp in c(group1,group2,group3)){
raw_linda_results_genus[[segment]][[grp]] <-
rawlinda.df(linda.output,
grp,
filt_colon_uni_data,
filt_colon_uni_taxa)
linda_results_genus[[segment]][[grp]] <-
linda.df(linda.output,
grp,
filt_colon_uni_data,
filt_colon_uni_taxa)
}# volcano plot
volcano_1 <- volcano_plot_linda(linda.output, group1,
taxa_table = filt_colon_uni_taxa) +
ggtitle(paste(group,collapse=" vs "))
volcano_2 <- volcano_plot_linda(linda.output, group2,
taxa_table = filt_colon_uni_taxa) +
ggtitle("NO vs CZ")
volcano_3 <- volcano_plot_linda(linda.output, group3,
taxa_table = filt_colon_uni_taxa) +
ggtitle("Interaction effect")
volcano <- ggarrange(volcano_1,volcano_2,volcano_3, ncol=3)
# see the plot
volcano3.4.1.2.3.2 MaAsLin2
volcano1 <- volcano_plot_maaslin(fit_data,filt_colon_uni_taxa) +
ggtitle(paste(group[1], "vs", group[2]))
volcano2 <- volcano_plot_maaslin(fit_data,filt_colon_uni_taxa,variable="Country") +
ggtitle("NO vs CZ")
volcano <- ggarrange(volcano1,volcano2, ncol=2)
volcano3.4.1.2.3.3 Group - Intersection
intersection_results <- group_intersection(group, list_intersections, list_venns,
linda.output, fit_data,
raw_linda_results_genus,
segment = segment,
level=level)
list_intersections <- intersection_results[[1]]
list_venns <- intersection_results[[2]]
venn <- intersection_results[[3]]
# show the results
venn3.4.1.2.3.4 Interaction effect
list_interaction_significant <- country_interaction(group,
linda.output,
list_intersections,
filt_colon_uni_data,
filt_colon_uni_metadata,
segment=segment,
level=level)
# see the result
## significant interaction effect
list_interaction_significant[[1]]NULL
## results for czech cohort
list_interaction_significant[[2]][1] NA
## results for norwegian cohort
list_interaction_significant[[3]][1] NA
Removing problematic taxa
list_intersections <- removing_interaction_problems(group,
list_interaction_significant,
list_intersections,
segment=segment,
level=level)3.4.1.2.4 Basic statistics
uni_df <- merge(basic_univariate_statistics(linda_data,group),
raw_linda_results_genus[[segment]][[group1]],
by="SeqID",all=TRUE)[1] "healthy"
[1] "non-rPSC"
[1] "healthy"
[1] "non-rPSC"
uni_df[["final_sig"]] <- uni_df$SeqID %in% list_intersections[[paste(segment,level,comparison_name)]][["SeqID"]]
uni_statistics[[segment]][[paste(level,comparison_name)]] <- uni_df
# for comparison
new_name <- comparison_name
addWorksheet(wb, sheetName = new_name)
writeData(wb, sheet = new_name, uni_df, rowNames=FALSE)3.4.1.2.5 Visualization
Heatmap visualizing the linDA’s logFoldChange for taxa with p < 0.1.
list_heatmap <- list_intersections[grep(paste(segment,level),
names(list_intersections),value=TRUE)]
p_heatmap_linda <- heatmap_linda(list_heatmap,colon_taxa_tab)
p_heatmap_lindaDot heatmap
dotheatmap_linda <- dot_heatmap_linda(list_heatmap,
uni_statistics$colon[grepl(level,names(uni_statistics$colon))],
colon_taxa_tab) + xlab("") + ylab("")min_clr -2.080021
max_clr 6.901023
min_log -4.181218
max_log 3.872081
dotheatmap_lindaHorizontal bar plot
p_prevalence <- horizontal_barplot(wb,taxa=levels(dotheatmap_linda$data$SeqID))Using SeqID as id variables
p_prevalence_final <- ggarrange(p_prevalence,
ggplot() + theme_minimal(),
nrow = 2,heights = c(1,0.085))
p <- ggarrange(dotheatmap_linda + theme(legend.position = "none"),p_prevalence_final,ncol=2,widths = c(1,0.3))
pdot_heatmap_colon <- p3.4.1.3 Saving results
# ALL DATA
saveWorkbook(wb,file.path(path,paste0("uni_analysis_wb_",segment,".xlsx")),
overwrite = TRUE)
# PSC effect
write.xlsx(rpsc_effect[[paste(segment,level)]],file.path(path,paste0("rpsc_effect_",segment,".xlsx")))
# SIGNIFICANT taxa
write.xlsx(list_intersections[grepl(segment,names(list_intersections))] %>%
`names<-`(gsub(segment, "", names(
list_intersections[grepl(segment,names(list_intersections))]))),
file.path(path,paste0("significant_taxa_",segment,".xlsx")))3.5 Machine learning
path = "../../../results/merged_sites/main_analysis/Q2/models"3.5.1 ElasticNet
model="enet"3.5.1.1 Genus level
level="genus"Aggregate taxa
genus_data <- aggregate_taxa(colon_asv_tab,
colon_taxa_tab,
taxonomic_level = level)
colon_genus_tab <- genus_data[[1]]
colon_genus_taxa_tab <- genus_data[[2]]3.5.1.1.1 rPSC vs non-rPSC - all
group <- c("rPSC","non-rPSC")
comparison_name <- paste0(group[1], " vs ",group[2])model_name <- paste(comparison_name,level,segment)
# prepare the data
filt_colon_uni_data <- binomial_prep(colon_genus_tab,
colon_genus_taxa_tab,
colon_metadata,
group,
usage="ml_clr",
patient = TRUE)Removing 28 ASV(s)
Removing 5 ASV(s)
# fit the model
enet_model <- glmnet_binomial(filt_colon_uni_data,
sample_method = "atypboot",
outcome="Group",
N=500,
clust_var="Patient",
reuse=TRUE,
file=model_name,
Q="Q2")
# ROC
roc_c <- roc_curve(enet_model, group)
# save the results
models_summ[[model_name]] <- enet_model$model_summary
models_cm[[model_name]] <- enet_model$conf_matrices$original
roc_cs[[model_name]] <- enet_model$kfold_rocobjs
betas[[model_name]] <- as.matrix(enet_model$betas)
# see the results
enet_model$model_summary %>% t() [,1]
alpha 0.00000000
lambda 0.03550553
auc 0.99707181
auc_czech 0.99495342
auc_no 1.00000000
auc_optimism_corrected 0.62839790
auc_optimism_corrected_CIL 0.49485088
auc_optimism_corrected_CIU 0.76009870
accuracy 0.98571429
accuracy_czech NaN
accuracy_no 0.98581560
accuracy_optimism_corrected 0.70988684
accuracy_optimism_corrected_CIL 0.61932014
accuracy_optimism_corrected_CIU 0.80479243
enet_model$conf_matrices$original
Predicted
True 0 1
0 262 1
1 4 83
$czech
Predicted
True 0 1
0 160 1
1 2 46
$no
Predicted
True 0 1
0 102 0
1 2 37
enet_model$plot`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
roc_c3.5.1.1.2 rPSC vs non-rPSC - subset
group <- c("rPSC","non-rPSC")
comparison_name <- paste0(group[1], " vs ",group[2]," subset")taxa_to_test <- read.xlsx("../../../results/merged_sites/main_analysis/Q1/univariate_analysis/significant_taxa_colon.xlsx", sheet=" genus healthy vs pre_ltx")
model_name <- paste(comparison_name,level,segment)
# prepare the data
filt_colon_uni_data <- binomial_prep_psc_effect(colon_genus_tab,
colon_genus_taxa_tab,
colon_metadata,
taxa_to_test,
usage="ml_clr",patient = TRUE)Removing 28 ASV(s)
Removing 5 ASV(s)
# fit the model
enet_model <- glmnet_binomial(filt_colon_uni_data,
sample_method = "atypboot",
outcome="Group",
N=500,
reuse=TRUE,clust_var = "Patient",
file=model_name,
Q="Q2")
# ROC
roc_c <- roc_curve(enet_model, group)
# save the results
models_summ[[model_name]] <- enet_model$model_summary
models_cm[[model_name]] <- enet_model$conf_matrices$original
roc_cs[[model_name]] <- enet_model$kfold_rocobjs
betas[[model_name]] <- as.matrix(enet_model$betas)
# see the results
enet_model$model_summary %>% t() [,1]
alpha 0.80000000
lambda 0.03321672
auc 0.83344259
auc_czech 0.78312629
auc_no 0.92383107
auc_optimism_corrected 0.65974941
auc_optimism_corrected_CIL 0.49897616
auc_optimism_corrected_CIU 0.79007241
accuracy 0.79714286
accuracy_czech NaN
accuracy_no 0.82269504
accuracy_optimism_corrected 0.73877807
accuracy_optimism_corrected_CIL 0.63177895
accuracy_optimism_corrected_CIU 0.82893728
enet_model$conf_matrices$original
Predicted
True 0 1
0 258 5
1 66 21
$czech
Predicted
True 0 1
0 156 5
1 41 7
$no
Predicted
True 0 1
0 102 0
1 25 14
enet_model$plot`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
roc_c3.5.1.2 Saving results
models_summ_df_colon <- do.call(rbind,
models_summ[grep(segment,names(models_summ),value = TRUE)])
write.csv(models_summ_df_colon,file.path(path,paste0("elastic_net_",segment,".csv")))3.6 Results overview
3.6.0.1 Alpha diversity
Linear models
knitr::kable(pc_observed[[segment]],
digits = 3,
caption = "Results of linear model testing ASV Richness")| Estimate | Std..Error | df | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -17.252 | 12.469 | 156.883 | -1.384 | 0.168 | 0.192 | |
| non-rPSC vs rPSC - CZ vs NO | -17.941 | 10.644 | 153.303 | -1.686 | 0.094 | 0.192 | |
| non-rPSC vs GrouprPSC:CountryNO | 3.483 | 21.371 | 152.108 | 0.163 | 0.871 | 0.871 | |
| healthy vs GrouprPSC | -40.307 | 11.924 | 125.516 | -3.380 | 0.001 | 0.009 | ** |
| healthy vs rPSC - CZ vs NO | 14.783 | 10.544 | 126.613 | 1.402 | 0.163 | 0.192 | |
| healthy vs GrouprPSC:CountryNO | -29.232 | 19.769 | 123.028 | -1.479 | 0.142 | 0.192 | |
| healthy vs Groupnon-rPSC | -22.976 | 8.905 | 211.354 | -2.580 | 0.011 | 0.047 | * |
| healthy vs non-rPSC - CZ vs NO | 14.961 | 10.892 | 212.956 | 1.374 | 0.171 | 0.192 | |
| healthy vs Groupnon-rPSC:CountryNO | -32.911 | 14.701 | 208.353 | -2.239 | 0.026 | 0.079 |
knitr::kable(pc_shannon[[segment]],
digits = 3,
caption = "Results of linear model testing Shannon index")| Estimate | Std..Error | df | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.146 | 0.161 | 154.485 | -0.907 | 0.366 | 0.549 | |
| non-rPSC vs rPSC - CZ vs NO | -0.355 | 0.138 | 151.788 | -2.581 | 0.011 | 0.051 | |
| non-rPSC vs GrouprPSC:CountryNO | 0.008 | 0.277 | 150.856 | 0.030 | 0.976 | 0.976 | |
| healthy vs GrouprPSC | -0.330 | 0.128 | 126.201 | -2.572 | 0.011 | 0.051 | |
| healthy vs rPSC - CZ vs NO | 0.017 | 0.113 | 127.217 | 0.146 | 0.884 | 0.976 | |
| healthy vs GrouprPSC:CountryNO | -0.364 | 0.213 | 123.907 | -1.709 | 0.090 | 0.162 | |
| healthy vs Groupnon-rPSC | -0.182 | 0.101 | 207.051 | -1.795 | 0.074 | 0.162 | |
| healthy vs non-rPSC - CZ vs NO | 0.018 | 0.124 | 208.617 | 0.145 | 0.884 | 0.976 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.375 | 0.168 | 204.129 | -2.240 | 0.026 | 0.078 |
knitr::kable(pc_simpson[[segment]],
digits = 3,
caption = "Results of linear model testing Simpson index")| Estimate | Std..Error | df | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.005 | 0.029 | 153.608 | -0.169 | 0.866 | 0.866 | |
| non-rPSC vs rPSC - CZ vs NO | -0.032 | 0.024 | 151.251 | -1.296 | 0.197 | 0.443 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.027 | 0.049 | 150.427 | -0.554 | 0.581 | 0.806 | |
| healthy vs GrouprPSC | -0.031 | 0.020 | 127.660 | -1.523 | 0.130 | 0.391 | |
| healthy vs rPSC - CZ vs NO | -0.007 | 0.018 | 128.411 | -0.402 | 0.688 | 0.806 | |
| healthy vs GrouprPSC:CountryNO | -0.052 | 0.034 | 125.982 | -1.540 | 0.126 | 0.391 | |
| healthy vs Groupnon-rPSC | -0.026 | 0.016 | 201.924 | -1.600 | 0.111 | 0.391 | |
| healthy vs non-rPSC - CZ vs NO | -0.007 | 0.020 | 203.469 | -0.363 | 0.717 | 0.806 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.025 | 0.026 | 199.046 | -0.952 | 0.342 | 0.616 |
knitr::kable(pc_pielou[[segment]],
digits = 3,
caption = "Results of linear model testing Pielou index")| Estimate | Std..Error | df | t.value | Pr…t.. | p.adj | sig | |
|---|---|---|---|---|---|---|---|
| non-rPSC vs GrouprPSC | -0.007 | 0.026 | 153.863 | -0.292 | 0.771 | 0.843 | |
| non-rPSC vs rPSC - CZ vs NO | -0.052 | 0.022 | 150.735 | -2.372 | 0.019 | 0.171 | |
| non-rPSC vs GrouprPSC:CountryNO | -0.009 | 0.044 | 149.674 | -0.198 | 0.843 | 0.843 | |
| healthy vs GrouprPSC | -0.023 | 0.020 | 127.665 | -1.183 | 0.239 | 0.538 | |
| healthy vs rPSC - CZ vs NO | -0.008 | 0.017 | 129.013 | -0.457 | 0.648 | 0.843 | |
| healthy vs GrouprPSC:CountryNO | -0.052 | 0.033 | 124.565 | -1.605 | 0.111 | 0.333 | |
| healthy vs Groupnon-rPSC | -0.015 | 0.015 | 204.651 | -1.006 | 0.316 | 0.568 | |
| healthy vs non-rPSC - CZ vs NO | -0.008 | 0.019 | 206.473 | -0.416 | 0.678 | 0.843 | |
| healthy vs Groupnon-rPSC:CountryNO | -0.044 | 0.025 | 201.171 | -1.745 | 0.083 | 0.333 |
Plots
alpha_div_plots[[paste(segment,"Country")]]3.6.0.2 Beta diversity
Main results
knitr::kable(pairwise_aitchison_raw[[paste("genus", segment)]],
digits = 3,
caption = "Results of PERMANOVA - robust aitchison distance")| pairs | Df | SumsOfSqs | F.Model | R2 | p.value | p.adjusted | sig |
|---|---|---|---|---|---|---|---|
| rPSC vs non-rPSC | 1 | 419.973 | 2.475 | 0.007 | 0.415 | 0.415 | |
| rPSC vs healthy | 1 | 1473.966 | 8.988 | 0.035 | 0.001 | 0.002 | ** |
| non-rPSC vs healthy | 1 | 1742.331 | 10.401 | 0.024 | 0.001 | 0.002 | ** |
| rPSC vs non-rPSC , Country | 1 | 1556.933 | 9.177 | 0.026 | 0.001 | 0.001 | *** |
| rPSC vs healthy , Country | 1 | 883.284 | 5.386 | 0.021 | 0.001 | 0.001 | *** |
| non-rPSC vs healthy , Country | 1 | 1558.672 | 9.304 | 0.021 | 0.001 | 0.001 | *** |
| rPSC vs non-rPSC : Country | 1 | 318.175 | 1.880 | 0.005 | 0.970 | 0.970 | |
| rPSC vs healthy : Country | 1 | 319.594 | 1.956 | 0.008 | 0.820 | 0.970 | |
| non-rPSC vs healthy : Country | 1 | 407.191 | 2.439 | 0.006 | 0.163 | 0.489 |
PCA
pca_plots_list[[paste(segment,"genus custom")]]3.6.0.3 Univariate analysis
Number of significant taxa
knitr::kable(as.data.frame(lapply(list_intersections,nrow)) %>% t() %>%
`colnames<-`("Count") )| Count | |
|---|---|
| terminal_ileum.genus.non.rPSC.vs.rPSC | 0 |
| terminal_ileum.genus.healthy.vs.rPSC | 34 |
| terminal_ileum.genus.healthy.vs.non.rPSC | 25 |
| colon.genus.non.rPSC.vs.rPSC | 0 |
| colon.genus.healthy.vs.rPSC | 56 |
| colon.genus.healthy.vs.non.rPSC | 30 |
3.6.0.4 Machine learning
Main models
knitr::kable(models_summ_df_colon %>% dplyr::select(
"alpha","lambda",
"auc_optimism_corrected",
"auc_optimism_corrected_CIL",
"auc_optimism_corrected_CIU"),
digits=3,caption="Elastic net results")| alpha | lambda | auc_optimism_corrected | auc_optimism_corrected_CIL | auc_optimism_corrected_CIU | |
|---|---|---|---|---|---|
| rPSC vs non-rPSC genus colon | 0.0 | 0.036 | 0.628 | 0.495 | 0.76 |
| rPSC vs non-rPSC subset genus colon | 0.8 | 0.033 | 0.660 | 0.499 | 0.79 |
ROC - Genus level
p <- roc_curve_all_custom(roc_cs[3:4],Q="Q2",
model_name="enet_model",legend=FALSE)4 Paper-ready visualizations
Alpha diversity
p_A <- alpha_div_plots$`terminal_ileum Country` +
ggtitle("Terminal ileum")+
theme(plot.title = element_text(hjust=0.5,face = "bold",size = 15))
p_B <- alpha_div_plots$`colon Country` +
ggtitle("Colon") +
theme(plot.title = element_text(hjust=0.5,face = "bold",size = 15))
Q2_alpha <- ggarrange(p_A,ggplot() + theme_minimal(),p_B,nrow=1, ncol=3,
widths = c(1,0.1,1))
Q2_alphaBeta diversity
pca_ti <- pca_plots_list$`terminal_ileum genus custom`
pca_colon <- pca_plots_list$`colon genus custom`
genus_Q2_beta <- ggarrange(pca_ti,
ggplot() + theme_minimal(),
pca_colon,ncol=3,
widths = c(1,0.1,1))
genus_Q2_betaAlpha + Beta diversity
alpha_beta <- ggarrange(Q2_alpha,genus_Q2_beta,
ncol = 1,nrow=2,labels = c("A","B"))
alpha_betaElastic net
models_to_plot <- c("knn_model","rf_model","gbm_model","enet_model")
names(models_to_plot) <- c("kNN","RF","GBoost","ENet")
models_to_plot <- c("enet_model")
rocs_list <- list()
rocs_list[["enet_model"]] <- roc_cs
#names(models_to_plot) <- c("kNN","RF","GBoost","ENet")
# ILEUM
plot_list_ileum <- list()
for (model_name in models_to_plot) {
plot_list_ileum[[model_name]] <-
roc_curve_all_custom(rocs_list[[model_name]][c(1:2)],
Q="Q2",
model_name=model_name,legend = FALSE) +
#ggtitle(names(models_to_plot)[which(model_name==models_to_plot)]) +
theme(plot.title = element_text(face = "bold",size = 10))
}
roc_curve_ti <- ggarrange(plotlist = plot_list_ileum)
# COLON
plot_list_colon <- list()
for (model_name in models_to_plot) {
plot_list_colon[[model_name]] <-
roc_curve_all_custom(rocs_list[[model_name]][c(3:4)],
Q="Q2",
model_name=model_name,legend = FALSE) +
#ggtitle(names(models_to_plot)[which(model_name==models_to_plot)]) +
theme(plot.title = element_text(face = "bold",size = 10))
}
roc_curve_colon <- ggarrange(plotlist = plot_list_colon)
roc_curve_plot <- ggarrange(roc_curve_ti,
ggplot() + theme_minimal(),
roc_curve_colon,
ncol=3, widths = c(1,0.1,1))
roc_curve_plot4.1 Figure 4
alpha_beta_elastic <- ggarrange(Q2_alpha,genus_Q2_beta,roc_curve_plot,
ncol = 1,nrow=3,labels = LETTERS,heights = c(1,1,0.8))
alpha_beta_elasticalpha_beta_elastic <- ggarrange(alpha_beta_elastic,ggplot() + theme_minimal(),ncol=2,
widths = c(1,0.15))
alpha_beta_elastic4.2 Supplementary Figure - DAA
p_ileum <- dot_heatmap_ileum +
ggtitle("Terminal ileum") +
theme(plot.title = element_text(hjust=0.5,face = "bold",size = 15),
legend.position = "none")
p_colon <- dot_heatmap_colon +
ggtitle("Colon") +
theme(plot.title = element_text(hjust=0.5,face = "bold",size = 15),
legend.position = "none")
heatmap_plot <- ggarrange(p_ileum,
p_colon,
ncol = 2,labels=c("A","B"))
heatmap_plot5 Session info
sessionInfo()R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 26100)
Matrix products: default
locale:
[1] LC_COLLATE=Czech_Czechia.utf8 LC_CTYPE=Czech_Czechia.utf8
[3] LC_MONETARY=Czech_Czechia.utf8 LC_NUMERIC=C
[5] LC_TIME=Czech_Czechia.utf8
time zone: Europe/Prague
tzcode source: internal
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] mice_3.17.0 picante_1.8.2 ape_5.8-1
[4] lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1
[7] tidyverse_2.0.0 kableExtra_1.4.0 tidyr_1.3.1
[10] gbm_2.2.2 doParallel_1.0.17 iterators_1.0.14
[13] foreach_1.5.2 ranger_0.17.0 ggvenn_0.1.15
[16] Maaslin2_1.16.0 purrr_1.0.2 pROC_1.18.5
[19] glmnet_4.1-8 MicrobiomeStat_1.2 caret_6.0-94
[22] openxlsx_4.2.8 magrittr_2.0.3 emmeans_1.11.2-8
[25] lmerTest_3.1-3 robustlmm_3.3-1 lme4_1.1-37
[28] Matrix_1.6-5 mgcv_1.9-1 nlme_3.1-164
[31] pheatmap_1.0.13 reshape2_1.4.4 vegan_2.6-10
[34] lattice_0.22-6 permute_0.9-8 ggplotify_0.1.2
[37] ggrepel_0.9.6 ggpubr_0.6.1 MicrobiotaProcess_1.14.1
[40] phyloseq_1.46.0 ggplot2_3.5.2 tibble_3.2.1
[43] dplyr_1.1.4 cowplot_1.2.0 readr_2.1.5
[46] igraph_2.1.4 ccrepe_1.38.1 data.table_1.15.4
loaded via a namespace (and not attached):
[1] fs_1.6.5 matrixStats_1.5.0
[3] bitops_1.0-9 RColorBrewer_1.1-3
[5] numDeriv_2016.8-1.1 tools_4.3.1
[7] backports_1.4.1 R6_2.6.1
[9] lazyeval_0.2.2 jomo_2.7-6
[11] rhdf5filters_1.14.1 withr_3.0.2
[13] gridExtra_2.3 cli_3.6.2
[15] Biobase_2.62.0 logging_0.10-108
[17] biglm_0.9-3 sandwich_3.1-1
[19] labeling_0.4.3 mvtnorm_1.3-3
[21] robustbase_0.99-4-1 pbapply_1.7-4
[23] systemfonts_1.2.2 yulab.utils_0.2.1
[25] svglite_2.1.3 parallelly_1.38.0
[27] rstudioapi_0.17.1 generics_0.1.4
[29] gridGraphics_0.5-1 shape_1.4.6.1
[31] car_3.1-3 zip_2.3.2
[33] biomformat_1.30.0 S4Vectors_0.40.2
[35] abind_1.4-8 infotheo_1.2.0.1
[37] lifecycle_1.0.4 multcomp_1.4-28
[39] yaml_2.3.10 carData_3.0-5
[41] SummarizedExperiment_1.32.0 Rtsne_0.17
[43] rhdf5_2.46.1 recipes_1.3.1
[45] SparseArray_1.2.4 grid_4.3.1
[47] mitml_0.4-5 crayon_1.5.3
[49] pillar_1.11.0 knitr_1.50
[51] optparse_1.7.5 GenomicRanges_1.54.1
[53] statip_0.2.3 boot_1.3-32
[55] estimability_1.5.1 future.apply_1.11.3
[57] codetools_0.2-20 pan_1.9
[59] glue_1.7.0 ggfun_0.2.0
[61] vctrs_0.6.5 treeio_1.26.0
[63] Rdpack_2.6.4 gtable_0.3.6
[65] maditr_0.8.6 gower_1.0.1
[67] xfun_0.52 rbibutils_2.3
[69] S4Arrays_1.2.1 prodlim_2024.06.25
[71] libcoin_1.0-10 coda_0.19-4.1
[73] reformulas_0.4.1 pcaPP_2.0-5
[75] modeest_2.4.0 survival_3.5-8
[77] timeDate_4041.110 hardhat_1.4.2
[79] lava_1.8.1 statmod_1.5.0
[81] TH.data_1.1-4 ipred_0.9-15
[83] ggtree_3.10.1 GenomeInfoDb_1.38.8
[85] fBasics_4041.97 rpart_4.1.23
[87] colorspace_2.1-1 BiocGenerics_0.48.1
[89] DBI_1.2.3 nnet_7.3-19
[91] ade4_1.7-23 tidyselect_1.2.1
[93] timeSeries_4041.111 compiler_4.3.1
[95] microbiome_1.24.0 xml2_1.3.8
[97] DelayedArray_0.28.0 scales_1.4.0
[99] DEoptimR_1.1-4 spatial_7.3-18
[101] rappdirs_0.3.3 digest_0.6.35
[103] minqa_1.2.8 rmarkdown_2.29
[105] XVector_0.42.0 htmltools_0.5.8.1
[107] pkgconfig_2.0.3 MatrixGenerics_1.14.0
[109] stabledist_0.7-2 fastmap_1.2.0
[111] rlang_1.1.3 htmlwidgets_1.6.4
[113] farver_2.1.2 zoo_1.8-13
[115] jsonlite_2.0.0 ModelMetrics_1.2.2.2
[117] RCurl_1.98-1.17 modeltools_0.2-24
[119] Formula_1.2-5 GenomeInfoDbData_1.2.11
[121] patchwork_1.3.2 Rhdf5lib_1.24.2
[123] Rcpp_1.0.12 ggnewscale_0.5.2
[125] fastGHQuad_1.0.1 stringi_1.8.3
[127] ggstar_1.0.4 stable_1.1.6
[129] zlibbioc_1.48.2 MASS_7.3-60.0.1
[131] plyr_1.8.9 listenv_0.9.1
[133] Biostrings_2.70.3 splines_4.3.1
[135] hash_2.2.6.3 multtest_2.58.0
[137] hms_1.1.3 ggtreeExtra_1.12.0
[139] ggsignif_0.6.4 stats4_4.3.1
[141] rmutil_1.1.10 evaluate_1.0.5
[143] nloptr_2.2.1 tzdb_0.5.0
[145] getopt_1.20.4 future_1.34.0
[147] clue_0.3-66 coin_1.4-3
[149] broom_1.0.9 xtable_1.8-4
[151] tidytree_0.4.6 rstatix_0.7.2
[153] viridisLite_0.4.2 class_7.3-22
[155] aplot_0.2.8 IRanges_2.36.0
[157] cluster_2.1.8.1 timechange_0.3.0
[159] globals_0.18.0