====== 中介效应分析 ====== 中介效应的定义 如果自变量X通过某一变量M对因变量Y产生一定影响,则称M为X和Y的中介变量。研究中介作用的目的是在已知X和Y关系的基础上,探索产生这个关系的内部作用机制。作用关系图如下: {{:yuxi:med.png?400|}} X对Y的总效应分为直接效应(direct effect)和间接效应(indirect effect),直接效应是指当中介变量(M)固定在某一水平时,自变量X对结局变量Y的效应。间接效应是指自变量X通过中介变量M对结局变量Y施加的影响。按上图所示则c为直接效应,间接效应则为a与b的结合。 根据Baron和Kenny等人的研究,符合以下标准的变量可以判定为中介变量: 1)暴露水平的变化显著影响中介变量水平的变化(即X与M的关联具有统计学意义)。 2)中介变量水平的变化显著影响结局变量(即M与Y的关联具有统计学意义)。 3)暴露因素水平的变化显著影响结局变量(即X与Y的关联具有统计学意义)。 通常情况下标准1)和标准2)被普遍接受,标准3)未被接收,理由是当直接效应和中介效应的作用方向相反时,X与Y间的关联可能无统计学意义,但不能否认中介效应的存在。 2.中介效应的R实现 #mediation包中的mediate函数 install.packages("mediation") library(mediation) data(jobs) b <- lm(job_seek ~ treat + econ_hard + sex + age, data=jobs) c <- lm(depress2 ~ treat + job_seek + econ_hard + sex + age, data=jobs) contcont <- mediate(b, c, sims=50, treat="treat", mediator="job_seek") summary(contcont) plot(contcont) {{:yuxi:med2.png?400|}} ===== 结果说明 ===== ACME: stands for average causal mediation effects.间接因果效应,表示X通过M对Y的效应大小 通过med.sum$d0和med.sum$d0.p可以获得ACME的效应和p值 ADE:stands for average direct effects.直接效应,表示X直接对Y的作用大小 通过med.sum$z0和med.sum$z0.p可以获得ADE的效应和p值 Total Effect: stands for the total effect (direct + indirect) of the IV on the DV. X对Y的直接和间接作用总和 Prop. Mediated: describes the proportion of the effect of the IV on the DV that goes through the mediator. X通过M对Y的作用的比例 ====== 对很多的因子进行中介效应分析循环 ====== library(mediation) library(ggplot2) library(dplyr) #结果说明 #ACME stands for average causal mediation effects.间接因果效应,表示X通过M对Y的效应大小 #通过med.sum$d0和med.sum$d0.p可以获得ACME的效应和p值 #ADE stands for average direct effects.直接效应,表示X直接对Y的作用大小 #通过med.sum$z0和med.sum$z0.p可以获得ADE的效应和p值 #Total Effect stands for the total effect (direct + indirect) of the IV on the DV. X对Y的直接和间接作用总和 #Prop. Mediated describes the proportion of the effect of the IV on the DV that goes through the mediator. X通过M对Y的作用的比例 setwd("/Users/vincent/Documents/Rscript/mediation/project") data = read.table("data.txt",header = T,sep = "\t" ,fileEncoding = "UTF-8") #View(data) names(data) alphadata <- read.table("alpha_diversity_index.sample.txt",header = T,sep = "\t") # 假设data1是第一个数据集,data2是第二个数据集 # 为了统一列名,将第二个数据集的"结题样本名称(中肛)"列重命名为"Sample_Name" data <- data %>% rename(Sample_Name = "结题样本名称.早唾.") # 使用left_join()函数将两个数据集合并在一起,基于"Sample_Name"列 merged_data <- left_join(data, alphadata, by = "Sample_Name") View(merged_data) names(merged_data) # merged_data 中包含了合并后的数据 #总体思路:生活方式数据为自变量、扩增子为中介变量(chao1 dominance goods_coverage observed_features pielou_e shannon simpson)、结局为因变量(GDM BMI LGA PROM),基线和检验指标作为调节变量(协变量) #中介变量:α指数、差异菌群都分析 #生活方式 smoking2 alcohol2 physical2 EPDS2 depression2 PASQ2 sleep2 calorie2 animal2 plant2 vitamin2 beverages2 bean2 folate2 multivit2 VitD2 Ga2 Fe2 DHA2 DF2 probiotics2 # 中间变量 chao1 dominance goods_coverage observed_features pielou_e shannon simpson # 结局为因变量(GDM BMI LGA PROM) # 基线和检验指标作为调节变量 # 基线 :week sex GWG BMI height education register parity race unplanned unemployment # 检验指标 :FBG TC HDL TG LDL HCY TT3 TT4 FT3 FT4 ATPa TSH A E 25-OH-D3 merged_data$GDM <- ifelse(merged_data$GDM == "control", 0, 1) merged_data$BMI <- ifelse(merged_data$BMI == "control", 0, 1) merged_data$LGA <- ifelse(merged_data$LGA == "control", 0, 1) merged_data$PROM <- ifelse(merged_data$PROM == "control", 0, 1) mlist <- c("smoking1", "alcohol1", "physical1", "depression1", "sleep1", "calorie1", "animal1", "plant1", "vitamin1", "beverages1", "bean1", "diet1", "health1") length(mlist) xlist <- c("chao1", "dominance", "goods_coverage", "observed_features", "pielou_e", "shannon", "simpson") ylist <- c("GDM") covlist <- c("GWG", "BMI", "height", "education", "register", "parity", "race", "unplanned", "unemployment") output_path <- "/Users/vincent/Documents/Rscript/mediation/project/X101SC21124468-Z01-F003-早唾GDM/result/" if (!file.exists(output_path)) { # 如果目录不存在,创建它 dir.create(output_path, recursive = TRUE) cat("目录已创建:", output_path, "\n") } else { cat("目录已经存在:", output_path, "\n") } # 循环执行中介效应分析 for (x_var in xlist) { for (m_var in mlist) { for (y_var in ylist) { tryCatch({ # 创建文件名 output_file <- paste(paste(output_path, "自变量",x_var,sep = ""), paste("中间变量",m_var,sep = ""), paste("因变量",y_var,sep = ""), ".txt", sep = "_") # 构建模型 b_int <- lm(paste(m_var, "~", x_var, "+", paste(covlist, collapse = " + "), sep = " "), data = merged_data) d_int <- lm(paste(y_var, "~", x_var, "+", m_var, "+", paste(covlist, collapse = " + "), sep = " "), data = merged_data) # 进行中介效应分析 contcont_result <- mediate(b_int, d_int, sims = 50, treat = x_var, mediator = m_var) output_file_name <- paste(paste(output_path, "自变量",x_var,sep = ""), paste("中间变量",m_var,sep = ""), paste("因变量",y_var,sep = ""), ".pdf", sep = "_") #png(output_file) # 设置图像的宽度和高度 pdf(output_file_name, width = 8, height = 6) plot(contcont_result) dev.off() # 将结果输出到文件 result_text <- suppressWarnings(capture.output(summary(contcont_result))) writeLines(result_text, con = output_file) #write.table(summary(contcont_result), file = output_file) }, error = function(e) { # 处理异常的代码块 cat("An error occurred:", conditionMessage(e), "y is ", y_var , "x is ", x_var, "m is ", m_var ,"\n") # 这里可以添加其他处理逻辑,或者略过错误 }) } } }