====== 中介效应分析 ======
中介效应的定义
如果自变量X通过某一变量M对因变量Y产生一定影响,则称M为X和Y的中介变量。研究中介作用的目的是在已知X和Y关系的基础上,探索产生这个关系的内部作用机制。作用关系图如下:
{{:yuxi:med.png?400|}}
X对Y的总效应分为直接效应(direct effect)和间接效应(indirect effect),直接效应是指当中介变量(M)固定在某一水平时,自变量X对结局变量Y的效应。间接效应是指自变量X通过中介变量M对结局变量Y施加的影响。按上图所示则c为直接效应,间接效应则为a与b的结合。
根据Baron和Kenny等人的研究,符合以下标准的变量可以判定为中介变量:
1)暴露水平的变化显著影响中介变量水平的变化(即X与M的关联具有统计学意义)。
2)中介变量水平的变化显著影响结局变量(即M与Y的关联具有统计学意义)。
3)暴露因素水平的变化显著影响结局变量(即X与Y的关联具有统计学意义)。
通常情况下标准1)和标准2)被普遍接受,标准3)未被接收,理由是当直接效应和中介效应的作用方向相反时,X与Y间的关联可能无统计学意义,但不能否认中介效应的存在。
2.中介效应的R实现
#mediation包中的mediate函数
install.packages("mediation")
library(mediation)
data(jobs)
b <- lm(job_seek ~ treat + econ_hard + sex + age, data=jobs)
c <- lm(depress2 ~ treat + job_seek + econ_hard + sex + age, data=jobs)
contcont <- mediate(b, c, sims=50, treat="treat", mediator="job_seek")
summary(contcont)
plot(contcont)
{{:yuxi:med2.png?400|}}
===== 结果说明 =====
ACME: stands for average causal mediation effects.间接因果效应,表示X通过M对Y的效应大小
通过med.sum$d0和med.sum$d0.p可以获得ACME的效应和p值
ADE:stands for average direct effects.直接效应,表示X直接对Y的作用大小
通过med.sum$z0和med.sum$z0.p可以获得ADE的效应和p值
Total Effect: stands for the total effect (direct + indirect) of the IV on the DV. X对Y的直接和间接作用总和
Prop. Mediated: describes the proportion of the effect of the IV on the DV that goes through the mediator. X通过M对Y的作用的比例
====== 对很多的因子进行中介效应分析循环 ======
library(mediation)
library(ggplot2)
library(dplyr)
#结果说明
#ACME stands for average causal mediation effects.间接因果效应,表示X通过M对Y的效应大小
#通过med.sum$d0和med.sum$d0.p可以获得ACME的效应和p值
#ADE stands for average direct effects.直接效应,表示X直接对Y的作用大小
#通过med.sum$z0和med.sum$z0.p可以获得ADE的效应和p值
#Total Effect stands for the total effect (direct + indirect) of the IV on the DV. X对Y的直接和间接作用总和
#Prop. Mediated describes the proportion of the effect of the IV on the DV that goes through the mediator. X通过M对Y的作用的比例
setwd("/Users/vincent/Documents/Rscript/mediation/project")
data = read.table("data.txt",header = T,sep = "\t" ,fileEncoding = "UTF-8")
#View(data)
names(data)
alphadata <- read.table("alpha_diversity_index.sample.txt",header = T,sep = "\t")
# 假设data1是第一个数据集,data2是第二个数据集
# 为了统一列名,将第二个数据集的"结题样本名称(中肛)"列重命名为"Sample_Name"
data <- data %>%
rename(Sample_Name = "结题样本名称.早唾.")
# 使用left_join()函数将两个数据集合并在一起,基于"Sample_Name"列
merged_data <- left_join(data, alphadata, by = "Sample_Name")
View(merged_data)
names(merged_data)
# merged_data 中包含了合并后的数据
#总体思路:生活方式数据为自变量、扩增子为中介变量(chao1 dominance goods_coverage observed_features pielou_e shannon simpson)、结局为因变量(GDM BMI LGA PROM),基线和检验指标作为调节变量(协变量)
#中介变量:α指数、差异菌群都分析
#生活方式 smoking2 alcohol2 physical2 EPDS2 depression2 PASQ2 sleep2 calorie2 animal2 plant2 vitamin2 beverages2 bean2 folate2 multivit2 VitD2 Ga2 Fe2 DHA2 DF2 probiotics2
# 中间变量 chao1 dominance goods_coverage observed_features pielou_e shannon simpson
# 结局为因变量(GDM BMI LGA PROM)
# 基线和检验指标作为调节变量
# 基线 :week sex GWG BMI height education register parity race unplanned unemployment
# 检验指标 :FBG TC HDL TG LDL HCY TT3 TT4 FT3 FT4 ATPa TSH A E 25-OH-D3
merged_data$GDM <- ifelse(merged_data$GDM == "control", 0, 1)
merged_data$BMI <- ifelse(merged_data$BMI == "control", 0, 1)
merged_data$LGA <- ifelse(merged_data$LGA == "control", 0, 1)
merged_data$PROM <- ifelse(merged_data$PROM == "control", 0, 1)
mlist <- c("smoking1", "alcohol1", "physical1", "depression1", "sleep1", "calorie1", "animal1", "plant1", "vitamin1", "beverages1", "bean1", "diet1", "health1")
length(mlist)
xlist <- c("chao1", "dominance", "goods_coverage", "observed_features", "pielou_e", "shannon", "simpson")
ylist <- c("GDM")
covlist <- c("GWG", "BMI", "height", "education", "register", "parity", "race", "unplanned", "unemployment")
output_path <- "/Users/vincent/Documents/Rscript/mediation/project/X101SC21124468-Z01-F003-早唾GDM/result/"
if (!file.exists(output_path)) {
# 如果目录不存在,创建它
dir.create(output_path, recursive = TRUE)
cat("目录已创建:", output_path, "\n")
} else {
cat("目录已经存在:", output_path, "\n")
}
# 循环执行中介效应分析
for (x_var in xlist) {
for (m_var in mlist) {
for (y_var in ylist) {
tryCatch({
# 创建文件名
output_file <- paste(paste(output_path, "自变量",x_var,sep = ""), paste("中间变量",m_var,sep = ""), paste("因变量",y_var,sep = ""), ".txt", sep = "_")
# 构建模型
b_int <- lm(paste(m_var, "~", x_var, "+", paste(covlist, collapse = " + "), sep = " "), data = merged_data)
d_int <- lm(paste(y_var, "~", x_var, "+", m_var, "+", paste(covlist, collapse = " + "), sep = " "), data = merged_data)
# 进行中介效应分析
contcont_result <- mediate(b_int, d_int, sims = 50, treat = x_var, mediator = m_var)
output_file_name <- paste(paste(output_path, "自变量",x_var,sep = ""), paste("中间变量",m_var,sep = ""), paste("因变量",y_var,sep = ""), ".pdf", sep = "_")
#png(output_file) # 设置图像的宽度和高度
pdf(output_file_name, width = 8, height = 6)
plot(contcont_result)
dev.off()
# 将结果输出到文件
result_text <- suppressWarnings(capture.output(summary(contcont_result)))
writeLines(result_text, con = output_file)
#write.table(summary(contcont_result), file = output_file)
}, error = function(e) {
# 处理异常的代码块
cat("An error occurred:", conditionMessage(e), "y is ", y_var , "x is ", x_var, "m is ", m_var ,"\n")
# 这里可以添加其他处理逻辑,或者略过错误
})
}
}
}