====Allele-Specific (Biased) Expression Identification====

Script path: /TJPROJ6/RNA_SH/TJPROJ1/RNA/shouhou/script_dir/script_dir/ref/allele/allele.pl \\
New script path: /NJPROJ3/RNA_SH/personal_dir/zhanghailei/SNP_ase/allele/allele.pl \\
Fixes in the new script: \\
Data normalization: the original normalization was not sound; the revised normalization follows the same approach as FPKM. \\
Bias calling: in the original script the p-values computed in R were incorrect, and bias was not filtered on the padj value. The new version fixes both problems. \\

  ==============================================================================
  Description : allele specific expression
  writer:liuxunbiao\@novogene.com
  Options
    -snp        : snp file
    -outdir     : pathway of outdoor
    -sample     : C_D_1:C_D_2:C_D_3,C_J_1:C_J_2:C_J_3,C_T_1:C_T_2:C_T_3 .....split by ","
    -group      : C_D,C_J,C_T
    -pair       : split by ",",example: C_D:C_J:C_T,male2:female2:F1 ..... split by "," [order: father (or mother):mother (or father):offspring]
    -m          : min coverage of reads [10]
    -n          : min snp in gene [1]
    -h|?|help   : Show this help
  ==============================================================================

The input SNP file has the format below. It is intended for SNP results called by the transcriptome GATK pipeline; the last three columns (GeneId, Gene Name, Description) are gene annotation.

  CHROM       POS  REF  ALT  C_D_1  C_D_2  C_D_3  C_J_1  C_J_2  C_J_3  C_T_1  C_T_2  C_T_3  GeneId  Gene Name  Description
  scaffold35  46   C    G    0,1    0,1    0,1    3,1    3,1    1,1    1,2    0,1    0,2    --      --         --
  scaffold35  47   G    A    0,1    0,1    0,1    3,1    3,1    1,1    1,2    0,1    0,2    --      --         --
  scaffold35  460  G    A    2,0    3,0    1,0    1,1    3,0    2,0    NA     2,0    4,0    gene1   --         -//-
  scaffold17  46   A    G    0,1    NA     NA     NA     NA     NA     0,1    1,0    NA     --      --         --
  scaffold17  50   A    C    1,0    NA     NA     NA     NA     NA     1,1    1,0    NA     --      --         --
  scaffold17  200  A    T    NA     NA     0,2    NA     NA     NA     0,4    NA     0,2    --      --         --
  scaffold17  312  G    T    NA     NA     0,1    NA     NA     NA     2,1    0,1    2,1    --      --         --
  scaffold17  379  C    T    NA     NA     NA     NA     NA     NA     3,1    0,2    3,0    --      --         --

===Output===
merged_snp.xls \\
One directory per pair \\

===Output Example===

  gene_id                      DY_6h_snp.gene.normalized(TW_6h)  DY_6h_snp.gene.normalized(JP_6h)  pvalue     qvalue     signature  bias   GeneId                       Gene Name  Description
  evm.model.scaffold100157.2   313,588.387060667114              274,417.272443366084              3.573e-20  3.615e-19  TRUE       TW_6h  evm.model.scaffold100157.2   --         -//-
  evm.model.scaffold100229.8   12,28.0184314603388               14,42.5508083695678               0.01659    0.02706    TRUE       JP_6h  evm.model.scaffold100229.8   --         -//-
  evm.model.scaffold100239.10  16,24.3638534437729               13,43.9234150911668               0.2682     0.3245     FALSE      None   evm.model.scaffold100239.10  --         sp|Q6AY55|DCAKD_RAT Dephospho-CoA kinase domain-cont
  evm.model.scaffold10061.16   68,82.8371017088277               15,100.200290676724               0.2885     0.3464     FALSE      None   evm.model.scaffold10061.16   --         sp|Q91ZN5|S35B2_MOUSE Adenosine 3'-phospho 5'
  evm.model.scaffold100813.10  57,29.2366241325274               25,166.085413313474               0.003353   0.006133   TRUE       TW_6h  evm.model.scaffold100813.10  --         sp|Q9H0J9|PAR12_HUMAN Poly [ADP-ribo

===Results Delivered to the Customer===
merged_snp.xls \\
One directory per pair \\
readme.txt \\

===Technical Workflow===
① Filter out sites where the two parents carry the same SNP genotype \\
② Keep only SNP sites at which both parents are homozygous \\
③ Require read coverage at the site > N (the threshold can be set by the customer) \\
④ Normalize the read alignment coverage depth \\
⑤ Identify SNP sites with biased expression (binomial test) \\
⑥ Compute the expression level of each parental allele in the hybrid, and the expression levels in the parental inbred lines \\
⑦ Screen for genes with allele-specific (biased) expression and determine which parent each gene is biased toward (binomial test) \\

===Nanjing===
Nanjing script: /NJPROJ2/RNA/shouhou/script_dir/other/allele/allele.pl \\
Note: fixed cases where padj < 0.05 but no bias was reported. \\