Script for batch-running GSEA

Old pipeline, which takes FPKM as input:
/TJPROJ6/RNA_SH/personal_dir/wangjun/person_script/get_gsea_old.py
New pipeline, which takes readcount normalized by the differential analysis as input:
/TJPROJ6/RNA_SH/personal_dir/wangjun/person_script/get_gsea.py

Parameter description

FPKM input (get_gsea_old.py)
Usage
-T TYPE, --type TYPE  type of analysis <go or kegg or both>, default is both (go,kegg)
-O OUTDIR, --outdir OUTDIR
                      Output directory of script and result
-D DEGDIR, --degdir DEGDIR
                      deglist directory, used to get compare groups. Optional;
                      set it when a DEG list directory is available. By default
                      all compare groups are read.
-C COMPARE, --compare COMPARE
                      Compare group <AvsB,CvsD>, default is all compare in
                      degdir. 
-c CONDITION, --condition CONDITION
                      The group condition file.
-G GO, --go GO        go.xls, if go.gmt file is provided, then go.xls is not
                      required. Only used to generate the gmt; omit it when a
                      gmt file is given.
-K KEGG, --kegg KEGG  kegg.xls, if kegg.gmt file is provided, then kegg.xls
                      is not required. Only used to generate the gmt; omit it
                      when a gmt file is given.
-P PJCODE, --pjcode PJCODE
                      project code
-g GENE, --gene GENE  gene.xls, if gmt file is provided, then gene.xls is
                      not required
-X GMT, --gmt GMT     gmt file <go.gmt,kegg.gmt>, gmt filename must end with
                      "go.gmt" or "kegg.gmt". If project type is med, then
                      use the hsa/mmu gmt in db:
                      /TJPROJ6/GB_TR/reference_data/new_pip/Animal/Homo_sapiens/Homo_sapiens_Ensemble_94/Homo_sapiens_Ensemble_94_go.gmt
                      /TJPROJ6/GB_TR/reference_data/new_pip/Animal/Homo_sapiens/Homo_sapiens_Ensemble_94/Homo_sapiens_Ensemble_94_hsa_kegg.gmt
                      /TJPROJ6/GB_TR/reference_data/new_pip/Animal/Mus_musculus/Mus_musculus_Ensemble_94/Mus_musculus_Ensemble_94_go.gmt
                      /TJPROJ6/GB_TR/reference_data/new_pip/Animal/Mus_musculus/Mus_musculus_Ensemble_94/Mus_musculus_Ensemble_94_mmu_kegg.gmt
-F FPKM, --fpkm FPKM  gene_fpkm.xls file
--pjtype PJTYPE       project type, med or ref, default is ref
-N PLOT_NUM, --plot_num PLOT_NUM
                      number of plots to draw, default is 50
-S SVG2PDF, --svg2pdf SVG2PDF
                      output svg and pdf format pictures, <true or false>
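
The --gmt rule in the usage text (a comma-separated list whose file names must end with "go.gmt" or "kegg.gmt") can be checked before a job is submitted. The following is only a minimal sketch of such a check. It is not code from get_gsea_old.py, and the helper name classify_gmt is hypothetical.

```python
import os


def classify_gmt(gmt_arg):
    """Split a comma-separated --gmt value and label each file as go or kegg.

    Hypothetical helper: it mirrors the documented naming rule only and is
    not taken from get_gsea_old.py.
    """
    result = {}
    for path in gmt_arg.split(","):
        path = path.strip()
        name = os.path.basename(path)
        if name.endswith("kegg.gmt"):
            result["kegg"] = path
        elif name.endswith("go.gmt"):
            result["go"] = path
        else:
            # Reject anything that does not follow the documented suffix rule.
            raise ValueError(
                'gmt filename must end with "go.gmt" or "kegg.gmt": ' + path
            )
    return result
```

Passing the two human gmt paths listed above, joined by a comma, would return a dict with one "go" entry and one "kegg" entry; a file named e.g. pathways.gmt would raise ValueError.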

Running

# The two --gmt files below are the human gmt files in the database.
python /TJPROJ6/RNA_SH/personal_dir/wangjun/person_script/get_gsea_old.py \
--outdir /TJPROJ6/RNA_SH/personal_dir/wangjun/person_script/gsea_test \
--condition /TJPROJ6/RNA_SH/personal_dir/wangjun/person_script/gsea_test/condition.xls \
--pjcode X101SC21110453-Z01-J001 \
--gmt /TJPROJ6/GB_TR/reference_data/new_pip/Animal/Homo_sapiens/Homo_sapiens_Ensemble_94/Homo_sapiens_Ensemble_94_go.gmt,/TJPROJ6/GB_TR/reference_data/new_pip/Animal/Homo_sapiens/Homo_sapiens_Ensemble_94/Homo_sapiens_Ensemble_94_hsa_kegg.gmt \
--gene /TJPROJ6/RNA_SH/personal_dir/wangjun/person_script/gsea_test/gene.xls \
--compare HNK_20uMvsControl,HNK_100uMvsControl \
--fpkm /TJPROJ6/RNA_SH/personal_dir/wangjun/person_script/gsea_test/ch_gene_fpkm.xls \
--pjtype med \
--plot_num 100 \
--svg2pdf true
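
Long shell commands like this one are easy to break with a stray character after a trailing backslash. As an alternative, the same call can be assembled as an argument list in Python. This is a sketch under assumptions: build_gsea_command is a hypothetical helper, and only long option names that appear in the usage text are passed to it.

```python
import shlex

SCRIPT = "/TJPROJ6/RNA_SH/personal_dir/wangjun/person_script/get_gsea_old.py"


def build_gsea_command(options):
    """Return the argv list for get_gsea_old.py from a dict of long options.

    Hypothetical helper for illustration; it does not validate option names.
    """
    argv = ["python", SCRIPT]
    for name, value in options.items():
        # A list value (e.g. several gmt files or compare groups) is joined
        # with commas, matching the <go.gmt,kegg.gmt> form in the usage text.
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        argv += ["--" + name, str(value)]
    return argv


if __name__ == "__main__":
    test_dir = "/TJPROJ6/RNA_SH/personal_dir/wangjun/person_script/gsea_test"
    cmd = build_gsea_command({
        "outdir": test_dir,
        "condition": test_dir + "/condition.xls",
        "compare": ["HNK_20uMvsControl", "HNK_100uMvsControl"],
        "pjtype": "med",
        "plot_num": 100,
        "svg2pdf": "true",
    })
    # Print a copy-pasteable, shell-quoted version of the command.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True), which bypasses shell line continuation entirely.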

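Note: by default GSEA writes its SVG plots gzip-compressed and links them from its HTML output. Once they are decompressed and converted to PDF, the plots are plain svg files and are no longer linked from the report; this can be improved later if needed. A normal result contains png, svg, and pdf files, and the HTML report links only the png files.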
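
The decompression step mentioned in the note can be sketched with the Python standard library. This is an illustration only, not the code used by the scripts above: the function name gunzip_svgs is hypothetical, the ".svg.gz" file suffix is an assumption, and the SVG-to-PDF conversion is not covered.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_svgs(plot_dir):
    """Decompress every *.svg.gz under plot_dir to a plain .svg next to it.

    Hypothetical helper; the .svg.gz originals are left in place.
    """
    written = []
    for gz_path in sorted(Path(plot_dir).rglob("*.svg.gz")):
        svg_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
        with gzip.open(gz_path, "rb") as src, open(svg_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(svg_path)
    return written
```

Calling gunzip_svgs on a result directory returns the list of .svg paths it wrote, which can then be passed to whichever SVG-to-PDF converter is in use.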