
Transposon, TE, LTR, and ERV Analysis

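Transposable elements (TEs), also called "jumping genes" or transposons, are DNA segments that can move from one position in the genome to another. TEs and their recognizable remnants are found in both prokaryotes and eukaryotes, and their effects on the genome and transcriptome at multiple levels continue to be reported across species. These findings have changed how researchers view TEs, and TE research has become an active topic in the post-genomic era.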

TE quantification and differential analysis pipeline:

/TJPROJ6/RNA_SH/script_dir/TEs/TE_pipline.py -h
usage: TE_pipline.py [-h] {RepEnrich,RSEM,TEtranscripts} ...

use RepeatMasker analysis TE and quant

positional arguments:
  {RepEnrich,RSEM,TEtranscripts}
                        sub-command help
    RepEnrich           RepEnrich quant pipline
    RSEM                RSEM quant pipline
    TEtranscripts       TEtranscripts quant pipline

optional arguments:
  -h, --help            show this help message and exit

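The pipeline wraps three quantification tools: RepEnrich, RSEM, and TEtranscripts. RepEnrich and TEtranscripts are both published, literature-supported TE quantification tools; the RSEM workflow was assembled in-house to meet a researcher's request.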
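The help output above has the shape of a Python `argparse` parser with one sub-command per tool. The sketch below reproduces only that layout and is not the actual source of TE_pipline.py: the function name `build_parser` is hypothetical, and only the `-F` and `-GTF` options are modeled, with the required/optional status taken from the usage lines on this page.

```python
import argparse


def build_parser():
    """Hypothetical reconstruction of the sub-command layout shown in the help text."""
    parser = argparse.ArgumentParser(
        prog="TE_pipline.py",
        description="use RepeatMasker analysis TE and quant",
    )
    sub = parser.add_subparsers(dest="tool", required=True, help="sub-command help")
    for name in ("RepEnrich", "RSEM", "TEtranscripts"):
        p = sub.add_parser(name, help=f"{name} quant pipline")
        # Per the usage lines: -F is required for RepEnrich and RSEM,
        # optional for TEtranscripts, which requires -GTF instead.
        p.add_argument("-F", "--fa", required=(name != "TEtranscripts"),
                       help="the fa file")
        if name == "TEtranscripts":
            p.add_argument("-GTF", "--gtf", required=True,
                           help="the gene gtf file")
    return parser


# genome.fa is a placeholder file name used only for illustration.
args = build_parser().parse_args(["RSEM", "-F", "genome.fa"])
print(args.tool, args.fa)
```

Each sub-command carries its own option set, which is why `-h` has to be requested per tool, as in the sections that follow.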

RepEnrich software

/TJPROJ6/RNA_SH/script_dir/TEs/TE_pipline.py RepEnrich -h
usage: TE_pipline.py RepEnrich [-h] [-R RAW] [-C CLEAN] [-A {yes,no}]
                               [-RTC {fastp,ngqc}] -F FA [-SP SP] [-S SAMPLE]
                               [-S2G S2G] [-G GROUP] [-CD CONDITION]
                               [-CP COMPARE] [-PJ PADJ] [-PV PVALUE] [-FC FC]

optional arguments:
  -h, --help            show this help message and exit
  -R RAW, --raw RAW     the raw dir
  -C CLEAN, --clean CLEAN
                        the clean dir
  -A {yes,no}, --adapter {yes,no}
                        get adapter
  -RTC {fastp,ngqc}, --rawtoclean_soft {fastp,ngqc}
                        get clean soft ware
  -F FA, --fa FA        the fa file
  -SP SP, --sp SP       the RepeatMasker species, find /PUBLIC/software/public
                        /Repeat/RepeatMasker/Libraries/Species.txt
  -S SAMPLE, --sample SAMPLE
                        the sample name
  -S2G S2G, --s2g S2G   the relation for sample and group
  -G GROUP, --group GROUP
                        the group name
  -CD CONDITION, --condition CONDITION
                        the condition file
  -CP COMPARE, --compare COMPARE
                        the compare group name
  -PJ PADJ, --padj PADJ
                        the padj
  -PV PVALUE, --pvalue PVALUE
                        the pvalue
  -FC FC, --fc FC       the foldchange

Test results for this tool are in /TJPROJ6/RNA_SH/script_dir/TEs/RepEnrich2-master/result-RepEnrich2
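The `-PJ`, `-PV`, and `-FC` options set the cutoffs for the differential analysis, but this page does not say how the pipeline applies them. The sketch below shows one common way such cutoffs are combined and is an assumption throughout: the function name `passes_thresholds`, the column names `log2FoldChange`, `pvalue`, and `padj`, the reading of `fc` as a linear fold change, and the example rows are all hypothetical.

```python
import math


def passes_thresholds(row, padj=None, pvalue=None, fc=None):
    """Return True if a result row meets every cutoff that was supplied.

    Assumption: fc is a linear fold change, compared against |log2FoldChange|.
    """
    if padj is not None and not row["padj"] < padj:
        return False
    if pvalue is not None and not row["pvalue"] < pvalue:
        return False
    if fc is not None and abs(row["log2FoldChange"]) < math.log2(fc):
        return False
    return True


# Made-up rows for illustration only; not pipeline output.
rows = [
    {"id": "TE_a", "log2FoldChange": 2.0, "pvalue": 0.001, "padj": 0.01},
    {"id": "TE_b", "log2FoldChange": 0.3, "pvalue": 0.2, "padj": 0.6},
]
kept = [r["id"] for r in rows if passes_thresholds(r, padj=0.05, fc=2)]
print(kept)
```

Leaving a cutoff as `None` skips that filter, which mirrors the fact that all three options are optional on the command line.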

TEtranscripts software

/TJPROJ6/RNA_SH/script_dir/TEs/TE_pipline.py TEtranscripts -h
usage: TE_pipline.py TEtranscripts [-h] [-R RAW] [-C CLEAN] [-A {yes,no}]
                                   [-RTC {fastp,ngqc}] [-F FA] [-SP SP] -GTF
                                   GTF [-S SAMPLE] [-S2G S2G] [-G GROUP]
                                   [-CD CONDITION] [-CP COMPARE] [-PJ PADJ]
                                   [-PV PVALUE] [-FC FC]
                                   [-SD {no,forward,reverse}]
                                   [-M {uniq,multi}] [-TE TE_GTF] [-B BAM]

optional arguments:
  -h, --help            show this help message and exit
  -R RAW, --raw RAW     the raw dir
  -C CLEAN, --clean CLEAN
                        the clean dir
  -A {yes,no}, --adapter {yes,no}
                        get adapter
  -RTC {fastp,ngqc}, --rawtoclean_soft {fastp,ngqc}
                        get clean soft ware
  -F FA, --fa FA        the fa file
  -SP SP, --sp SP       the RepeatMasker species, find /PUBLIC/software/public
                        /Repeat/RepeatMasker/Libraries/Species.txt
  -GTF GTF, --gtf GTF   the gene gtf file
  -S SAMPLE, --sample SAMPLE
                        the sample name
  -S2G S2G, --s2g S2G   the relation for sample and group
  -G GROUP, --group GROUP
                        the group name
  -CD CONDITION, --condition CONDITION
                        the condition file
  -CP COMPARE, --compare COMPARE
                        the compare group name
  -PJ PADJ, --padj PADJ
                        the padj
  -PV PVALUE, --pvalue PVALUE
                        the pvalue
  -FC FC, --fc FC       the foldchange
  -SD {no,forward,reverse}, --strand {no,forward,reverse}
                        the strand
  -M {uniq,multi}, --mode {uniq,multi}
                        TE counting mode
  -TE TE_GTF, --TE_gtf TE_GTF
                        the TE gtf file
  -B BAM, --bam BAM     the bam dir

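Test results for this tool are in /TJPROJ6/RNA_SH/script_dir/TEs/TEtranscripts-master/result-TEtranscripts.
TE GTF files matched to this tool are available for some species. If the genome version matches, the TE prediction step can be skipped. The files are in /TJPROJ6/RNA_SH/script_dir/TEs/TEtranscripts-master/database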
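The choice between a prebuilt TE GTF and running the prediction step can be sketched as a small helper that assembles the TEtranscripts command line. This is an assumption-laden illustration: the helper `tetranscripts_cmd` is hypothetical, all file names are placeholders, and it assumes that passing `-TE` is how a prebuilt annotation is supplied and that `-F` is needed when prediction has to run. Only flags listed in the help text above are used, and the command is built but not executed.

```python
PIPELINE = "/TJPROJ6/RNA_SH/script_dir/TEs/TE_pipline.py"


def tetranscripts_cmd(gene_gtf, te_gtf=None, fa=None, sp=None,
                      mode="multi", strand="no"):
    """Build (but do not run) a TEtranscripts sub-command as an argument list."""
    cmd = [PIPELINE, "TEtranscripts", "-GTF", gene_gtf, "-M", mode, "-SD", strand]
    if te_gtf:
        # Assumption: a prebuilt TE annotation makes the prediction step unnecessary.
        cmd += ["-TE", te_gtf]
    else:
        # Assumption: without a TE GTF, the genome FASTA is needed for prediction.
        if not fa:
            raise ValueError("either te_gtf or fa must be provided")
        cmd += ["-F", fa]
        if sp:
            cmd += ["-SP", sp]
    return cmd


# Placeholder file names for illustration only.
print(" ".join(tetranscripts_cmd("genes.gtf", te_gtf="species_TE.gtf")))
```

Returning a list rather than a single string keeps the result directly usable with `subprocess.run` without shell quoting concerns.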

RSEM software

/TJPROJ6/RNA_SH/script_dir/TEs/TE_pipline.py RSEM -h
usage: TE_pipline.py RSEM [-h] [-R RAW] [-C CLEAN] [-A {yes,no}]
                          [-RTC {fastp,ngqc}] -F FA [-SP SP] [-S SAMPLE]
                          [-S2G S2G] [-G GROUP] [-CD CONDITION] [-CP COMPARE]
                          [-PJ PADJ] [-PV PVALUE] [-FC FC] [-SS {0,0.5,1}]

optional arguments:
  -h, --help            show this help message and exit
  -R RAW, --raw RAW     the raw dir
  -C CLEAN, --clean CLEAN
                        the clean dir
  -A {yes,no}, --adapter {yes,no}
                        get adapter
  -RTC {fastp,ngqc}, --rawtoclean_soft {fastp,ngqc}
                        get clean soft ware
  -F FA, --fa FA        the fa file
  -SP SP, --sp SP       the RepeatMasker species, find /PUBLIC/software/public
                        /Repeat/RepeatMasker/Libraries/Species.txt
  -S SAMPLE, --sample SAMPLE
                        the sample name
  -S2G S2G, --s2g S2G   the relation for sample and group
  -G GROUP, --group GROUP
                        the group name
  -CD CONDITION, --condition CONDITION
                        the condition file
  -CP COMPARE, --compare COMPARE
                        the compare group name
  -PJ PADJ, --padj PADJ
                        the padj
  -PV PVALUE, --pvalue PVALUE
                        the pvalue
  -FC FC, --fc FC       the foldchange
  -SS {0,0.5,1}, --ss {0,0.5,1}
                        for RSEM: fr-unstranded:0.5,fr-firststrand:1,fr-
                        secondstrand:0

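Test results are in /TJPROJ6/RNA_SH/script_dir/TEs/RSEM-master/result-example. They contain only the TE prediction results; the quantification results are consistent with the counts from the reference-free (de novo) pipeline.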

个性化条目/转座子_te_ltr_erv分析.txt · Last modified: 2023/05/19 05:04 by zhangxin