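===== Differences between toplevel, sm, and rm in reference genomes =====

dna - unmasked genomic DNA sequences.

dna_rm - masked genomic DNA. Interspersed repeats and low-complexity regions are detected with the RepeatMasker tool and masked by replacing repeats with 'N's.

dna_sm - soft-masked genomic DNA. All repeats and low-complexity regions have been replaced with lowercased versions of their nucleic bases.

In short, the toplevel file is the unmasked sequence as originally obtained; the N bases it contains are positions that could not be resolved during sequencing. The rm version uses N bases to mark repeats and low-complexity regions. The sm version uses lowercase letters to mark the same regions.

So using the rm version may lose some base information, but the number of multi-mapped reads may go down (not tested; this is a guess).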
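The difference between the three variants can be illustrated with a minimal Python sketch. The function names `base_composition` and `hard_mask` and the toy sequence are hypothetical, made up for illustration only; they are not part of any Ensembl or RepeatMasker tooling. The sketch assumes that lowercase letters mean soft-masked bases and that N means either an unresolved or a hard-masked base.

```python
from collections import Counter


def base_composition(seq: str) -> Counter:
    """Count unmasked (uppercase), soft-masked (lowercase) and N bases."""
    counts = Counter()
    for base in seq:
        if base in "Nn":
            counts["n"] += 1            # unresolved or hard-masked base
        elif base.islower():
            counts["soft_masked"] += 1  # repeat marked by lowercase (sm)
        else:
            counts["unmasked"] += 1     # ordinary uppercase base
    return counts


def hard_mask(seq: str) -> str:
    """Convert a soft-masked sequence into its hard-masked (rm-like) form."""
    return "".join("N" if base.islower() else base for base in seq)


# Toy soft-masked sequence (invented): 8 uppercase, 8 lowercase, 4 N.
toy = "ACGTacgtacgtNNNNACGT"
print(base_composition(toy))
print(hard_mask(toy))
```

Going from the soft-masked to the hard-masked form only needs the sequence itself, whereas the reverse is impossible: once a repeat has been replaced by N, the original bases are gone. This is the loss of base information mentioned above.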