dna - unmasked genomic DNA sequences.
dna_rm - masked genomic DNA. Interspersed repeats and low complexity regions are detected with the RepeatMasker tool and masked by replacing repeats with 'N's.
dna_sm - soft-masked genomic DNA. All repeats and low complexity regions have been replaced with lowercased versions of their nucleic base.
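In short: the unmasked toplevel file holds the sequence as originally obtained, and any N bases in it are positions that could not be resolved during sequencing. The rm version marks repeats and low complexity regions by replacing them with N. The sm version marks the same regions by writing them in lowercase letters.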
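
The relationship between the three variants can be sketched on a toy string. This is a minimal illustration, not part of any Ensembl or RepeatMasker tooling: the function names `hard_mask`, `unmask`, and `soft_masked_fraction` are hypothetical, and the example sequence is made up.

```python
def hard_mask(seq: str) -> str:
    """Turn a soft-masked sequence (dna_sm style) into a hard-masked one
    (dna_rm style) by replacing every lowercase base with 'N'."""
    return "".join("N" if base.islower() else base for base in seq)


def unmask(seq: str) -> str:
    """Recover the unmasked sequence (dna style) from a soft-masked one.
    Soft-masking only changes letter case, so upper-casing undoes it."""
    return seq.upper()


def soft_masked_fraction(seq: str) -> float:
    """Fraction of bases written in lowercase, i.e. flagged as repeats
    or low complexity regions in a soft-masked sequence."""
    if not seq:
        return 0.0
    return sum(base.islower() for base in seq) / len(seq)


# Made-up soft-masked fragment: lowercase = masked region,
# uppercase N = base that was never resolved in the assembly.
soft = "ACGTacgtNNGG"

print(hard_mask(soft))
print(unmask(soft))
print(soft_masked_fraction(soft))
```

The sketch also shows why the soft-masked file is the most flexible of the three: both other forms can be derived from it, whereas a hard-masked sequence cannot be converted back, because the original bases under the N's are gone.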