samtools和picard统计的话都会强制忽略dup标记的reads,统计结果与igv软件查看的bam文件不符
需求统计bam文件的各个位点真实碱基分布情况
使用软件: bam-readcount,统计bam文件得到真实的碱基分布
1 :参考基因组序列,fasta格式即可 2 :bwa比对的bam结果文件,比对使用的参考基因组需要保持一致
示例路径: /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/example
天津集群脚本以及流程:
/bin/rm -rf /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/example/95-04___9501full.txt mkdir -p /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/example/95-04___9501full.txt ln -sf /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/3.pag/X101SC23065193-Z01-J009.20231009.pagpipline.zidingyi9batchs/ref_9501full/9501full.txt /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/example/95-04___9501full.txt genome=`ls /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/example/95-04___9501full.txt/*` /PUBLIC/source/HW/Disease/bin/samtools faidx $genome awk '{print $1, 1, $2}' ${genome}.fai | sed -e 's/ /\t/g' > ${genome}.fai.bed python /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/bam-readcount-statbam.py --fa $genome --bam /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/3.pag/X101SC23065193-Z01-J009.20231009.pagpipline.zidingyi9batchs/ref_9501full/Mapping/95-04/95-04.final.bam --bed ${genome}.fai.bed --outfile /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/example/95-04___9501full.txt/bam-readcount-stat.tmp python /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/stat_tmpfile.py --infile /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/example/95-04___9501full.txt/bam-readcount-stat.tmp --outfile /TJPROJ6/AFS_RESEQ/Proj/liangjifeng/software/jiftools/bam-readcount/example/95-04___9501full.txt/bam-readcount-stat.xls