计算panel bed编码区长度
Bash |
---|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30 | #!/bin/bash
#@File : run.sh
#@Time : 2023/11/21 09:32:17
#@Author : biolxy
#@Version : 1.0
#@Contact : biolxy@aliyun.com
#@Desc : None
inputbed=$1
# export PATH=/data/biogonco/lixy/bedtools/bedtools2/bin:$PATH # v2.30.1 不行
export PATH=/data/bioinfo_project/bioinfo_miniconda3/bin:$PATH # v2.30.0 不行
# /data/biogonco/lixy/bedtools/bedtools2/bin/bedtools
bedtools merge -i ${inputbed} > merge.bed
bedtools summary -i merge.bed -g /mnt/nas_001/project/oncodemo/genome_and_annotation/hg19/gatk/b37/chrom.sizes > merge.bed.summary # 统计的是 bed 区间的长度
# 去 gencode 下载 gencode.v41.annotation.gff3
# wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_44/GRCh37_mapping/gencode.v44lift37.annotation.gff3.gz
# /mnt/nas_001/project/oncodemo/genome_and_annotation/genome/gffread-0.12.7.Linux_x86_64/gffread gencode.v44lift37.annotation.gff3 -T -o gencode.v44lift37.annotation.bed
# awk '$3 == "CDS"' gencode.v44lift37.annotation.bed | sed 's/^chr//g' > coding_CDS.bed
# coding_CDS.bed 为编码区bed
bedtools intersect -a merge.bed -b /mnt/nas_001/project/oncodemo/genome_and_annotation/genome/coding_CDS.bed | sort -k1,1 -k2,2n | bedtools merge -i - > panel.cds.bed
# panel.cds.bed 为 pane 的 编码区 bed
bedtools summary -i panel.cds.bed -g /mnt/nas_001/project/oncodemo/genome_and_annotation/hg19/gatk/b37/chrom.sizes > panel.cds.summary
|
注意
要注意bedtools的版本问题,我测试的 v2.30.0可以,v2.30.1 不行