
Tumor genomics

PyClone installation: an error and its fix

1. PyClone: installation, error, and fix

1.1 Installation
Bash
conda install PyClone

Then run the analysis with the following command:

Bash
PyClone run_analysis_pipeline --in_files SRR385938.tsv SRR385939.tsv SRR385940.tsv SRR385941.tsv --working_dir ./tt
1.2 Error:
Bash
Traceback (most recent call last):
  File "/home/lixy/miniconda2/envs/ngs2/bin/PyClone", line 11, in <module>
    load_entry_point('PyClone==0.13.1', 'console_scripts', 'PyClone')()
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/pyclone/cli.py", line 78, in main
    args.func(args)
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/pyclone/run.py", line 97, in run_analysis_pipeline
    args.thin
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/pyclone/run.py", line 409, in _cluster_plot
    samples=samples,
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/pyclone/post_process/plot/clusters.py", line 58, in density_plot
    fig = pp.figure(figsize=(width, height))
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/matplotlib/pyplot.py", line 534, in figure
    **kwargs)
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 170, in new_figure_manager
    return cls.new_figure_manager_given_figure(num, fig)
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 176, in new_figure_manager_given_figure
    canvas = cls.FigureCanvas(figure)
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/matplotlib/backends/backend_qt5agg.py", line 35, in __init__
    super(FigureCanvasQTAggBase, self).__init__(figure=figure)
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/matplotlib/backends/backend_qt5.py", line 235, in __init__
    _create_qApp()
  File "/home/lixy/miniconda2/envs/ngs2/lib/python2.7/site-packages/matplotlib/backends/backend_qt5.py", line 122, in _create_qApp
    raise RuntimeError('Invalid DISPLAY variable')
RuntimeError: Invalid DISPLAY variable
1.3 Fix:

aroth85:

Please check user group (https://groups.google.com/forum/#!forum/pyclone-user-group) for threads on this.

There is another bit in the links about changing your "matplotlibrc" file, instead of the code. This will globally set the backend for all Python scripts to Agg which is what you likely want for a remote server. You should not need to edit the code if you set this file up correctly. See the matplotlib help page with search terms matplotlibrc for details.

In detail:

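  1. The error occurs because matplotlib's configuration file is not set up properly: the traceback shows the Qt5Agg backend being loaded, and that backend needs a valid DISPLAY, which a remote server usually does not have.
  2. Locate the configuration file: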

Python
>>> import matplotlib
>>> matplotlib.matplotlib_fname()
'/home/foo/.config/matplotlib/matplotlibrc'
  3. Edit your matplotlibrc (line 44):

Bash
vim /home/foo/.config/matplotlib/matplotlibrc
# in the file, set the backend entry to:
backend      : Agg

Run the command again, and it succeeds!
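As a sketch of the same edit done programmatically, the Python snippet below sets the `backend` entry of a matplotlibrc file using only the standard library. The helper name `set_backend` and the temporary demo file are hypothetical and chosen for illustration; on a real system the path would be whatever `matplotlib.matplotlib_fname()` reports.

```python
import re
import tempfile
from pathlib import Path


def set_backend(rc_path, backend="Agg"):
    """Set the `backend` entry of a matplotlibrc file, adding it if absent.

    Hypothetical helper for illustration; it is not part of matplotlib.
    """
    path = Path(rc_path)
    lines = path.read_text().splitlines() if path.exists() else []
    # Match only an active (uncommented) `backend : ...` entry.
    # `backend_fallback` does not match, because `\s*:` must follow the key.
    pattern = re.compile(r"^\s*backend\s*:")
    new_line = "backend      : " + backend
    found = False
    for idx, line in enumerate(lines):
        if pattern.match(line):
            lines[idx] = new_line
            found = True
    if not found:
        lines.append(new_line)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(lines) + "\n")
    return path


# Demo on a throwaway file so the real configuration is not touched.
demo_rc = Path(tempfile.mkdtemp()) / "matplotlibrc"
demo_rc.write_text("backend      : Qt5Agg\nlines.linewidth : 2\n")
set_backend(demo_rc)
print(demo_rc.read_text())
```

For a one-off run, exporting the `MPLBACKEND=Agg` environment variable before calling PyClone should have the same effect without editing any file, in matplotlib versions that support that variable.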

A first look at mutational signatures

What is a mutational signature

https://cancer.sanger.ac.uk/cosmic/signatures

How mutational signatures work

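The human reference genome sequence is fixed, so the frequency with which any run of three consecutive bases (a trinucleotide) occurs in it can be computed.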

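  • There are 64 such triplets of consecutive bases (ATG, TCG, and so on): 4 * 4 * 4 = 64, the same as the number of codons
    • A signature, however, is defined over 96 mutation types: 6 substitution classes (C>A, C>G, C>T, T>A, T>C, T>G) * 4 possible 5' neighbors * 4 possible 3' neighbors = 96
  • The background frequencies in the reference genome are also fixed; drawn as a bar chart over these 96 types, they give the original (baseline) signature
    • The figure shows one signature computed from COSMIC data, Signature 1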

[Figure: signature]

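  • In patients who have smoked for a long time, mutations accumulate, so the signature computed from their sequencing data differs from the original signature
  • Aggregating a large amount of such data yields a signature that can be regarded as the signature of smoking patients; COSMIC provides 30 signatures, Signature 1 to Signature 30
  • A new signature obtained from sequencing (signature-new) can then be compared with Signatures 1 to 30 in order to interpret it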
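To make the 96 mutation types concrete, here is a minimal Python sketch that enumerates them and assigns a single substitution to its type. It assumes the usual convention of reporting every substitution from the pyrimidine (C or T) strand, so a mutation whose reference base is A or G is reverse-complemented first. The function names `all_channels` and `channel` are hypothetical, not taken from any package.

```python
from itertools import product

BASES = "ACGT"
# The six substitution classes, written from the pyrimidine strand.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def all_channels():
    """List the 96 mutation types, e.g. 'A[C>A]A'."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


def channel(context, alt):
    """Map a substitution to its mutation type.

    context: the three reference bases centered on the mutated base.
    alt: the base observed at the central position.
    """
    context, alt = context.upper(), alt.upper()
    if context[1] in "AG":
        # Purine reference base: switch to the opposite strand.
        context = "".join(COMPLEMENT[b] for b in reversed(context))
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


print(len(list(product(BASES, repeat=3))))  # 64 trinucleotides
print(len(all_channels()))                  # 96 mutation types
print(channel("TGA", "A"))                  # T[C>T]A
```

Counting how many mutations of a sample fall into each of these 96 types gives the bar chart that is compared against the reference signatures.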

What mutational signatures are used for

Prognosis and diagnosis

How to extract mutational signatures

In R, two packages can do this:

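  • maftools
    • This package is fairly comprehensive and is not limited to signatures
  • deconstructSigs
    • It only covers signature analysis and plotting, but it can extract the weight of each signature; an example follows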

Installing deconstructSigs

R
source("http://bioconductor.org/biocLite.R")
biocLite("")

Input data format

Extract the variants from each patient's VCF file and arrange them in the format deconstructSigs expects:

Text Only
Sample  chr      pos ref alt
1 chr1   905907   A   T
1 chr1  1192480   C   A
1 chr1  1854885   G   C
1 chr1  9713992   G   A
1 chr1 12908093   C   A
1 chr1 17257855   C   T
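
  • Note that the chr column in decon.maf must read chr1, not Chr1
  • The columns are tab-separated (sep = "\t")
  • Data from multiple patients are concatenated (cat) into one file and told apart by the sample ID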
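
The conversion from VCF to this table could be sketched in Python as follows. This is an illustration, not the script used for these notes: the function names `vcf_to_rows` and `write_table` are hypothetical, the input is assumed to follow the standard VCF column order (CHROM, POS, ID, REF, ALT, ...), and keeping only single-base substitutions is my assumption. The demo records are toy data.

```python
import csv
import io


def vcf_to_rows(vcf_lines, sample_id):
    """Turn VCF records into (Sample, chr, pos, ref, alt) rows."""
    rows = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], fields[1], fields[3], fields[4]
        # Normalize the chromosome name to the lowercase "chr" prefix.
        name = chrom[3:] if chrom.lower().startswith("chr") else chrom
        chrom = "chr" + name
        for alt in alts.split(","):
            # Keep single-base substitutions only (assumption).
            if len(ref) == 1 and len(alt) == 1 and ref in "ACGT" and alt in "ACGT":
                rows.append((sample_id, chrom, int(pos), ref, alt))
    return rows


def write_table(rows, handle):
    """Write the rows as a tab-separated table with a header line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sample", "chr", "pos", "ref", "alt"])
    writer.writerows(rows)


# Toy VCF content: two substitutions and one deletion (which is skipped).
vcf_text = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t905907\t.\tA\tT\t.\tPASS\t.\n",
    "Chr1\t1192480\t.\tC\tA\t.\tPASS\t.\n",
    "chr1\t1854885\t.\tGA\tG\t.\tPASS\t.\n",
]
buffer = io.StringIO()
write_table(vcf_to_rows(vcf_text, "XH01"), buffer)
print(buffer.getvalue(), end="")
```

Calling `vcf_to_rows` once per patient with a different `sample_id` and writing all rows to one file gives the concatenated table described above.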

Saving plots in batch

Here 'XH01', 'XH11', 'XH12', 'XH19', 'XH28', 'XH31', 'XH33', 'XH34', 'XH36', 'XH38', 'XH39', 'XH40', 'XH43', 'XH44', 'XH45', 'XH47', 'XH49', 'XH51', 'XH52', 'XH54', 'XH56', 'XH58' are the sample IDs in my data.

R
suppressPackageStartupMessages(library("deconstructSigs"))
# Build the per-sample mutation-context matrix from the tab-separated table
sigs.input <- mut.to.sigs.input(mut.ref = "decon.maf", sample.id = "Sample", chr = "chr", pos = "pos", ref = "ref", alt = "alt")
l <- list('XH01', 'XH11', 'XH12', 'XH19', 'XH28', 'XH31', 'XH33', 'XH34', 'XH36', 'XH38', 'XH39', 'XH40', 'XH43', 'XH44', 'XH45', 'XH47', 'XH49', 'XH51', 'XH52', 'XH54', 'XH56', 'XH58')
# All plots go into a single multi-page PDF
pdf("plot.cosmic.pdf")
for (i in l) {
    test <- whichSignatures(tumor.ref = sigs.input, signatures.ref = signatures.cosmic, sample.id = i, signature.cutoff = 0, contexts.needed = TRUE, tri.counts.method = 'default')
    plotSignatures(test)
    makePie(test)
}
dev.off()

  • One drawback of this approach: the title font in the output plots is too large, so the title is cut off when many signatures are listed

Writing out the weights of each signature in batch
R
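# Append one row of weights per sample; no header row is written,
# and re-running appends to the existing file
for (i in l) {
    test <- whichSignatures(tumor.ref = sigs.input, signatures.ref = signatures.cosmic, sample.id = i, signature.cutoff = 0, contexts.needed = TRUE, tri.counts.method = 'default')
    res <- test$weights
    write.table(res, "cosmic.weight.txt", sep = "\t", col.names = FALSE, row.names = TRUE, quote = FALSE, append = TRUE)
}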
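
  • You can change signatures.ref = signatures.cosmic to signatures.ref = signatures.nature2013; these are signature sets computed from two different data sources. We compare the signature computed from our own data against the signatures in the chosen set in order to annotate our data
    • The result has the form: signature_sample1 = signature1 * weight1 + signature2 * weight2 + ··· + signature30 * weight30
    • The computed signature is then annotated according to the relative size of the weights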
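
The weighted-sum formula can be illustrated with a small Python sketch. It uses two made-up signatures over four channels instead of the real 30 signatures over 96 channels, and the helper names `reconstruct` and `rank_by_weight` are hypothetical; it only shows how the weights combine the reference signatures, not how deconstructSigs estimates them.

```python
def reconstruct(signatures, weights):
    """Return sum_k weights[k] * signatures[k], channel by channel."""
    n_channels = len(next(iter(signatures.values())))
    profile = [0.0] * n_channels
    for name, weight in weights.items():
        for idx, value in enumerate(signatures[name]):
            profile[idx] += weight * value
    return profile


def rank_by_weight(weights):
    """Sort signatures from largest to smallest weight."""
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)


# Toy reference signatures (each sums to 1) and toy weights; not real data.
toy_signatures = {
    "signature1": [0.5, 0.5, 0.0, 0.0],
    "signature2": [0.0, 0.0, 0.25, 0.75],
}
toy_weights = {"signature1": 0.75, "signature2": 0.25}

profile = reconstruct(toy_signatures, toy_weights)
print(profile)                             # [0.375, 0.375, 0.0625, 0.1875]
print(rank_by_weight(toy_weights)[0][0])   # signature1
```

In this toy case the sample would be annotated mainly with signature1, because its weight is three times that of signature2.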