- Bioinformatics with Python Cookbook
- Tiago Antao
- 2021-06-10 19:01:46
There's more...
Although it's impossible to discuss every variation of sequencer output, paired-end reads are worth mentioning because they are common and require a different processing approach. With paired-end sequencing, both ends of a DNA fragment are sequenced with a gap in the middle (called the insert). In this case, two files will be produced: X_1.FASTQ and X_2.FASTQ. Both files will have the same order and exactly the same number of sequences. The first sequence in X_1 pairs with the first sequence in X_2, and so on. With regard to the programming technique, if you want to keep the pairing information, you might do something like this:
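import gzip
from Bio import SeqIO

# Open both gzipped FASTQ files in text mode
f1 = gzip.open('X_1.filt.fastq.gz', 'rt', encoding='utf-8')
f2 = gzip.open('X_2.filt.fastq.gz', 'rt', encoding='utf-8')
recs1 = SeqIO.parse(f1, 'fastq')
recs2 = SeqIO.parse(f2, 'fastq')
cnt = 0
# zip yields one (rec1, rec2) pair per step, in file order
for rec1, rec2 in zip(recs1, recs2):
    cnt += 1
print('Number of pairs: %d' % cnt)
f1.close()
f2.close()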
The preceding code reads all pairs in order and simply counts them. You will probably want to do something more, but this illustrates an idiom based on the Python zip function, which lets you iterate through both files simultaneously. Remember to replace X with your FASTQ prefix.
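One caveat of zip is that it stops silently at the end of the shorter input, so a truncated or mismatched file would go unnoticed. The following is a minimal sketch of one way to guard against that, not code from the recipe. It uses itertools.zip_longest to detect unequal read counts and compares read names within each pair. To keep it dependency-free, a plain four-line reader stands in for SeqIO.parse. The helper names (read_fastq, base_name, count_pairs) are hypothetical, and the assumption that mates share a name apart from an optional /1 or /2 suffix may not hold for your data.

```python
import gzip
from itertools import zip_longest


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ handle."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file (or a trailing blank line)
        seq = handle.readline().rstrip('\n')
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip('\n')
        # Keep only the read name: drop '@' and any description after a space
        name = header[1:].split()[0]
        yield name, seq, qual


def base_name(name):
    """Strip a trailing /1 or /2 mate suffix, if present (assumed convention)."""
    if name.endswith(('/1', '/2')):
        return name[:-2]
    return name


def count_pairs(path1, path2):
    """Count read pairs, raising ValueError if the two files fall out of sync."""
    cnt = 0
    with gzip.open(path1, 'rt', encoding='utf-8') as f1, \
            gzip.open(path2, 'rt', encoding='utf-8') as f2:
        # zip_longest pads the shorter input with None instead of stopping early
        for rec1, rec2 in zip_longest(read_fastq(f1), read_fastq(f2)):
            if rec1 is None or rec2 is None:
                raise ValueError('Files have different numbers of reads')
            if base_name(rec1[0]) != base_name(rec2[0]):
                raise ValueError('Mismatched pair: %s vs %s' % (rec1[0], rec2[0]))
            cnt += 1
    return cnt
```

Calling count_pairs('X_1.filt.fastq.gz', 'X_2.filt.fastq.gz') would then return the number of pairs or fail loudly at the first inconsistency, again with X replaced by your FASTQ prefix.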
Finally, if you are sequencing human genomes, you may want to use sequencing data from Complete Genomics. In this case, read the "There's more..." section in the next recipe, where we briefly discuss Complete Genomics data.