
Processing NGS data with HTSeq

HTSeq (https://htseq.readthedocs.io) is an alternative library for processing NGS data. Most of the functionality that HTSeq provides is also available in other libraries covered in this book, but you should be aware of it as another way of working with NGS data. HTSeq supports, among others, the FASTA, FASTQ, SAM (via pysam), VCF, GFF, and Browser Extensible Data (BED) file formats. It also includes a set of abstractions for processing (mapped) genomic data, covering concepts such as genomic positions, intervals, and alignments. A complete examination of the library's features is beyond our scope, so we will concentrate on a small subset. We will also take this opportunity to introduce the BED file format.

The BED format is used to specify features for annotation tracks. It has many uses, but a common one is loading BED files into genome browsers to visualize features. Each line contains at least a position (chromosome, start, and end) and may also carry optional fields such as name or strand. Full details about the format can be found at https://genome.ucsc.edu/FAQ/FAQformat.html#format1.
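To make the line layout concrete, here is a minimal sketch of reading BED records using only the Python standard library. It is not HTSeq's own reader; the names `BedRecord` and `parse_bed` and the sample data are hypothetical and exist only for illustration. The sketch assumes whitespace-separated fields, treats lines beginning with `track`, `browser`, or `#` as headers, and keeps only the first six columns.

```python
import io
from typing import Iterator, NamedTuple, Optional, TextIO


class BedRecord(NamedTuple):
    """One BED feature (hypothetical container, first six columns only)."""
    chrom: str
    start: int                    # 0-based start, as defined by the BED format
    end: int                      # end position, not included in the feature
    name: Optional[str] = None
    score: Optional[str] = None   # kept as text; not interpreted here
    strand: Optional[str] = None


def parse_bed(handle: TextIO) -> Iterator[BedRecord]:
    """Yield a BedRecord for every feature line in an open BED file."""
    for line in handle:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if not fields or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        if len(fields) < 3:
            raise ValueError(f"BED line needs at least 3 fields: {line!r}")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Columns 4 to 6 (name, score, strand) are optional.
        yield BedRecord(chrom, start, end, *fields[3:6])


# Made-up sample data, used only to exercise the parser.
sample = io.StringIO(
    'track name="example" description="hypothetical features"\n'
    "chr1\t100\t200\tfeatA\t0\t+\n"
    "chr1\t150\t300\tfeatB\t0\t-\n"
    "chr2\t0\t50\n"
)
records = list(parse_bed(sample))
for rec in records:
    print(rec.chrom, rec.start, rec.end, rec.name, rec.strand)
```

Because BED start coordinates are 0-based and the end is exclusive, the length of a feature is simply `end - start`. HTSeq also ships a BED reader of its own (`HTSeq.BED_Reader`, to the best of our knowledge), which yields the library's feature objects rather than plain tuples.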
