DOI: 10.1371/journal.pone.0130821
Title: QuorUM: An Error Corrector for Illumina Reads
Authors: Guillaume Marçais; James A. Yorke; Aleksey Zimin
Journal: PLOS ONE
ISSN: 1932-6203
Publication year: 2015
Publication date: 2015-06-17
Volume: 10, Issue: 6
Language: English
Keywords: Mammalian genomics; Staphylococcus; Corals; Sequence assembly tools; Genome sequencing; Genomics statistics; Computer software; Genome complexity
Abstract:
Motivation: Illumina sequencing data can provide high coverage of a genome by relatively short (most often 100 bp to 150 bp) reads at a low cost. Even with a low (advertised 1%) error rate, 100× coverage Illumina data on average has an error in some read at every base in the genome. These errors make handling the data more complicated because they result in a large number of low-count erroneous k-mers in the reads. However, there is enough information in the reads to correct most of the sequencing errors, thus making subsequent use of the data (e.g. for mapping or assembly) easier. Here we use the term "error correction" to denote the reduction in errors due to both changes in individual bases and trimming of unusable sequence. We developed an error correction software package called QuorUM. QuorUM is mainly aimed at error correcting Illumina reads for subsequent assembly. It is designed around the novel idea of minimizing the number of distinct erroneous k-mers in the output reads and preserving the most true k-mers, and we introduce a composite statistic π that measures how successful we are at achieving this dual goal. We evaluate the performance of QuorUM by correcting actual Illumina reads from genomes for which a reference assembly is available.
Results: We produce trimmed and error-corrected reads that result in assemblies with longer contigs and fewer errors. We compared QuorUM against several published error correctors and found that it is the best performer in most metrics we use. QuorUM is efficiently implemented, making use of current multi-core computing architectures, and it is suitable for large data sets (1 billion bases checked and corrected per day per core). We also demonstrate that a third-party assembler (SOAPdenovo) benefits significantly from using QuorUM error-corrected reads. QuorUM error-corrected reads result in a factor of 1.1 to 4 improvement in N50 contig size compared to using the original reads with SOAPdenovo for the data sets investigated.
Availability: QuorUM is distributed as an independent software package and as a module of the MaSuRCA assembly software. Both are available under the GPL open source license at http://www.genome.umd.edu.
Contact: gmarcais@umd.edu
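The quantities the abstract relies on can be illustrated with a small sketch. This is not QuorUM's implementation and it does not compute the statistic π, which the abstract does not define. The function names, the toy genome, and the toy reads below are all hypothetical, and the N50 function uses the common definition (the length of the contig at which the longest contigs first cover at least half of the total assembled length), which is assumed here rather than taken from the paper.

```python
from collections import Counter


def expected_errors_per_position(coverage, error_rate):
    """Expected number of erroneous read bases covering one genome position."""
    return coverage * error_rate


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def n50(contig_lengths):
    """Length of the contig at which the longest contigs reach half the total."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Toy data (hypothetical): five error-free reads and one read with a single
# substitution (A -> T at 0-based index 8).
genome = "ACGTTGCAAGGCTTAC"
reads = [genome] * 5 + ["ACGTTGCATGGCTTAC"]

k = 4
counts = count_kmers(reads, k)
true_kmers = set(count_kmers([genome], k))

# k-mers seen in the reads but absent from the reference are erroneous;
# in this toy example each of them occurs exactly once.
erroneous = {kmer: c for kmer, c in counts.items() if kmer not in true_kmers}

if __name__ == "__main__":
    print("expected errors per position:", expected_errors_per_position(100, 0.01))
    print("erroneous k-mers and counts:", erroneous)
    print("N50 of [10, 20, 30, 40]:", n50([10, 20, 30, 40]))
```

In the toy data a single substitution creates k = 4 new k-mers, each seen once, while the true k-mers are seen five or six times. That gap in counts is the signal the abstract refers to as "low-count erroneous k-mers".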
URL: http://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0130821&type=printable
Resource type: Journal article
Identifier: http://119.78.100.158/handle/2HF3EXSE/21213
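Appears in Collections:
Reconstruction of Past Global Change
Impacts, Adaptation and Vulnerability
Science Plans and Planning
Climate Change and Strategy
International Research Programs on Global Change
Climate Mitigation and Adaptation
Climate Change Facts and Impacts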

Files in This Item:
File Name: journal.pone.0130821.PDF
File Size: 436KB
Content Type: Journal article
Version: Author's accepted manuscript
Access: Open access

Author affiliation: IPST, University of Maryland, College Park, MD, USA (all three authors)

Recommended Citation:
Guillaume Marçais, James A. Yorke, Aleksey Zimin. QuorUM: An Error Corrector for Illumina Reads[J]. PLOS ONE, 2015, 10(6)

Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.