Author(s) |
Brankovic, Ljiljana
Iliopoulos, Costas S
Kundu, Ritu
Mohamad, Manal
Pissis, Solon P
Vayani, Fatima
|
Publication Date |
2016-01-04
|
Abstract |
DNA sequencing is the process of determining the exact order of the nucleotide bases of an individual's genome in order to catalogue sequence variation and understand its biological implications. Whole-genome sequencing techniques produce masses of data in the form of short sequences known as reads. Assembling these reads into a whole genome constitutes a major algorithmic challenge. Most assembly algorithms utilise de Bruijn graphs constructed from reads for this purpose. A critical step of these algorithms is to detect typical motif structures in the graph caused by sequencing errors and genome repeats, and filter them out; one such complex subgraph class is a so-called <i>superbubble</i>. In this paper, we propose an O(<i>n</i> + <i>m</i>)-time algorithm to detect all superbubbles in a directed acyclic graph with <i>n</i> vertices and <i>m</i> (directed) edges, improving the best-known O(<i>m</i> log <i>m</i>)-time algorithm by Sung et al.
|
Citation |
Theoretical Computer Science, 609(2), p. 374-383
|
ISSN |
1879-2294
0304-3975
|
Link | |
Publisher |
Elsevier BV
|
Rights |
Attribution 4.0 International
|
Title |
Linear-time superbubble identification algorithm for genome assembly
|
Type of document |
Journal Article
|
Entity Type |
Publication
|
Name | Size | format | Description | Link |
---|---|---|---|---|
openpublished/LinearTimeBrankovic2016JournalArticle.pdf | 360.343 KB | application/pdf | Published version | View document |