Title |
Linear-time superbubble identification algorithm for genome assembly |
|
|
Publication Date |
|
Author(s) |
|
Type of document |
|
Language |
|
Entity Type |
|
Publisher |
|
Place of publication |
|
DOI |
10.1016/j.tcs.2015.10.021 |
|
|
UNE publication id |
|
Abstract |
DNA sequencing is the process of determining the exact order of the nucleotide bases of an individual's genome in order to catalogue sequence variation and understand its biological implications. Whole-genome sequencing techniques produce masses of data in the form of short sequences known as reads. Assembling these reads into a whole genome constitutes a major algorithmic challenge. Most assembly algorithms utilise de Bruijn graphs constructed from reads for this purpose. A critical step of these algorithms is to detect typical motif structures in the graph caused by sequencing errors and genome repeats, and filter them out; one such complex subgraph class is a so-called superbubble. In this paper, we propose an O(n + m)-time algorithm to detect all superbubbles in a directed acyclic graph with n vertices and m (directed) edges, improving the best-known O(m log m)-time algorithm by Sung et al. |
|
|
Link |
|
Citation |
Theoretical Computer Science, 609(2), p. 374-383 |
|
|
ISSN |
|
Start page |
|
End page |
|
Rights |
Attribution 4.0 International |
|
|