| Title |
|
Linear-time superbubble identification algorithm for genome assembly |
|
|
| Publication Date |
|
| Author(s) |
|
| Type of document |
|
| Language |
|
| Entity Type |
|
| Publisher |
|
| Place of publication |
|
| DOI |
|
10.1016/j.tcs.2015.10.021 |
|
|
| UNE publication id |
|
| Abstract |
|
DNA sequencing is the process of determining the exact order of the nucleotide bases of an individual's genome in order to catalogue sequence variation and understand its biological implications. Whole-genome sequencing techniques produce masses of data in the form of short sequences known as reads. Assembling these reads into a whole genome constitutes a major algorithmic challenge. Most assembly algorithms utilise de Bruijn graphs constructed from reads for this purpose. A critical step of these algorithms is to detect typical motif structures in the graph caused by sequencing errors and genome repeats, and filter them out; one such complex subgraph class is a so-called superbubble. In this paper, we propose an O(n + m)-time algorithm to detect all superbubbles in a directed acyclic graph with n vertices and m (directed) edges, improving the best-known O(m log m)-time algorithm by Sung et al. |
|
|
| Link |
|
| Citation |
|
Theoretical Computer Science, 609(2), p. 374-383 |
|
|
| ISSN |
|
| Start page |
|
| End page |
|
| Rights |
|
Attribution 4.0 International |
|
|