Please use this identifier to cite or link to this item: https://hdl.handle.net/1959.11/57867
Full metadata record
DC Field | Value | Language
dc.contributor.author | Ferguson, Scott | en
dc.contributor.author | McLay, Todd | en
dc.contributor.author | Andrew, Rose | en
dc.contributor.author | Bruhl, Jeremy | en
dc.contributor.author | Schwessinger, Benjamin | en
dc.contributor.author | Borevitz, Justin | en
dc.contributor.author | Jones, Ashley | en
dc.date.accessioned | 2024-03-21T22:42:40Z | -
dc.date.available | 2024-03-21T22:42:40Z | -
dc.date.issued | 2022-12-14 | -
dc.identifier.citation | Plant Methods, 18(1), p. 1-11 | en
dc.identifier.issn | 1746-4811 | en
dc.identifier.uri | https://hdl.handle.net/1959.11/57867 | -
dc.description.abstract | <p><b>Background: </b>Long-read sequencing platforms offered by Oxford Nanopore Technologies (ONT) allow native DNA containing epigenetic modifications to be directly sequenced, but can be limited by lower per-base accuracies. A key step post-sequencing is basecalling, the process of converting raw electrical signals produced by the sequencing device into nucleotide sequences. This is challenging as current basecallers are primarily based on mixtures of model species for training. Here we utilise both ONT PromethION and higher accuracy PacBio Sequel II HiFi sequencing on two plants, <i>Phebalium stellatum</i> and <i>Xanthorrhoea johnsonii</i>, to train species-specific basecaller models with the aim of improving per-base accuracy. We investigate sequencing accuracies achieved by ONT basecallers and assess accuracy gains by training single-species and species-specific basecaller models. We also evaluate accuracy gains from ONT’s improved flowcells (R10.4, FLO-PRO112) and sequencing kits (SQK-LSK112). For the truth dataset for both model training and accuracy assessment, we developed highly accurate, contiguous diploid reference genomes with PacBio Sequel II HiFi reads.</p> <p><b>Results:</b> Basecalling with ONT Guppy 5 and 6 super-accurate gave almost identical results, attaining read accuracies of 91.96% and 94.15%. Guppy’s plant-specific model gave highly mixed results, attaining read accuracies of 91.47% and 96.18%. Species-specific basecalling models improved read accuracy, attaining 93.24% and 95.16% read accuracies. R10.4 sequencing kits also improve sequencing accuracy, attaining read accuracies of 95.46% (super-accurate) and 96.87% (species-specific).</p> <p><b>Conclusions:</b> The use of a single mixed-species basecaller model, such as ONT Guppy super-accurate, may be reducing the accuracy of nanopore sequencing, due to conflicting genome biology within the training dataset and study species. Training of single-species and genome-specific basecaller models improves read accuracy. Studies that aim to do large-scale long-read genotyping would primarily benefit from training their own basecalling models. Such studies could use sequencing accuracy gains and improving bioinformatics tools to improve study outcomes.</p> | en
dc.language | en | en
dc.publisher | BioMed Central Ltd | en
dc.relation.ispartof | Plant Methods | en
dc.rights | Attribution 4.0 International | *
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | *
dc.title | Species-specific basecallers improve actual accuracy of nanopore sequencing in plants | en
dc.type | Journal Article | en
dc.identifier.doi | 10.1186/s13007-022-00971-2 | en
dcterms.accessRights | UNE Green | en
dc.subject.keywords | Rutaceae basecaller model | en
dc.subject.keywords | Long-read sequencing | en
dc.subject.keywords | Basecaller training | en
dc.subject.keywords | Sequencing accuracy | en
dc.subject.keywords | Asphodelaceae basecaller model | en
dc.subject.keywords | Biochemical Research Methods | en
dc.subject.keywords | Plant Sciences | en
dc.subject.keywords | Biochemistry & Molecular Biology | en
dc.subject.keywords | Oxford Nanopore Technologies | en
dc.subject.keywords | PacBio | en
local.contributor.firstname | Scott | en
local.contributor.firstname | Todd | en
local.contributor.firstname | Rose | en
local.contributor.firstname | Jeremy | en
local.contributor.firstname | Benjamin | en
local.contributor.firstname | Justin | en
local.contributor.firstname | Ashley | en
local.profile.school | School of Environmental and Rural Science | en
local.profile.school | School of Environmental and Rural Science | en
local.profile.email | randre20@une.edu.au | en
local.profile.email | jbruhl@une.edu.au | en
local.output.category | C1 | en
local.record.place | au | en
local.record.institution | University of New England | en
local.publisher.place | United Kingdom | en
local.identifier.runningnumber | 137 | en
local.format.startpage | 1 | en
local.format.endpage | 11 | en
local.peerreviewed | Yes | en
local.identifier.volume | 18 | en
local.identifier.issue | 1 | en
local.access.fulltext | Yes | en
local.contributor.lastname | Ferguson | en
local.contributor.lastname | McLay | en
local.contributor.lastname | Andrew | en
local.contributor.lastname | Bruhl | en
local.contributor.lastname | Schwessinger | en
local.contributor.lastname | Borevitz | en
local.contributor.lastname | Jones | en
dc.identifier.staff | une-id:randre20 | en
dc.identifier.staff | une-id:jbruhl | en
local.profile.orcid | 0000-0003-0099-8336 | en
local.profile.orcid | 0000-0001-9112-4436 | en
local.profile.role | author | en
local.profile.role | author | en
local.profile.role | author | en
local.profile.role | author | en
local.profile.role | author | en
local.profile.role | author | en
local.profile.role | author | en
local.identifier.unepublicationid | une:1959.11/57867 | en
dc.identifier.academiclevel | Academic | en
dc.identifier.academiclevel | Academic | en
dc.identifier.academiclevel | Academic | en
dc.identifier.academiclevel | Academic | en
dc.identifier.academiclevel | Academic | en
dc.identifier.academiclevel | Academic | en
dc.identifier.academiclevel | Academic | en
local.title.maintitle | Species-specific basecallers improve actual accuracy of nanopore sequencing in plants | en
local.output.categorydescription | C1 Refereed Article in a Scholarly Journal | en
local.search.author | Ferguson, Scott | en
local.search.author | McLay, Todd | en
local.search.author | Andrew, Rose | en
local.search.author | Bruhl, Jeremy | en
local.search.author | Schwessinger, Benjamin | en
local.search.author | Borevitz, Justin | en
local.search.author | Jones, Ashley | en
local.open.fileurl | https://rune.une.edu.au/web/retrieve/97993002-2fd4-45ec-9f16-fecbc48ddc0b | en
local.uneassociation | Yes | en
local.atsiresearch | No | en
local.sensitive.cultural | No | en
local.year.published | 2022 | en
local.fileurl.open | https://rune.une.edu.au/web/retrieve/97993002-2fd4-45ec-9f16-fecbc48ddc0b | en
local.fileurl.openpublished | https://rune.une.edu.au/web/retrieve/97993002-2fd4-45ec-9f16-fecbc48ddc0b | en
local.subject.for2020 | 310509 Genomics | en
local.subject.for2020 | 310510 | en
local.subject.seo2020 | TBD | en
local.codeupdate.date | 2024-10-02T10:54:06.648 | en
local.codeupdate.eperson | randre20@une.edu.au | en
local.codeupdate.finalised | true | en
local.original.for2020 | 3104 Evolutionary biology | en
local.original.seo2020 | TBD | en
local.profile.affiliationtype | External Affiliation | en
local.profile.affiliationtype | External Affiliation | en
local.profile.affiliationtype | UNE Affiliation | en
local.profile.affiliationtype | UNE Affiliation | en
local.profile.affiliationtype | External Affiliation | en
local.profile.affiliationtype | External Affiliation | en
local.profile.affiliationtype | External Affiliation | en
Appears in Collections: Journal Article
School of Environmental and Rural Science
Files in This Item:
2 files
File | Description | Size | Format
openpublished/SpeciesSpecifcAndrewBruhl2022JournalArticle.pdf | Published Version | 1.98 MB | Adobe PDF
This item is licensed under a Creative Commons License.