Selecting thresholds of occurrence in the prediction of species distributions

Title
Selecting thresholds of occurrence in the prediction of species distributions
Publication Date
2005
Author(s)
Liu, Canran
Berry, PM
Dawson, TP
Pearson, RG
Type of document
Journal Article
Language
en
Entity Type
Publication
Publisher
Wiley-Blackwell Publishing, Inc
Place of publication
United States of America
DOI
10.1111/j.0906-7590.2005.03957.x
UNE publication id
une:8729
Abstract
Transforming the results of species distribution modelling from probabilities of or suitabilities for species occurrence to presences/absences needs a specific threshold. Even though there are many approaches to determining thresholds, there is no comparative study. In this paper, twelve approaches were compared using two species in Europe and artificial neural networks, and the modelling results were assessed using four indices: sensitivity, specificity, overall prediction success and Cohen's kappa statistic. The results show that prevalence approach, average predicted probability/suitability approach, and three sensitivity-specificity-combined approaches, including sensitivity-specificity sum maximization approach, sensitivity-specificity equality approach and the approach based on the shortest distance to the top-left corner (0,1) in ROC plot, are the good ones. The commonly used kappa maximization approach is not as good as the afore-mentioned ones, and the fixed threshold approach is the worst one. We also recommend using datasets with prevalence of 50% to build models if possible since most optimization criteria might be satisfied or nearly satisfied at the same time, and therefore it's easier to find optimal thresholds in this situation.
Predicting species distributions is becoming increasingly important since it is relevant to resource assessment, environmental conservation and biodiversity management (Fielding and Bell 1997, Manel et al. 1999, Austin 2002, D'heygere et al. 2003). Many modeling techniques have been used for this purpose, e.g. generalized linear models (GLM), generalized additive models (GAM), classification and regression trees (CARTs), principal components analysis (PCA), artificial neural networks (ANNs) (Guisan and Zimmermann 2000, Moisen and Frescino 2002, Guisan et al. 2002, Berg et al. 2004). And most of the techniques give the results as the probability of species presence, e.g. GLM, GAM and some algorithms of ANNs, or environmental suitability for the target species, e.g. PCA (Robertson et al. 2003) and some algorithms of ANNs. However, in conservation and environmental management practice, the information presented as species presence/absence may be more practical than presented as probability or suitability. Therefore, a threshold is needed to transform the probability or suitability data to presence/absence data. A threshold is also needed when assessing model performance using the indices derived from the confusion matrix (Manel et al. 2001), which also facilitates the interpretation of modelling results. Before reviewing threshold determination approaches, we will review these model assessment indices first because some of these indices are also the only or primary component of some threshold determination approaches.
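The four assessment indices and several of the threshold approaches named in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation: the function names, the toy data, the candidate grid (0.00 to 1.00 in steps of 0.01) and the convention that a site is predicted present when its score is at or above the threshold are all assumptions made for illustration.

```python
import math

def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, fn, tn); presence is predicted when score >= threshold (assumed convention)."""
    tp = fp = fn = tn = 0
    for observed, score in zip(y_true, scores):
        predicted = score >= threshold
        if predicted and observed:
            tp += 1
        elif predicted:
            fp += 1
        elif observed:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def indices(tp, fp, fn, tn):
    """Sensitivity, specificity, overall prediction success and Cohen's kappa."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    success = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of the confusion matrix.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (success - expected) / (1 - expected) if expected < 1 else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "success": success, "kappa": kappa}

def select_thresholds(y_true, scores, candidates=None):
    """Apply several threshold approaches over a grid of candidate thresholds."""
    if candidates is None:
        candidates = [i / 100 for i in range(101)]  # assumed grid, not from the paper
    stats = [(t, indices(*confusion_counts(y_true, scores, t))) for t in candidates]

    def argbest(cost):
        # Lowest cost wins; ties resolve to the first candidate in the list.
        return min(stats, key=lambda item: cost(item[1]))[0]

    return {
        # Observed prevalence of presences in the data.
        "prevalence": sum(y_true) / len(y_true),
        # Average predicted probability/suitability.
        "mean_score": sum(scores) / len(scores),
        "kappa_max": argbest(lambda m: -m["kappa"]),
        "sens_spec_sum_max": argbest(lambda m: -(m["sensitivity"] + m["specificity"])),
        "sens_spec_equal": argbest(lambda m: abs(m["sensitivity"] - m["specificity"])),
        # Distance to the top-left corner (0,1) of the ROC plot,
        # whose axes are (1 - specificity, sensitivity).
        "roc_top_left": argbest(lambda m: math.hypot(1 - m["specificity"],
                                                     1 - m["sensitivity"])),
    }

# Toy data (hypothetical): 1 = observed presence, 0 = observed absence.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
chosen = select_thresholds(y_true, scores)
for approach, threshold in chosen.items():
    print(approach, round(threshold, 2))
```

On real data, several candidates often tie under a given criterion, so the tie-breaking rule (here, the first candidate in the grid) can shift the chosen threshold and should be stated alongside any result.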
Citation
Ecography, 28(3), p. 385-393
ISSN
1600-0587
0906-7590
Start page
385
End page
393