Comparing four methods for decision-tree induction: A case study on the invasive Iberian gudgeon (Gobio lozanoi; Doadrio and Madeira, 2004) / Muñoz-Mas, R.; Fukuda, S.; Vezza, P.; Martinez-Capel, F. - In: ECOLOGICAL INFORMATICS. - ISSN 1574-9541. - ELECTRONIC. - 34:(2016), pp. 22-34. [10.1016/j.ecoinf.2016.04.011]

Comparing four methods for decision-tree induction: A case study on the invasive Iberian gudgeon (Gobio lozanoi; Doadrio and Madeira, 2004)

Vezza, P.;
2016

Abstract

The invasion of freshwater ecosystems is a particularly alarming phenomenon in the Iberian Peninsula. Habitat suitability modelling is an effective approach to extract knowledge about species ecology and to guide appropriate management actions. Decision-trees are an interpretable modelling technique widely used in ecology, able to handle strongly nonlinear relationships with high-order interactions and diverse variable types. Decision-trees recursively split the input space into two parts, maximising child node homogeneity. This recursive partitioning is typically performed with axis-parallel splits in a top-down fashion. However, two recently developed R packages seem to outperform the standard axis-parallel top-down algorithms, CART and C5.0: oblique.tree, which allows the development of decision-trees based on oblique splits, and evtree, which searches for globally optimal trees with evolutionary algorithms. To evaluate their possible use in ecology, the two new partitioning algorithms were compared with the two well-known, standard axis-parallel algorithms. The entire process was performed in R by simultaneously tuning the decision-tree parameters and the variable subset with a genetic algorithm and modelling the presence–absence of the Iberian gudgeon (Gobio lozanoi; Doadrio and Madeira, 2004), an invasive fish species that has spread across the Iberian Peninsula. The accuracy and complexity of the trees, the modelled patterns of mesohabitat selection and the variable importance were compared. Neither of the new R packages, oblique.tree and evtree, outperformed the C5.0 algorithm. They rendered almost the same decision-trees as the CART algorithm, and these trees were completely interpretable (they performed from four to eight partitions), in contrast with C5.0, which resulted in a more complex structure with 17 partitions. Oblique.tree proved to be affected by prevalence and does not include the possibility of weighting the observations, which may discourage its use. Although evtree did not suggest a major improvement compared with the remaining packages, it allowed the development of regression trees, which may be informative for additional modelling tasks such as abundance estimation. According to the resulting decision-trees, the optimal habitats for the Iberian gudgeon were large pools in lowland river segments with depositional areas and aquatic vegetation present, which typically appeared in the form of scattered macrophyte clumps. Furthermore, the Iberian gudgeon seems to avoid habitats characterised by scouring phenomena and limited availability of vegetated cover. Accordingly, we can assume that river regulation and artificial impoundment have favoured the spread of the Iberian gudgeon across the entire peninsula.
Files in this record:
Munoz-Mas_et_al_2016.pdf (open access)
Description: Manuscript
Type: 2. Post-print / Author's Accepted Manuscript
Licence: Non-public - Private/restricted access
Size: 1.83 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2685090