Simpson, GL;
Turner, S;
Brooks, S;
Greenwood, M;
Yang, H;
Monteith, D;
Patrick, S;
(2009)
Acid Waters macroInvertebrate Status Tool.
ENSIS Ltd. / UCL Environmental Change Research Centre
Text
ensis_report_2009_Simpson_WFD60_InvertTool.pdf - Published Version (4MB)
Abstract
BACKGROUND TO RESEARCH: This project (WFD60) forms part of the UK Strategy for the implementation of the EC Water Framework Directive (WFD: European Union, 2000). Within its broad remit, the WFD requires the development of ecological classification tools for determining ecological status with reference to specific environmental pressures. The WFD requires that these tools assign lakes to one of five categories (High, Good, Moderate, Poor, Bad) to indicate conditions relative to what is considered “high status”. This report focuses on the further development of a tool for determining the extent of the pressure of acidification on lake macroinvertebrate communities.

OBJECTIVES OF RESEARCH: The primary objective is the development of a method and tool with which to assess the pressure of acidification (a major threat to the ecology of acid-sensitive freshwaters, particularly in the UK uplands) on the benthic macroinvertebrate assemblage of lakes. This phase of the WFD60 project focuses on the development of a new, improved version of the original WFD60 classification tool, made possible by a programme of new data collection.

KEY FINDINGS AND RECOMMENDATION: Development of the new WFD60 tool was based closely on the observations from, and the results of, the first phase of the WFD60 project (Monteith and Simpson 2007). Briefly, knowledge of lake acidification status was combined into a simple measure of acidification damage based upon acid neutralising capacity (ANC) and lake water calcium concentration (Ca²⁺), and a simple classification scheme known as the “damage matrix” was produced. This damage matrix represents our current understanding of ecosystem damage resulting from acidification, and is used to provide an assessment of the true acidification status of lakes in the WFD60 data set.
We then employed Random Forests, a statistical data mining methodology, to build an ensemble (collection) of classification trees containing decision rules that attempt to predict the a priori assigned damage matrix classification using only the macroinvertebrate species data. We supplemented the raw macroinvertebrate data with “meta taxa”: aggregations of subsets of the species data that reflect aspects of the community such as total species richness or richness of important acid-tolerant or acid-intolerant indicator groups. We show that the random forest approach can produce a classification tool that correctly predicts the damage matrix class for 99% of the samples in the WFD60 data set, misclassifying a single observation. This represents the apparent performance of the tool and will over-estimate the expected performance when the tool is applied to new samples that took no part in the model building. Cross-validated performance estimates suggest that the correct class is assigned 50% of the time. Whilst this figure appears low, it is twice as good as randomly guessing the class and is comparable to other WFD tools developed for acidification status (LAMM and CPET). However, because the four status classes predicted by the new tool are unevenly sampled, we believe this performance estimate to be an under-estimate, owing to problems performing the bootstrap sampling where some classes contain relatively few samples. We believe the true performance of the tool lies somewhere between the apparent and cross-validated estimates, and as such the new tool performs favourably compared to LAMM and much better than CPET for the subset of samples to which the three tools have been applied. A major addition to the new tool is the calculation of sample Ecological Quality Ratios (EQRs) from the output of the random forest. The four classes predicted by the new tool are each assigned a base score.
The final EQR for an individual sample or site is computed as a weighted average of these base scores, with the weight for each base score given by the probability that the sample belongs to that status class. We show that this EQR performs favourably compared to WFD normative definitions and measures of acidification damage. Furthermore, we show that the EQR reflects the uncertainty in the assigned classification: a site that could just as likely be assigned Good status or Moderate status should have a lower EQR than a site that is clearly in Good status, even if both sites are ultimately assigned to Good status on the basis of majority votes from the random forest. A further development of the new tool has allowed sample-specific estimates of prediction uncertainty, or confidence of class, to be determined. The confidence of class is taken as the proportion of votes for the assigned class out of the total votes for all classes for an individual sample. Because this measure is computed for all four classes, we use Shannon's entropy measure to combine the four confidence of class measures into a single value. A low value (close to 0) of Shannon's entropy indicates high confidence of class and a high value (close to 1) indicates low confidence of class. Several assessments of the new tool, AWIST (the Acid Waters macroInvertebrate Status Tool), have been performed as part of this phase of WFD60. The first assessment looked at the performance of the tool against temporal data from the lake sites of the UK Acid Waters Monitoring Network (UK AWMN). The second assessment compared the results from AWIST with data from palaeoecological studies at two acidified sites and one minimally impacted site.
Despite complications arising from the nature of the individual data (large temporal variability in the macroinvertebrate time series data, and incomplete preservation of the macroinvertebrate community in the palaeoecological data), the results from applying AWIST to the test sites show that the new tool can track recovery in the macroinvertebrate community through time in several UK AWMN sites, and that the tool predicts an acidification status that reflects the acidification history of the two acidified test lakes as demonstrated by diatom and macrofossil remains. In conclusion, the new tool developed as part of the second phase of WFD60 is capable of providing relevant information for WFD purposes, and AWIST is able to capture and reflect known acidification status and to track temporal recovery in macroinvertebrate assemblages where this is occurring in UK AWMN sites. Additional work should be performed to investigate whether and how LAMM, CPET and AWIST predictions can be combined to provide an ensemble classification tool that draws upon the best features of each of the tools. This will require further collaborative work between the tool developers and the relevant agencies to conduct relevant comparisons and testing. This additional work may also allow time to refine the underlying AWIST random forest and improve the cross-validation performance statistics by performing stratified bootstrap resampling within the random forest algorithm.
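
The EQR and confidence-of-class calculations described in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation: the class labels, the base scores, the vote counts and the function names are all hypothetical, since the abstract gives none of them. The sketch also assumes that Shannon's entropy is scaled by the logarithm of the number of classes, which is one way to obtain the 0 to 1 range the abstract mentions.

```python
import math

# Hypothetical class labels and base scores, chosen only for illustration;
# the abstract does not state the values actually used by the tool.
BASE_SCORES = {"High": 0.9, "Good": 0.7, "Moderate": 0.5, "Poor": 0.3}


def vote_proportions(votes):
    """Convert per-class tree votes into proportions that sum to 1."""
    total = sum(votes.values())
    if total <= 0:
        raise ValueError("at least one vote is required")
    return {cls: n / total for cls, n in votes.items()}


def eqr(probs, base_scores=BASE_SCORES):
    """Weighted average of base scores, weighted by class probability."""
    return sum(base_scores[cls] * p for cls, p in probs.items())


def normalised_entropy(probs):
    """Shannon entropy divided by log(number of classes).

    Assumed normalisation: 0 means all votes fall in one class,
    1 means the votes are spread evenly over every class.
    """
    h = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return h / math.log(len(probs))


# Hypothetical vote counts from a 500-tree forest for two sites that
# would both be assigned "Good" by majority vote.
clear_site = vote_proportions({"High": 25, "Good": 450, "Moderate": 25, "Poor": 0})
split_site = vote_proportions({"High": 0, "Good": 255, "Moderate": 245, "Poor": 0})

for name, probs in [("clear", clear_site), ("split", split_site)]:
    print(name, round(eqr(probs), 3), round(normalised_entropy(probs), 3))
```

With these made-up numbers, the site whose votes are split between Good and Moderate receives a lower EQR and a higher entropy than the site that is clearly Good, which is the behaviour the abstract describes.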
Type: | Report |
---|---|
Title: | Acid Waters macroInvertebrate Status Tool |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://www.geog.ucl.ac.uk/research/research-centr... |
Language: | English |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of S&HS > Dept of Geography |
URI: | https://discovery.ucl.ac.uk/id/eprint/10116226 |