In our previous study (Betancourt et al. 2021), we published a benchmark dataset of ozone metrics extracted from the TOAR database, along with a machine learning task: predicting the ozone metrics from geospatial features. As a baseline, we predicted, for example, average ozone with several methods, among them a random forest and a shallow neural network.
Many studies, including our earlier one, stop once the predictions are made. We asked more questions. How exactly do the machine learning algorithms predict average ozone? We analyzed how our machine learning models work in order to understand their predictions, and found out even more than we expected.
By focusing on inaccurate predictions and explaining why these predictions fail, we (i) identified underrepresented samples, (ii) flagged unexpected inaccurate predictions, and (iii) pointed out training samples irrelevant for predictions. We suggested locations for building new measurement stations based on the underrepresented samples. We also showed which training samples do not substantially contribute to the model performance. We can even drop these samples without performance loss! Our study demonstrates the application of explainable machine learning beyond simply explaining the trained model.
Stadtler et al., Explainable Machine Learning Reveals Capabilities, Redundancy, and Limitations of a Geospatial Air Quality Benchmark Dataset, Machine Learning and Knowledge Extraction. 2022; 4(1):150-171, https://doi.org/10.3390/make4010008
Why is the field of machine learning advancing so fast? There are many reasons why machine learning research is flourishing. One of them is benchmark datasets. Loosely speaking, a benchmark dataset combines a task with preprocessed data. The task is usually performed with a machine learning algorithm, and having a shared task and dataset accelerates development and performance testing. Our benchmark dataset, AQ-Bench (Betancourt et al. 2021), pairs geospatial data with ozone metrics. We tackle the prediction of the ozone metrics from the geospatial data with different machine learning models. The figure below shows the concept of our study.
What is the goal of this study? In the end, we want to predict ozone. Predicting ozone metrics, for example those related to health, supports the mitigation of adverse effects. Nevertheless, ozone prediction is difficult because of ozone's atmospheric chemistry and its interactions with weather patterns. Computationally expensive and sophisticated models exist, but we want to use machine learning. Therefore, our goal is to compose a benchmark dataset for developing machine learning methods for ozone prediction.
What data is in AQ-Bench? AQ-Bench consists of globally available geospatial data and ozone metrics based upon measurements, which are scarce and unevenly distributed worldwide. We took the ozone metrics from the TOAR database. The TOAR community put an enormous effort into collecting the data from different countries and providing it to us.
What about machine learning? Using AQ-Bench, we trained different machine learning models: a linear regression, a shallow neural network, and a random forest. The study shows their performance in predicting different ozone metrics. We hope other researchers can easily reproduce our work and join ozone research using machine learning.
Clara Betancourt, Timo Stomberg, Ribana Roscher, Martin G. Schultz, and Scarlet Stadtler, AQ-Bench: a benchmark dataset for machine learning on global air quality metrics, Earth Syst. Sci. Data, 13, 3013–3033, 2021, https://doi.org/10.5194/essd-13-3013-2021
The Tropospheric Ozone Assessment Report on vegetation (Mills et al., 2018) provided the first assessment of worldwide concentration-based ozone metrics relevant for vegetation. The authors caution, though, that a detailed description of plant damage and the related crop loss requires flux-based metrics. Unlike the already existing concentration-based metrics, the implementation of flux-based ozone metrics requires meteorological input data and embedded modelling.
Flux-based ozone metrics are necessary for an accurate quantification of ozone damage to vegetation, because plant pores open and close depending on environmental conditions and growing season, and thus take in more or less ozone (Figure 4, Figure 5).
In contrast to the already available concentration-based ozone metrics, flux-based metrics require meteorological and soil data as input, as well as a consistent parametrization of vegetation growing seasons and the inclusion of a stomatal flux model. Embedding this model with the TOAR database will make a global assessment of stomatal ozone fluxes possible for the first time. This requires new query patterns, which need to merge several variables onto a consistent time axis, as well as more elaborate calculations, which are presently coded in FORTRAN.
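The merge of several variables onto a consistent time axis can be illustrated with a minimal sketch. It is written in Python for illustration only, and the hourly resolution, the variable names `o3_ppb` and `temp_c`, the helper names, and the example values are all assumptions, not part of the TOAR database or its query interface.

```python
from datetime import datetime, timedelta


def merge_on_hourly_axis(series_by_name, start, end):
    """Align several {timestamp: value} series on one hourly axis.

    Timestamps without a value for a variable get None, so every row
    has the same columns.
    """
    names = sorted(series_by_name)
    rows = []
    t = start
    while t <= end:
        row = {"time": t}
        for name in names:
            row[name] = series_by_name[name].get(t)
        rows.append(row)
        t += timedelta(hours=1)
    return rows


def complete_rows(rows):
    """Keep only rows in which every variable is present."""
    return [r for r in rows if all(v is not None for v in r.values())]


# Hypothetical example values (not real measurements).
t0 = datetime(2020, 6, 1, 12)
ozone = {t0: 42.0, t0 + timedelta(hours=1): 45.0, t0 + timedelta(hours=2): 47.0}
temperature = {t0: 21.5, t0 + timedelta(hours=2): 23.0}

merged = merge_on_hourly_axis(
    {"o3_ppb": ozone, "temp_c": temperature}, t0, t0 + timedelta(hours=2)
)
usable = complete_rows(merged)
```

A flux calculation needs all inputs at the same time step, which is why the sketch separates the alignment step from the filtering of incomplete rows.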
For the stomatal flux implementation in the TOAR database service infrastructure, we collaborate with plant ecophysiologists (e.g. Lisa Emberson from SEI York, UK).
Reference: Mills, G. et al. Tropospheric Ozone Assessment Report: Present-day tropospheric ozone distribution and trends relevant to vegetation. Elem Sci Anth 6, 47 (2018).
For the interpolation of air quality data, the relationship between air quality and geographical and meteorological data is crucial. Here, unsupervised methods are applied to find patterns in metadata derived at air quality measurement stations, and to explain how the metadata is connected to the local air quality.
K-means clustering of metadata at the station and in its close surroundings (population density, stable night lights, elevation and NOx column) was used to group the stations into clusters of similar metadata. Each cluster contains stations with distinct metadata characteristics. Cluster 1 comprises city centers with a very high population density, Cluster 2 mostly urban areas, Cluster 3 rural areas, and Clusters 4 and 5 remote stations at low and high elevation, respectively. Figure 1 shows the cluster assignments of European air quality stations, together with the heuristic station classification by TOAR-1 (Schultz et al., 2017) for comparison. The clusters also have characteristic air quality metrics.
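The clustering step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the feature values are invented, the features are standardized before clustering (an assumption, made because the four variables have very different scales), and the deterministic farthest-first initialization is chosen here only to keep the example reproducible.

```python
def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        (sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0  # avoid division by 0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Plain k-means with farthest-first initialization (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical station metadata:
# [population density, night lights, elevation (m), NOx column]
stations = [
    [9000.0, 60.0, 50.0, 8.0],
    [8500.0, 58.0, 80.0, 7.5],
    [9500.0, 62.0, 40.0, 8.5],
    [5.0, 1.0, 1500.0, 0.5],
    [8.0, 2.0, 1700.0, 0.4],
    [3.0, 0.5, 1600.0, 0.6],
]
labels, centers = kmeans(standardize(stations), k=2)
```

With these invented values, the first three stations (dense, bright, low-lying) end up in one cluster and the last three (sparse, dark, elevated) in the other.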
Figure 2 shows the mean number of exceedances of 50 ppb per year for the different clusters. City centers typically have fewer exceedance days, because ozone is formed downwind of sources and destroyed by NOx. Rural areas thus have a higher ozone burden (Cluster 3), and remote areas a lower burden, as they are far away from pollutant sources (Clusters 4 and 5).
Finding links between air quality data and metadata is helpful for spatial and temporal interpolation: metadata is easy to access and available in global gridded form, while air quality monitoring is costly and thus only done at point locations. In the future, we will extend the clustering routines to process maps of meteorological data in order to find weather regimes that are relevant for air quality. This requires linking the clustering algorithm to a neural network for multi-channel image recognition.
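The per-cluster exceedance statistic can be computed along the following lines. The sketch assumes one ozone value per station and day (the text does not state which daily statistic is used), counts a day as an exceedance when the value is strictly above 50 ppb, and uses invented station names, cluster assignments, and values.

```python
from collections import defaultdict

THRESHOLD_PPB = 50.0  # threshold named in the text


def exceedances_per_year(daily_values):
    """Mean number of exceedance days per year for one station.

    daily_values maps (year, day_of_year) to a daily ozone value in ppb.
    Years without any exceedance still count in the average.
    """
    days_by_year = defaultdict(int)
    for (year, _day), value in daily_values.items():
        days_by_year[year] += value > THRESHOLD_PPB
    return sum(days_by_year.values()) / len(days_by_year)


def mean_exceedances_by_cluster(station_values, station_cluster):
    """Average the per-station statistic over all stations in each cluster."""
    per_cluster = defaultdict(list)
    for station, daily in station_values.items():
        per_cluster[station_cluster[station]].append(exceedances_per_year(daily))
    return {c: sum(v) / len(v) for c, v in per_cluster.items()}


# Hypothetical daily values for three stations (not real measurements).
station_values = {
    "A": {(2019, 1): 40.0, (2019, 2): 55.0, (2020, 1): 60.0, (2020, 2): 62.0},
    "B": {(2019, 1): 30.0, (2019, 2): 35.0},
    "C": {(2019, 1): 51.0, (2019, 2): 70.0},
}
station_cluster = {"A": 3, "B": 1, "C": 3}

result = mean_exceedances_by_cluster(station_values, station_cluster)
```

Averaging per station first and then per cluster keeps stations with long records from dominating their cluster; whether the study weights stations this way is not stated in the text.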
Reference: Schultz, M. G. et al. Tropospheric Ozone Assessment Report: Database and Metrics Data of Global Surface Ozone Observations. Elem Sci Anth 5, 58 (2017).