AQX: Explaining Air Quality Forecast for Verifying Domain Knowledge using Feature Importance Visualization

Air pollution forecasting has become critical because of the direct impact of air pollution on human health and its increased production caused by rapid industrialization. Machine learning (ML) solutions are being extensively explored in this domain because they can potentially produce highly accurate results given access to historical data. However, experts in the environmental area are skeptical about adopting ML solutions in real-world applications and policy making due to their black-box nature. In contrast, despite sometimes having low accuracy, existing traditional simulation models (e.g., CMAQ) are widely used and follow well-defined, transparent equations. Therefore, presenting the knowledge learned by the ML model can make it transparent as well as comprehensible. In addition, validating the ML model’s learning against existing domain knowledge might aid in addressing experts' skepticism, building appropriate trust, and better utilizing ML models. In collaboration with three experts with an average of five years of research experience in the air pollution domain, we identified the contribution of features (e.g., meteorological features such as wind) towards the final forecast as the major information to be verified against domain knowledge. In addition, the accuracy of ML models compared with traditional simulation models, together with raw wind trajectories, is essential for domain experts to validate the feature contributions. Based on the identified information, we designed and developed AQX, a visual analytics system that helps experts validate and verify the ML model’s learning with their domain knowledge. The system includes multiple coordinated views that present the contributions of input features at different levels of aggregation in both the temporal and spatial dimensions. It also provides a performance comparison of ML and traditional models in terms of accuracy and spatial maps, along with an animation of raw wind trajectories for the input period.
We further present two case studies and interviews with two domain experts to demonstrate the effectiveness and usefulness of AQX.


