DOI: 10.1145/3632410.3632457
research-article
Open access

Robust Shape-regularized Non-negative Matrix Factorization for Real-time Source Apportionment

Published: 04 January 2024

Abstract

Discovering the anthropogenic sources of pollution is a key step in air quality monitoring and management. Real-time source apportionment (RTSA) is the globally accepted standard for this task and relies critically on non-negative matrix factorization (NMF), which is made challenging by the non-convexity of the problem and by noisy data. In this work, we develop a technique, ROSE-NMF, that offers improvements over state-of-the-art RTSA and NMF solvers. ROSE-NMF offers provable convergence guarantees, allows diurnal patterns to be specified to guide the solver to an optimal solution, and is robust to row and column outliers. In multiple experiments, ROSE-NMF offered markedly improved performance over standard NMF solvers as well as commercial RTSA solvers. Code for ROSE-NMF is available at https://github.com/purushottamkar/rose-nmf.
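For readers unfamiliar with the factorization the abstract refers to, the following is a minimal sketch of a generic baseline NMF solver using the classical multiplicative-update rule for the squared Frobenius loss. It is not ROSE-NMF and includes neither shape regularization nor outlier robustness. The function name `nmf_multiplicative`, the synthetic data, and the reading of rows as time steps and columns as measured species are assumptions made purely for illustration.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Baseline NMF sketch: find W >= 0 and H >= 0 such that X ~= W @ H.

    Uses multiplicative updates for the squared Frobenius loss.
    This is a generic illustration, not the ROSE-NMF algorithm.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, d)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio,
        # so W and H stay non-negative throughout.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(X - W @ H))
    return W, H, errors

# Hypothetical synthetic data: 48 time steps, 10 species, 3 sources.
data_rng = np.random.default_rng(1)
W_true = data_rng.random((48, 3))   # assumed source contributions over time
H_true = data_rng.random((3, 10))   # assumed source chemical profiles
X = W_true @ H_true

W, H, errors = nmf_multiplicative(X, rank=3)
print("initial error:", errors[0], "final error:", errors[-1])
```

Because the problem is non-convex, different seeds can lead to different factorizations of the same data, which is part of the difficulty the abstract points to.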

Published In

CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD)
January 2024
627 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. air-quality monitoring
  2. non-negative matrix factorization
  3. robust learning
  4. shape regularization
  5. source apportionment

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CODS-COMAD 2024
