Discovering Communicable Models from Earth Science Data

Schwabacher, Mark; Langley, Pat; Potter, Christopher; Klooster, Steven; Torregrosa, Alicia

doi:10.1007/978-3-540-73920-3_7

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4660)

Abstract

This chapter describes how we used regression rules to improve upon results previously published in the Earth science literature. In such a scientific application of machine learning, it is crucially important for the learned models to be understandable and communicable. We recount how we selected a learning algorithm to maximize communicability, and then describe two visualization techniques that we developed to aid in understanding the model by exploiting the spatial nature of the data. We also report how evaluating the learned models across time let us discover an error in the data.

Author information

Authors and Affiliations

Intelligent Systems Division, NASA Ames Research Center, Moffett Field, California, USA
Mark Schwabacher
Institute for the Study of Learning and Expertise, Palo Alto, California, USA
Pat Langley
Earth Science Division, NASA Ames Research Center, Moffett Field, California, USA
Christopher Potter, Steven Klooster & Alicia Torregrosa
Earth System Science and Policy, California State University Monterey Bay, Seaside, California, USA
Steven Klooster & Alicia Torregrosa

Editor information

Sašo Džeroski & Ljupčo Todorovski

About this chapter

Cite this chapter

Schwabacher, M., Langley, P., Potter, C., Klooster, S., Torregrosa, A. (2007). Discovering Communicable Models from Earth Science Data. In: Džeroski, S., Todorovski, L. (eds) Computational Discovery of Scientific Knowledge. Lecture Notes in Computer Science, vol 4660. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73920-3_7

DOI: https://doi.org/10.1007/978-3-540-73920-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73919-7
Online ISBN: 978-3-540-73920-3
eBook Packages: Computer Science, Computer Science (R0)
