Distributed Generation of NASA Earth Science Data Products

Journal of Grid Computing

Abstract

The objective of this work is to develop Grid-based approaches through which NASA data centers can become active participants in serving data users by transforming archived data into the specific form each user needs. This approach involves generating custom data products from data stored in multiple NASA data centers. We describe a prototype developed to explore how Grid technology can facilitate this multi-center product generation. Our initial example of a custom data product is phenomena-based subsetting: producing a subset of a large collection of data based on the subset's association with some phenomenon, such as a mesoscale convective system (severe storm) or a hurricane. We demonstrate that this subsetting can be performed on data located at a single data center or at multiple data centers. We also describe a system that performed customized data product generation using a combination of commodity processors deployed at a NASA data center, Grid technology to access these processors, and data mining software that intelligently selects where to perform processing based on data location and the availability of compute resources. This demonstration also suggests that a catalog of phenomena-related data could be created across multiple data centers, with the catalog containing references to the original data in their different locations. Such a catalog is important for giving other users efficient access to the data belonging to an identified phenomenon.
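The two ideas in the abstract that lend themselves to a small illustration are the phenomenon catalog, which holds references to data rather than copies, and the choice of processing site based on data location and compute availability. The sketch below is only a hypothetical illustration of those ideas, not the prototype described in the article: the class names, the `choose_site` function, the center names, the paths, and the selection rule (prefer the center that holds the most relevant granules among those with free processors) are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class GranuleRef:
    """Reference to an original data granule; the data stays at its center."""
    center: str  # hypothetical data center name
    path: str    # hypothetical location of the granule at that center


@dataclass
class PhenomenonEntry:
    """Catalog entry tying one phenomenon to the granules associated with it."""
    phenomenon: str
    granules: list[GranuleRef] = field(default_factory=list)


def choose_site(entry: PhenomenonEntry, free_processors: dict[str, int]) -> str | None:
    """Pick a processing site for the granules in a catalog entry.

    Assumed rule: among centers that hold at least one granule and have
    free processors, prefer the one holding the most granules; break ties
    by the number of free processors. Returns None if no such center exists.
    """
    counts: dict[str, int] = {}
    for granule in entry.granules:
        counts[granule.center] = counts.get(granule.center, 0) + 1
    candidates = [c for c in counts if free_processors.get(c, 0) > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda c: (counts[c], free_processors[c]))


# Hypothetical catalog entry: one storm whose data spans two centers.
storm = PhenomenonEntry(
    phenomenon="mesoscale convective system",
    granules=[
        GranuleRef("center_a", "/archive/pass_001.dat"),
        GranuleRef("center_a", "/archive/pass_002.dat"),
        GranuleRef("center_b", "/archive/pass_003.dat"),
    ],
)

site_when_a_is_busy = choose_site(storm, {"center_a": 0, "center_b": 4})
site_when_both_free = choose_site(storm, {"center_a": 2, "center_b": 4})
```

In this sketch the catalog entry stores only center names and paths, so a later user could locate the data for a phenomenon without rescanning the full archive. A real scheduler would likely weigh transfer cost and queue times as well; the rule here is deliberately simple.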

Cite this article

Barkstrom, B.R., Hinke, T.H., Gavali, S. et al. Distributed Generation of NASA Earth Science Data Products. Journal of Grid Computing 1, 101–116 (2003). https://doi.org/10.1023/B:GRID.0000024069.33399.ee

  • DOI: https://doi.org/10.1023/B:GRID.0000024069.33399.ee