Abstract
Data mining and visualization in very large spatiotemporal databases require three kinds of parallelism: in the file system, in data processing, and in the visualization or rendering farm. The transparent data cube combines, on the same hardware, a database cluster for active storage of spatiotemporal data with an MPI compute cluster for data processing and rendering on a tiled-display video wall. The result is a scalable and inexpensive architecture for interactive analysis and high-resolution mapping of environmental and remote sensing data, which we use for comparative studies of climate and vegetation change.
Acknowledgements
This research was supported by the Russian Foundation for Basic Research Grant “Parallel scalable Grid-center for data mining,” Russian-Belorussian “SKIF-Grid” Project, CRDF Grant “Space Physics Interactive Data Resource,” and the Microsoft Research Grants “Environmental Scenario Search Engine.”
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Zhizhin, M., Medvedev, D., Mishin, D., Poyda, A., Novikov, A. (2011). Transparent Data Cube for Spatiotemporal Data Mining and Visualization. In: Fiore, S., Aloisio, G. (eds) Grid and Cloud Database Management. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20045-8_15
DOI: https://doi.org/10.1007/978-3-642-20045-8_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20044-1
Online ISBN: 978-3-642-20045-8
eBook Packages: Computer Science (R0)