Abstract
In-situ analytics are increasingly adopted by leadership-class scientific applications to gain fast insight into the massive output data of simulations. In current practice, systems buffer the output data in DRAM for analytics processing, which limits the buffered data to the DRAM capacity left unused by the simulation. The rapid growth in data size calls for alternative approaches to accommodating data-rich analytics, such as using solid-state disks to increase effective memory capacity. To this end, this paper explores software solutions for exploiting the deep memory hierarchies expected on future high-end machines. Leveraging the fact that many analytics are sensitive to data features (regions of interest) hidden in the data being processed, the approach incorporates knowledge of these features into in-situ data management. It uses adaptive index creation and refinement to reduce the overhead of index management. In addition, it uses data features to predict data skew and improves load balance by controlling data distribution and placement on distributed staging servers. Experimental results show that such feature-guided optimizations achieve substantial improvements over state-of-the-art approaches for managing output data in situ.










Acknowledgements
This research was supported in part by NSF ACI-1565338 and a WSU Vancouver Research Grant.
Cite this article
Zhang, X., Zheng, F. & Nguyen, B. DeStager: feature guided in-situ data management in distributed deep memory hierarchies. Distrib Parallel Databases 37, 209–231 (2019). https://doi.org/10.1007/s10619-018-7235-3
DOI: https://doi.org/10.1007/s10619-018-7235-3