DOI: 10.1145/3559009.3569661

GPU Adaptive In-situ Parallel Analytics (GAP)

Published: 27 January 2023

Abstract

Despite the popularity of in-situ analytics in scientific computing, there has been only limited work to date on in-situ analytics for simulations running on GPUs. Two challenges in particular remain unaddressed: 1) performing memory-efficient in-situ analysis on accelerators and 2) automatically choosing the processing resources and a suitable data representation for a given query and platform. This paper addresses both problems. First, GAP makes several new contributions toward making bitmap indices suitable, effective, and efficient as a compressed data summary structure for GPUs, including a layout structure, a method for generating multi-attribute bitmaps, and novel techniques for bitmap-based processing of the major operators that comprise complex data analytics. Second, this paper presents a performance modeling methodology that aims to predict the placement (i.e., CPU or GPU) and the data representation choice (summarization or original) that yield the best performance on a given configuration. Our extensive evaluation of complex in-situ queries and real-world simulations shows that, with our methods, analytics on GPU using bitmaps almost always outperforms other options, and the GAP performance model predicts the optimal placement and data representation for most scenarios.
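
To make the idea of a bitmap index as a data summary concrete, the following is a minimal CPU-side sketch of a generic binned bitmap index. It is not GAP's GPU layout or its multi-attribute generation method, which the abstract does not describe in detail. The function names (`build_bitmap_index`, `range_mask`, `popcount`), the attribute names, and the bin edges are all hypothetical and chosen only for illustration. Each bin holds one bit per data element, so a range predicate becomes a bitwise OR over bins and a conjunction across attributes becomes a bitwise AND.

```python
def build_bitmap_index(values, bin_edges):
    """Build one bitmask per bin; bit i of a bin is set when values[i] falls in it.

    Bins are half-open [lo, hi), except the last bin, which also includes its
    upper edge. Values outside every bin are left unindexed in this sketch.
    """
    nbins = len(bin_edges) - 1
    bitmaps = [0] * nbins
    for i, v in enumerate(values):
        for b in range(nbins):
            lo, hi = bin_edges[b], bin_edges[b + 1]
            is_last = b == nbins - 1
            if lo <= v < hi or (is_last and v == hi):
                bitmaps[b] |= 1 << i
                break
    return bitmaps


def range_mask(bitmaps, first_bin, last_bin):
    """OR together the bitmaps of bins first_bin..last_bin (inclusive)."""
    mask = 0
    for b in range(first_bin, last_bin + 1):
        mask |= bitmaps[b]
    return mask


def popcount(mask):
    """Count set bits, i.e. the number of elements selected by the mask."""
    return bin(mask).count("1")


# Hypothetical simulation output: two attributes over the same six elements.
temperature = [0.5, 1.5, 2.5, 3.5, 1.2, 0.1]
pressure = [10.0, 30.0, 20.0, 35.0, 25.0, 5.0]

t_idx = build_bitmap_index(temperature, [0.0, 1.0, 2.0, 3.0, 4.0])
p_idx = build_bitmap_index(pressure, [0.0, 20.0, 40.0])

# Query: temperature in [1, 3) AND pressure in [20, 40],
# answered from the bitmaps alone without rescanning the raw arrays.
hits = range_mask(t_idx, 1, 2) & range_mask(p_idx, 1, 1)
matching = [i for i in range(len(temperature)) if (hits >> i) & 1]
print(popcount(hits), matching)
```

Because the query touches only the per-bin bitmasks, the raw arrays could in principle be discarded once the index is built, which is the sense in which a bitmap index can act as a compressed summary. Answers are exact only when query bounds align with bin edges, as they do here.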

Published In

PACT '22: Proceedings of the International Conference on Parallel Architectures and Compilation Techniques
October 2022
569 pages
ISBN: 9781450398688
DOI: 10.1145/3559009

Sponsors

In-Cooperation

  • IFIP WG 10.3
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

PACT '22
Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%
