Abstract
We consider the classic partial sums problem on the ultra-wide word RAM model of computation. This model extends the classic \(w\)-bit word RAM model with special ultrawords of length \(w^2\) bits that support standard arithmetic and Boolean operations, as well as scattered memory access operations that can access \(w\) (non-contiguous) locations in memory. The ultra-wide word RAM model captures (and idealizes) modern vector processor architectures.
Our main result is a new in-place data structure for the partial sums problem that stores only a constant number of ultrawords in addition to the input and supports operations in doubly logarithmic time. This matches the best known time bounds for the problem (among polynomial-space data structures) while improving the space from superlinear to a constant number of ultrawords. Our results are based on a simple and elegant in-place word RAM data structure, known as the Fenwick tree. Our main technical contribution is a new efficient parallel ultra-wide word RAM implementation of the Fenwick tree, which is likely of independent interest.
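For readers unfamiliar with the Fenwick tree, the following is a minimal sketch of the textbook sequential (word RAM) version, not the parallel ultra-wide implementation contributed by the paper. It assumes the usual formulation of the problem: maintain an array under prefix-sum queries and point updates. The function names `build_in_place`, `prefix_sum`, and `update` are illustrative choices, and positions are 1-based as is conventional for this structure.

```python
def build_in_place(a):
    """Overwrite the list `a` with its Fenwick tree representation.

    Position i (1-based) ends up holding the sum of the last (i & -i)
    original values ending at position i. No auxiliary array is used.
    """
    n = len(a)
    for i in range(1, n + 1):
        parent = i + (i & -i)  # next position whose range covers position i
        if parent <= n:
            a[parent - 1] += a[i - 1]


def prefix_sum(tree, i):
    """Return the sum of the first i original values (0 <= i <= len(tree))."""
    total = 0
    while i > 0:
        total += tree[i - 1]
        i -= i & -i  # strip the lowest set bit to jump to the preceding range
    return total


def update(tree, i, delta):
    """Add delta to the original value at position i (1 <= i <= len(tree))."""
    n = len(tree)
    while i <= n:
        tree[i - 1] += delta
        i += i & -i  # move to the next range that contains position i


values = [3, 1, 4, 1, 5, 9, 2, 6]
build_in_place(values)       # values now stores the Fenwick tree
update(values, 3, 10)        # the third original value becomes 14
print(prefix_sum(values, 5))
```

Both operations in this sequential sketch follow a path of length at most logarithmic in the array size, one step per bit of the index. The abstract's doubly logarithmic bound refers to the paper's ultra-wide word RAM implementation, which this sketch does not attempt to reproduce.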
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Bille, P., Gørtz, I.L., Skjoldjensen, F.R. (2020). Partial Sums on the Ultra-Wide Word RAM. In: Chen, J., Feng, Q., Xu, J. (eds) Theory and Applications of Models of Computation. TAMC 2020. Lecture Notes in Computer Science, vol 12337. Springer, Cham. https://doi.org/10.1007/978-3-030-59267-7_2
DOI: https://doi.org/10.1007/978-3-030-59267-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59266-0
Online ISBN: 978-3-030-59267-7
eBook Packages: Computer Science (R0)