Partial Sums on the Ultra-Wide Word RAM

  • Conference paper
Theory and Applications of Models of Computation (TAMC 2020)

Abstract

We consider the classic partial sums problem on the ultra-wide word RAM model of computation. This model extends the classic \(w\)-bit word RAM model with special ultrawords of length \(w^2\) bits that support standard arithmetic and Boolean operations, together with scattered memory access operations that can access \(w\) (non-contiguous) locations in memory. The ultra-wide word RAM model captures (and idealizes) modern vector processor architectures.

Our main result is a new in-place data structure for the partial sums problem that stores only a constant number of ultrawords in addition to the input and supports operations in doubly logarithmic time. This matches the best known time bounds for the problem (among polynomial space data structures) while improving the space from superlinear to a constant number of ultrawords. Our results are based on a simple and elegant in-place word RAM data structure known as the Fenwick tree. Our main technical contribution is a new efficient parallel ultra-wide word RAM implementation of the Fenwick tree, which is likely of independent interest.
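
For orientation, the following is a minimal sketch of the classic sequential Fenwick tree that the abstract names as its starting point. It is the standard word RAM version with logarithmic-time operations, not the parallel ultra-wide word RAM implementation that the paper contributes. The function names (`build_in_place`, `prefix_sum`, `update`) and the 1-indexed layout with an unused slot 0 are assumptions made for illustration only.

```python
from typing import List


def build_in_place(a: List[int]) -> None:
    """Convert a 1-indexed array (a[0] unused) into a Fenwick tree in place.

    Each position pushes its value to the next position that covers it,
    so the conversion uses O(n) additions and no extra array.
    """
    n = len(a) - 1
    for i in range(1, n + 1):
        parent = i + (i & -i)  # next index whose range covers index i
        if parent <= n:
            a[parent] += a[i]


def prefix_sum(tree: List[int], i: int) -> int:
    """Return the sum of the original values at positions 1..i."""
    total = 0
    while i > 0:
        total += tree[i]
        i -= i & -i  # drop the lowest set bit to jump to the preceding range
    return total


def update(tree: List[int], i: int, delta: int) -> None:
    """Add delta to the original value at position i."""
    n = len(tree) - 1
    while i <= n:
        tree[i] += delta
        i += i & -i  # move to the next range that contains position i


if __name__ == "__main__":
    values = [0, 3, 1, 4, 1, 5, 9, 2, 6]  # slot 0 is unused padding
    build_in_place(values)
    print(prefix_sum(values, 4))
    update(values, 3, 10)
    print(prefix_sum(values, 8))
```

Both `prefix_sum` and `update` visit one array entry per step and take at most about \(\log_2 n\) steps, and the tree overwrites the input array rather than being stored alongside it, which is the sense in which the structure is in-place.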

Author information

Corresponding author

Correspondence to Philip Bille.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Bille, P., Gørtz, I.L., Skjoldjensen, F.R. (2020). Partial Sums on the Ultra-Wide Word RAM. In: Chen, J., Feng, Q., Xu, J. (eds) Theory and Applications of Models of Computation. TAMC 2020. Lecture Notes in Computer Science, vol 12337. Springer, Cham. https://doi.org/10.1007/978-3-030-59267-7_2

  • DOI: https://doi.org/10.1007/978-3-030-59267-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59266-0

  • Online ISBN: 978-3-030-59267-7

  • eBook Packages: Computer Science, Computer Science (R0)
