Research article
DOI: 10.1145/1989493.1989551

Understanding Bloom filter intersection for lazy address-set disambiguation

Published: 04 June 2011

Abstract

A Bloom filter is a probabilistic, bit-array-based set representation that has recently been applied to address-set disambiguation in systems that ease the burden of parallel programming. Many of these systems, however, intersect the Bloom filter bit-arrays to approximate address-set intersection and decide set disjointness, in contrast with the conventional and well-studied approach of making individual membership queries into the Bloom filter. In this paper we present much-needed probabilistic models for this unconventional use of Bloom filters to test set disjointness. Using these models, we demonstrate that intersecting Bloom filters requires substantially larger bit-arrays than querying into the bit-array to provide the same probability of false set-overlap. For cases where intersection is unavoidable, we prove that partitioned Bloom filters require less space than unpartitioned ones. Finally, we show that, surprisingly, for Bloom filters with a single hash function, intersection and querying share the same probability of false set-overlap.
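
The two disjointness tests contrasted in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the class name `BloomFilter`, the helpers `build`, `overlap_by_intersection`, and `overlap_by_query`, and the SHA-256-based hashing are all assumptions made for illustration. The sketch also assumes an unpartitioned filter and treats any set bit in the bitwise AND as possible overlap.

```python
import hashlib


class BloomFilter:
    """Unpartitioned Bloom filter: k hash functions share one m-bit array."""

    def __init__(self, m, k):
        self.m = m      # number of bits in the array
        self.k = k      # number of hash functions
        self.bits = 0   # bit-array stored as a Python integer

    def _positions(self, item):
        # Hypothetical hash family: SHA-256 salted with the function index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def query(self, item):
        # Conventional membership query: all k bits must be set.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def build(items, m, k):
    """Insert every item into a fresh filter and return it."""
    bf = BloomFilter(m, k)
    for item in items:
        bf.add(item)
    return bf


def overlap_by_intersection(a, b):
    # Assumed rule: AND the bit-arrays; any common bit means "may overlap".
    return (a.bits & b.bits) != 0


def overlap_by_query(items_a, b):
    # Query each element of one set into the other set's filter.
    return any(b.query(x) for x in items_a)
```

Neither test can miss a true overlap, since a shared address sets the same bits in both filters. With `k = 1`, the two functions return the same answer on every input, which is in line with the single-hash-function result stated in the abstract. With `k > 1`, intersection can report overlap where querying does not.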

Published In

SPAA '11: Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
June 2011
404 pages
ISBN: 9781450307437
DOI: 10.1145/1989493

In-Cooperation

  • EATCS: European Association for Theoretical Computer Science

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. address-set disambiguation
  2. bloom filters
  3. parallelism
  4. set intersection
  5. signatures
  6. thread-level speculation
  7. transactional memory

Conference

SPAA '11

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Cited By

  • (2023) A Case for Partitioned Bloom Filters. IEEE Transactions on Computers 72:6 (1681-1691). DOI: 10.1109/TC.2022.3218995. Online publication date: 1-Jun-2023
  • (2019) Bloom Hopping. IEEE Transactions on Mobile Computing 18:3 (534-545). DOI: 10.1109/TMC.2018.2840123. Online publication date: 17-Jul-2019
  • (2018) Sampling and Reconstruction Using Bloom Filters. IEEE Transactions on Knowledge and Data Engineering 30:7 (1324-1337). DOI: 10.1109/TKDE.2017.2785803. Online publication date: 1-Jul-2018
  • (2016) False-Positive Probability and Compression Optimization for Tree-Structured Bloom Filters. ACM Transactions on Modeling and Performance Evaluation of Computing Systems 1:4 (1-39). DOI: 10.1145/2940324. Online publication date: 21-Sep-2016
  • (2013) Membership classification using Integer Bloom Filter. 2013 IEEE/ACIS 12th International Conference on Computer and Information Science (ICIS) (385-390). DOI: 10.1109/ICIS.2013.6607871. Online publication date: Jun-2013
  • (2013) A Bloom Filter Based Model for Decentralized Authorization. International Journal of Intelligent Systems 28:6 (565-582). DOI: 10.1002/int.21593. Online publication date: 26-Mar-2013
