An Out-of-Core Branch and Bound Method for Solving the 0-1 Knapsack Problem on a GPU

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10393)

Abstract

In this paper, we propose an out-of-core branch and bound (B&B) method for solving the 0-1 knapsack problem on a graphics processing unit (GPU). Given a large problem that produces many subproblems, the proposed method dynamically swaps subproblems out to CPU memory. We adopt two strategies to realize this swapping-out procedure with a minimal amount of CPU-GPU data transfer. The first is a GPU-based stream compaction strategy that reduces the sparseness of arrays. The second is a double buffering strategy that hides the data transfer overhead by overlapping data transfer with GPU-based B&B operations. Experimental results show that the proposed method can store 33.7 times more subproblems than the previous method and solves twice as many instances on the GPU. For the stream compaction strategy, an input-output separated scheme runs 13.1% faster than an input-output unified scheme.
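To make the stream compaction idea concrete, the following is a minimal CPU-side sketch in Python, not the authors' GPU implementation. It assumes each subproblem carries a 0/1 flag saying whether it survived bounding, and it packs the survivors into a dense array with an exclusive prefix sum followed by a scatter. The names `exclusive_scan`, `compact`, and `alive` are hypothetical and chosen only for illustration. Input and output are separate arrays here, which loosely corresponds to an input-output separated scheme.

```python
from itertools import accumulate


def exclusive_scan(flags):
    """Exclusive prefix sum: result[i] is the sum of flags[0..i-1]."""
    if not flags:
        return []
    inclusive = list(accumulate(flags))
    return [0] + inclusive[:-1]


def compact(subproblems, alive):
    """Pack surviving subproblems into a dense output array.

    alive[i] is 1 if subproblem i survived bounding and 0 if it was pruned
    (an assumption of this sketch). The output is a new array, separate
    from the input.
    """
    if not alive:
        return []
    offsets = exclusive_scan(alive)
    count = offsets[-1] + alive[-1]      # total number of survivors
    out = [None] * count
    # On a GPU, each iteration of this loop would map to one thread.
    for i, keep in enumerate(alive):
        if keep:
            out[offsets[i]] = subproblems[i]
    return out


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d", "e"]
    flags = [1, 0, 0, 1, 1]
    print(compact(nodes, flags))
```

After compaction only the dense prefix of the array needs to be copied to CPU memory, which is why removing pruned entries first reduces the volume of CPU-GPU transfer.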

Acknowledgments

This study was supported in part by the Japan Society for the Promotion of Science KAKENHI Grant Numbers 15H01687, 16H02801 and 15K12008. We are also grateful to the anonymous reviewers for their valuable comments.

Author information

Corresponding author

Correspondence to Jingcheng Shen.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Shen, J., Shigeoka, K., Ino, F., Hagihara, K. (2017). An Out-of-Core Branch and Bound Method for Solving the 0-1 Knapsack Problem on a GPU. In: Ibrahim, S., Choo, KK., Yan, Z., Pedrycz, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2017. Lecture Notes in Computer Science, vol 10393. Springer, Cham. https://doi.org/10.1007/978-3-319-65482-9_17

  • DOI: https://doi.org/10.1007/978-3-319-65482-9_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65481-2

  • Online ISBN: 978-3-319-65482-9

  • eBook Packages: Computer Science, Computer Science (R0)
