
The AGI Containment Problem

  • Conference paper
Artificial General Intelligence (AGI 2016)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 9782))


Abstract

There is considerable uncertainty about what properties, capabilities and motivations future AGIs will have. In some plausible scenarios, AGIs may pose security risks arising from accidents and defects. In order to mitigate these risks, prudent early AGI research teams will perform significant testing on their creations before use. Unfortunately, if an AGI has human-level or greater intelligence, testing itself may not be safe; some natural AGI goal systems create emergent incentives for AGIs to tamper with their test environments, make copies of themselves on the internet, or convince developers and operators to do dangerous things. In this paper, we survey the AGI containment problem – the question of how to build a container in which tests can be conducted safely and reliably, even on AGIs with unknown motivations and capabilities that could be dangerous. We identify requirements for AGI containers, available mechanisms, and weaknesses that need to be addressed.
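The abstract's idea of a container for running tests on an untrusted agent can be gestured at with a minimal sketch. This is purely illustrative and is not the authors' system: it is a toy Python harness (the name `run_contained` is invented here) that runs untrusted code in a subprocess under CPU and memory limits on POSIX systems, which is only one thin layer of the defense-in-depth the paper calls for, and it does not restrict network or filesystem access.

```python
import subprocess
import resource
import sys

def run_contained(code: str, timeout_s: int = 5) -> str:
    """Run untrusted Python code in a subprocess with CPU-time and
    address-space limits (POSIX only). A toy gesture at containment,
    not a real container: network and file access are unrestricted."""
    def apply_limits():
        # Hard-cap CPU seconds and virtual memory before exec'ing
        # the untrusted code.
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))
        resource.setrlimit(resource.RLIMIT_AS, (256 * 2**20, 256 * 2**20))

    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,        # wall-clock backstop on top of RLIMIT_CPU
        preexec_fn=apply_limits,  # runs in the child just before exec
    )
    return proc.stdout

print(run_contained("print(2 + 2)"))
```

A real container in the paper's sense would add many more layers (no network, mediated I/O, monitored side channels); this sketch only shows the general shape of testing behind resource limits.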



Acknowledgements

The authors are grateful to Jaan Tallinn and Effective Altruism Ventures for funding this project, and to Victoria Krakovna and Evan Hefner for their feedback.

Author information


Correspondence to James Babcock.



Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Babcock, J., Kramár, J., Yampolskiy, R. (2016). The AGI Containment Problem. In: Steunebrink, B., Wang, P., Goertzel, B. (eds) Artificial General Intelligence. AGI 2016. Lecture Notes in Computer Science, vol 9782. Springer, Cham. https://doi.org/10.1007/978-3-319-41649-6_6


  • DOI: https://doi.org/10.1007/978-3-319-41649-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41648-9

  • Online ISBN: 978-3-319-41649-6
