
The AGI Containment Problem

  • Conference paper
Artificial General Intelligence (AGI 2016)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 9782))


Abstract

There is considerable uncertainty about what properties, capabilities and motivations future AGIs will have. In some plausible scenarios, AGIs may pose security risks arising from accidents and defects. In order to mitigate these risks, prudent early AGI research teams will perform significant testing on their creations before use. Unfortunately, if an AGI has human-level or greater intelligence, testing itself may not be safe; some natural AGI goal systems create emergent incentives for AGIs to tamper with their test environments, make copies of themselves on the internet, or convince developers and operators to do dangerous things. In this paper, we survey the AGI containment problem – the question of how to build a container in which tests can be conducted safely and reliably, even on AGIs with unknown motivations and capabilities that could be dangerous. We identify requirements for AGI containers, available mechanisms, and weaknesses that need to be addressed.
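The abstract's idea of a container for running tests on an untrusted agent can be gestured at with a minimal sketch. This is purely illustrative and is not the authors' system: it is a toy Python harness (the name `run_contained` is invented here) that runs untrusted code in a subprocess under CPU and memory limits on POSIX systems, which is only one thin layer of the defense-in-depth the paper calls for, and it does not restrict network or filesystem access.

```python
import subprocess
import resource
import sys

def run_contained(code: str, timeout_s: int = 5) -> str:
    """Run untrusted Python code in a subprocess with CPU-time and
    address-space limits (POSIX only). A toy gesture at containment,
    not a real container: network and file access are unrestricted."""
    def apply_limits():
        # Hard-cap CPU seconds and virtual memory before exec'ing
        # the untrusted code.
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))
        resource.setrlimit(resource.RLIMIT_AS, (256 * 2**20, 256 * 2**20))

    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,        # wall-clock backstop on top of RLIMIT_CPU
        preexec_fn=apply_limits,  # runs in the child just before exec
    )
    return proc.stdout

print(run_contained("print(2 + 2)"))
```

A real container in the paper's sense would add many more layers (no network, mediated I/O, monitored side channels); this sketch only shows the general shape of testing behind resource limits.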



Acknowledgements

The authors are grateful to Jaan Tallinn and Effective Altruism Ventures for funding this project, and to Victoria Krakovna and Evan Hefner for their feedback.

Author information


Correspondence to James Babcock.



Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Babcock, J., Kramár, J., Yampolskiy, R. (2016). The AGI Containment Problem. In: Steunebrink, B., Wang, P., Goertzel, B. (eds) Artificial General Intelligence. AGI 2016. Lecture Notes in Computer Science, vol 9782. Springer, Cham. https://doi.org/10.1007/978-3-319-41649-6_6


  • DOI: https://doi.org/10.1007/978-3-319-41649-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41648-9

  • Online ISBN: 978-3-319-41649-6
