Abstract
There is considerable uncertainty about what properties, capabilities and motivations future AGIs will have. In some plausible scenarios, AGIs may pose security risks arising from accidents and defects. In order to mitigate these risks, prudent early AGI research teams will perform significant testing on their creations before use. Unfortunately, if an AGI has human-level or greater intelligence, testing itself may not be safe; some natural AGI goal systems create emergent incentives for AGIs to tamper with their test environments, make copies of themselves on the internet, or convince developers and operators to do dangerous things. In this paper, we survey the AGI containment problem – the question of how to build a container in which tests can be conducted safely and reliably, even on AGIs with unknown motivations and capabilities that could be dangerous. We identify requirements for AGI containers, available mechanisms, and weaknesses that need to be addressed.
Acknowledgements
The authors are grateful to Jaan Tallinn and Effective Altruism Ventures for providing funding for this project, and to Victoria Krakovna and Evan Hefner for their feedback.
Copyright information
© 2016 Springer International Publishing Switzerland
Citation
Babcock, J., Kramár, J., Yampolskiy, R. (2016). The AGI Containment Problem. In: Steunebrink, B., Wang, P., Goertzel, B. (eds) Artificial General Intelligence. AGI 2016. Lecture Notes in Computer Science(), vol 9782. Springer, Cham. https://doi.org/10.1007/978-3-319-41649-6_6
Print ISBN: 978-3-319-41648-9
Online ISBN: 978-3-319-41649-6