
Addressing the Value Alignment Problem Through Online Institutions

Conference paper
In: Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XVI (COINE 2023)

Abstract

As artificial intelligence systems permeate society, it becomes clear that the behaviour of these systems needs to be aligned with the values of those who are involved in, and affected by, them. The value alignment problem is widely recognised, yet it still needs to be addressed in a principled way. This paper investigates how such a principled approach to online institutions (a class of multiagent systems) can provide key insights into how the value alignment problem may be addressed in general.


Notes

  1. In OIs, as in any multiagent system, one can identify two primitive components: the active agents in the institution, and the environment that enables and governs the interactions of those agents [4, 7]. In OIs, the environment itself includes a limited ontology that is common to all the active agents: a set of entities involved in describing the facts that may at some point hold in the institution, together with the actions it enables and the events that are feasible. Because we mean to capture the governance functions of conventional institutions, the environment also provides the devices that determine whether agents may enter it, as well as the devices that govern the activity of agents (communication, display of information, enforcement of institutional constraints). A minimal sketch of these two components appears after these notes.

  2. Humans need not be involved in every OI; what is assumed, rather, is that the decision-making of participating (non-institutional) agents is “opaque”, that is, not accessible to the institution. The point of this property is to acknowledge the need to govern the behaviour of participating agents that may be heterogeneous, incompetent, malevolent, or acting on behalf of different principals.

  3. This feature may be realised in different ways; one is to think of OIs as normative multiagent systems (see [3]). In a given OI, however, the particular representation of institutional constraints and their enforcement is reflected in the institutional model (\(\varPsi \) of \(\mathcal {I}\)); see Sect. 3. One possible norm-based encoding is sketched after these notes.

  4. We can be more precise by defining it as a point in the institutional space at time t; that is, \(s_{t}\in \mathcal {S}=\times _{i=1}^n D_i\), where each \(D_i\) is a “domain”. There is an initial state \(s_0\), and the state changes only when an event occurs or an action performed by a participating agent complies with the active institutional constraints (actions and events are partial functions on \(\mathcal {S}\)). A sketch of this state space appears after these notes.

  5. In previous publications we have referred to OIs as socio-cognitive technical systems and as hybrid online social systems (see [5, 8, 11, 21]).

  6. The rationale is as follows. First, by definition, OIs are state-based and, by the Observability Stance (Construct 4), the institutional state is a finite set of observable facts. Second, from Val. 4 we assume that values can determine preferences over the state of the world, and therefore one can define a preference relation \(P_v\) on the set of institutional states for any given value v. Third, since the state of the world is finite, so is the set of institutional states, and one can choose the preferable states for a given value v and define them as goals \(G_v\) that are motivated by that value (Val. 1) and also legitimised by it (Val. 3). Fourth, note that any goal g of value \(v_i\) will also be ordered by the preference relation \(P_{v_j}\) of every other value \(v_j\) (because g is one state of the world and, by Val. 6, several values may be involved in the assessment of a state of the world); however, g need not be a goal for \(v_j\) (it may or may not be in \(G_{v_j}\)). A sketch of this derivation of goals from preferences appears after these notes.

  7. For example, the conjunction of Heuristics 2, 3 and 7 amounts to a weak form of consequentialism in which values are identified with goals, but only for one specific OI and by the consensus of the design stakeholders who agree on the consequences of values.

  8. The heuristics we propose in Sect. 5 (notably Heuristic 5) are meant to allow value alignments that reflect the individual perspectives of the different design stakeholders, the consensual perspective, or a combination of the two.

  9. In fact, one may implement institutional agents whose behaviour operationalises those three types of instrumental values; for example, institutional agents that perform discretionary norm-enforcement functions (sketched after these notes).

  10. These heuristics complement the ones in [10].
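
The following sketch illustrates Note 1. It is a minimal, assumed encoding (the names Ontology, Environment, OnlineInstitution, admit and permitted are ours, not the paper's formal notation) of the two primitive components: active agents, and a governing environment with a shared ontology plus entry and governance devices.

```python
from dataclasses import dataclass, field
from typing import Callable, Set

@dataclass
class Ontology:
    entities: Set[str]   # entities used to describe institutional facts
    actions: Set[str]    # actions the environment enables
    events: Set[str]     # events that may feasibly occur

@dataclass
class Environment:
    ontology: Ontology                      # shared by all active agents
    admit: Callable[[str], bool]            # entry device: may this agent enter?
    permitted: Callable[[str, str], bool]   # governance device: may agent a do x?

@dataclass
class OnlineInstitution:
    environment: Environment
    agents: Set[str] = field(default_factory=set)

    def enter(self, agent: str) -> bool:
        """Admit an agent only if the environment's entry device allows it."""
        if self.environment.admit(agent):
            self.agents.add(agent)
            return True
        return False
```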
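For Note 3, here is one way one might realise institutional constraints as norms in the sense of normative multiagent systems [3]. The (condition, modality, action) encoding and the name Norm are illustrative assumptions, not the paper's \(\varPsi \).

```python
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[str, ...]   # an institutional state (see the sketch for Note 4)

@dataclass
class Norm:
    condition: Callable[[State], bool]   # when the norm is active
    modality: str                        # 'obligation' | 'prohibition' | 'permission'
    action: str                          # the regulated action

def violated(norm: Norm, state: State, performed: str) -> bool:
    """One simple enforcement check: a prohibition is violated by
    performing the regulated action while the norm is active."""
    return (norm.modality == "prohibition"
            and norm.condition(state)
            and performed == norm.action)
```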
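For Note 4, a sketch of the state space under the note's definitions: a state is a point in \(\times _{i=1}^n D_i\), and actions are partial functions that change the state only when the active institutional constraints are met. The names State, Action, Constraint and step are assumptions for illustration.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[str, ...]                        # one coordinate per domain D_i
Action = Callable[[State], Optional[State]]    # partial: None where undefined
Constraint = Callable[[State, str], bool]      # does state s allow action a?

def step(s: State, act_name: str, act: Action,
         constraints: List[Constraint]) -> State:
    """Advance the institutional state only for compliant, defined actions."""
    if not all(c(s, act_name) for c in constraints):
        return s                               # non-compliant: state unchanged
    s_next = act(s)
    return s_next if s_next is not None else s

# Toy usage: two domains (auction phase, highest bid), one constraint.
s0: State = ("open", "0")
bid_10: Action = lambda s: ("open", "10") if s[0] == "open" else None
only_when_open: Constraint = lambda s, a: s[0] == "open"
s1 = step(s0, "bid", bid_10, [only_when_open])   # -> ("open", "10")
```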
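For Note 6, a sketch of the derivation of goals from preferences. Here a value's preference relation \(P_v\) is encoded, for simplicity, as a score over the finite set of institutional states, and its goals \(G_v\) are the maximal states under that score; the encoding and names are assumptions.

```python
from typing import Callable, Iterable, List, Set, Tuple

State = Tuple[str, ...]

def goals_for_value(states: Iterable[State],
                    prefer: Callable[[State], float]) -> Set[State]:
    """G_v: the most-preferred states under the preference P_v of value v."""
    pool: List[State] = list(states)
    best = max(prefer(s) for s in pool)
    return {s for s in pool if prefer(s) == best}

# Toy usage: a 'transparency' value prefers states where bids are visible.
space: List[State] = [("open", "hidden"), ("open", "visible"),
                      ("closed", "visible")]
transparency = lambda s: 1.0 if s[1] == "visible" else 0.0
G_t = goals_for_value(space, transparency)
# Every state in G_t is still ranked by any other value's preference,
# even though it need not be a goal for that value (the note's fourth point).
```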
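Finally, for Note 9, a sketch of an institutional agent performing discretionary norm enforcement: it sanctions a detected violation only when its own policy deems the sanction warranted. All names here are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Violation = Tuple[str, str]   # (offending agent, violated action)

def discretionary_enforcer(
        violations: List[Violation],
        worth_sanctioning: Callable[[Violation], bool]) -> List[Violation]:
    """Return only the violations this institutional agent chooses to sanction."""
    return [v for v in violations if worth_sanctioning(v)]

# Toy usage: sanction repeat offenders only.
history: Dict[str, int] = {"alice": 1, "bob": 3}
policy = lambda v: history.get(v[0], 0) >= 2
sanctioned = discretionary_enforcer([("alice", "bid"), ("bob", "bid")], policy)
# -> [("bob", "bid")]
```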

References

  1. Aldewereld, H., Boissier, O., Dignum, V., Noriega, P., Padget, J.: Introduction, pp. 3–9. Springer (2016). https://doi.org/10.1007/978-3-319-33570-4_1

  2. Alexander, C.: A Pattern Language: Towns, Buildings, Construction. OUP, Oxford (1977)


  3. Andrighetto, G., Governatori, G., Noriega, P., van der Torre, L.W.N. (eds.): Normative Multi-Agent Systems, vol. 4. Dagstuhl Publishing, Saarbrucken (2013)


  4. Argente, E., Boissier, O., Carrascosa, C., Fornara, N., Mcburney, P., Noriega, P., Ricci, A., Sabater-Mir, J., Schumacher, M.I., Tampitsikas, C., Taveter, K., Vizzari, G., Vouros, G.A.: The role of the environment in agreement technologies. Artif. Intell. Rev. 39, 21–38 (2013)


  5. Christiaanse, R., Ghose, A.K., Noriega, P., Singh, M.P.: Characterizing artificial socio-cognitive technical systems. In: Herzig, A., Lorini, E. (eds.) Proceedings of the European Conference on Social Intelligence (ECSI-2014), pp. 336–446. CeUR (2014). www.ceur-ws.org/Vol-1283/

  6. High-Level Expert Group on AI (AI HLEG): Ethics guidelines for trustworthy AI (2019). www.ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai

  7. Noriega, P., Padget, J., Verhagen, H., d’Inverno, M.: Towards a framework for socio-cognitive technical systems. In: Ghose, A., Oren, N., Telang, P., Thangarajah, J. (eds.) Coordination, Organizations, Institutions, and Norms in Agent Systems X, Lecture Notes in Computer Science, vol. 9372, pp. 164–181. Springer International Publishing (2015). https://doi.org/10.1007/978-3-319-25420-3_11

  8. Noriega, P., Padget, J., Verhagen, H., d’Inverno, M.: Anchoring online institutions. In: Casanovas, P., Moreso, J.J. (eds.) Anchoring Institutions: Democracy and Regulations in a Global and Semi-automated World. Springer (in press)

  9. Noriega, P., Verhagen, H., d’Inverno, M., Padget, J.A.: A Manifesto for Conscientious Design of Hybrid Online Social Systems. In: Cranefield, S., Mahmoud, S., Padget, J.A., Rocha, A.P. (eds.) COIN@AAMAS, Singapore, May 2016, COIN@ECAI, The Hague, The Netherlands, August 2016, Revised Selected Papers. LNCS, vol. 10315, pp. 60–78. Springer (2016)


  10. Noriega, P., Verhagen, H., Padget, J., d’Inverno, M.: Design Heuristics for Ethical Online Institutions. In: Ajmeri, N., Morris Martin, A., Savarimuthu, B.T.R. (eds.) Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XV, pp. 213–230. Springer International Publishing, Cham (2022)


  11. Noriega, P., Verhagen, H., Padget, J., d’Inverno, M.: Ethical online AI systems through conscientious design. IEEE Internet Comput. 25(6), 58–64 (2021)


  12. North, D.: Institutions, Institutional Change and Economic Performance. CUP, Cambridge (1991)

  13. Ostrom, E.: Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge University Press, Cambridge (1990)

  14. van de Poel, I.: Embedding values in artificial intelligence (AI) systems. Mind. Mach. 30(3), 385–409 (2020)


  15. Russell, S.: Of Myths and Moonshine. A conversation with Jaron Lanier, 14–11-14. The Edge (2014). www.edge.org/conversation/the-myth-of-ai#26015. Accessed 12 Dec 2022

  16. Schwartz, S.H.: An overview of the Schwartz theory of basic values. Psychol. Cult. (Online readings) 2(1), 11 (2012)


  17. Searle, J.R.: The Construction of Social Reality. Allen Lane, The Penguin Press (1995)

  18. Simon, H.A.: The Sciences of the Artificial, 3rd edn. MIT Press, Cambridge (1996)


  19. Simon, H.A.: Fact and value in decision-making. In: Administrative Behavior: A Study of Decision-making Processes in Administrative Organization, 4th edn. The Free Press (1997)


  20. The IEEE Global Initiative on Ethics of Autonomous and Intelligent System: Ethically aligned design: A vision for prioritizing human well-being with autonomous and intelligent systems, first edition (2019), www.standards.ieee.org/content/dam/ieee-standards/standards/web/documents/other/ead1e.pdf

  21. Verhagen, H., Noriega, P., d’Inverno, M.: Towards a design framework for controlled hybrid social games. In: Social Coordination: Principles, Artefacts and Theories, SOCIAL.PATH 2013 - AISB Convention 2013, pp. 83–87 (2013)



Acknowledgements

Research for this paper is supported by the EU Project VALAWAI 101070930 (funded by HORIZON-EIC-2021-PATHFINDERCHALLENGES-01), project VAE (grant TED2021-131295B-C31 funded by MCIN/AEI/10.13039/501100011033 and by the European Union’s NextGenerationEU/PRTR), and CSIC’s project DESAFIA2030 (BILTC22005, funded by the Bilateral Collaboration Initiative i-LINK-TEC).

Author information

Correspondence to Pablo Noriega.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Noriega, P., Verhagen, H., Padget, J., d’Inverno, M. (2023). Addressing the Value Alignment Problem Through Online Institutions. In: Fornara, N., Cheriyan, J., Mertzani, A. (eds.) Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XVI. COINE 2023. Lecture Notes in Computer Science, vol. 14002. Springer, Cham. https://doi.org/10.1007/978-3-031-49133-7_5


  • DOI: https://doi.org/10.1007/978-3-031-49133-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-49132-0

  • Online ISBN: 978-3-031-49133-7

  • eBook Packages: Computer Science (R0)
