
Addressing the Value Alignment Problem Through Online Institutions

Conference paper
In: Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XVI (COINE 2023)

Abstract

As artificial intelligence systems permeate society, it becomes clear that the behaviour of these systems needs to be aligned with the values of those who are involved in, and affected by, them. The value alignment problem is widely recognised, yet it still needs to be addressed in a principled way. This paper investigates how such a principled approach to online institutions (a class of multiagent systems) can provide key insights into how the value alignment problem may be addressed in general.


Notes

  1. In OIs, as in any multiagent system, one can identify two primitive components: the active agents in the institution, and the environment that enables and governs the interactions of those agents [4, 7]. In OIs, the environment itself includes a limited ontology that is common to all the active agents: a set of entities involved in describing the facts that may at some point hold in the institution, together with the actions it enables and the events that are feasible. Because we mean to capture the governance functions of conventional institutions, the environment also provides the devices that determine whether agents may enter it, as well as the devices that govern the activity of agents (communication, display of information, enforcement of institutional constraints). A minimal sketch of these two components appears after these notes.

  2. Humans need not be involved in every OI; what is assumed, rather, is that the decision-making of participating (non-institutional) agents is “opaque”, that is, not accessible to the institution. The point of this property is to acknowledge the need to govern the behaviour of participating agents that may be heterogeneous, incompetent, malevolent, or acting on behalf of different principals.

  3. This feature may be realised in different ways; one is to think of OIs as normative multiagent systems (see [3]). In a given OI, however, the particular representation of institutional constraints and their enforcement is reflected in the institutional model (\(\varPsi \) of \(\mathcal {I}\)); see Sect. 3. One possible norm-based encoding is sketched after these notes.

  4. We can be more precise by defining it as a point in the institutional space at time t; that is, \(s_{t}\in \mathcal {S}=\times _{i=1}^n D_i\), where each \(D_i\) is a “domain”. There is an initial state \(s_0\), and the state changes only when an event occurs or an action performed by a participating agent complies with the active institutional constraints (actions and events are partial functions on \(\mathcal {S}\)). A sketch of this state space appears after these notes.

  5. In previous publications we have referred to OIs as socio-cognitive technical systems and as hybrid online social systems (see [5, 8, 11, 21]).

  6. The rationale is as follows. First, by definition, OIs are state-based and, by the Observability Stance (Construct 4), the institutional state is a finite set of observable facts. Second, from Val. 4 we assume that values can determine preferences over the state of the world, and therefore one can define a preference relation \(P_v\) on the set of institutional states for any given value v. Third, since the state of the world is finite, so is the set of institutional states, and one can choose the preferable states for a given value v and define them as goals \(G_v\) that are motivated by that value (Val. 1) and also legitimised by it (Val. 3). Fourth, note that any goal g of value \(v_i\) will also be ordered by the preference relation \(P_{v_j}\) of every other value \(v_j\) (because g is one state of the world and, by Val. 6, several values may be involved in the assessment of a state of the world); however, g need not be a goal for \(v_j\) (it may or may not be in \(G_{v_j}\)). A sketch of this derivation of goals from preferences appears after these notes.

  7. For example, the conjunction of Heuristics 2, 3 and 7 amounts to a weak form of consequentialism in which values are identified with goals, but only for one specific OI and by the consensus of the design stakeholders who agree on the consequences of values.

  8. The heuristics we propose in Sect. 5 (notably Heuristic 5) are meant to allow value alignments that reflect the individual perspectives of the different design stakeholders, the consensual perspective, or a combination of the two.

  9. In fact, one may implement institutional agents whose behaviour operationalises those three types of instrumental values; for example, institutional agents that perform discretionary norm-enforcement functions (sketched after these notes).

  10. These heuristics complement the ones in [10].
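
The following sketch illustrates Note 1. It is a minimal, assumed encoding (the names Ontology, Environment, OnlineInstitution, admit and permitted are ours, not the paper's formal notation) of the two primitive components: active agents, and a governing environment with a shared ontology plus entry and governance devices.

```python
from dataclasses import dataclass, field
from typing import Callable, Set

@dataclass
class Ontology:
    entities: Set[str]   # entities used to describe institutional facts
    actions: Set[str]    # actions the environment enables
    events: Set[str]     # events that may feasibly occur

@dataclass
class Environment:
    ontology: Ontology                      # shared by all active agents
    admit: Callable[[str], bool]            # entry device: may this agent enter?
    permitted: Callable[[str, str], bool]   # governance device: may agent a do x?

@dataclass
class OnlineInstitution:
    environment: Environment
    agents: Set[str] = field(default_factory=set)

    def enter(self, agent: str) -> bool:
        """Admit an agent only if the environment's entry device allows it."""
        if self.environment.admit(agent):
            self.agents.add(agent)
            return True
        return False
```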
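For Note 3, here is one way one might realise institutional constraints as norms in the sense of normative multiagent systems [3]. The (condition, modality, action) encoding and the name Norm are illustrative assumptions, not the paper's \(\varPsi \).

```python
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[str, ...]   # an institutional state (see the sketch for Note 4)

@dataclass
class Norm:
    condition: Callable[[State], bool]   # when the norm is active
    modality: str                        # 'obligation' | 'prohibition' | 'permission'
    action: str                          # the regulated action

def violated(norm: Norm, state: State, performed: str) -> bool:
    """One simple enforcement check: a prohibition is violated by
    performing the regulated action while the norm is active."""
    return (norm.modality == "prohibition"
            and norm.condition(state)
            and performed == norm.action)
```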
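For Note 4, a sketch of the state space under the note's definitions: a state is a point in \(\times _{i=1}^n D_i\), and actions are partial functions that change the state only when the active institutional constraints are met. The names State, Action, Constraint and step are assumptions for illustration.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[str, ...]                        # one coordinate per domain D_i
Action = Callable[[State], Optional[State]]    # partial: None where undefined
Constraint = Callable[[State, str], bool]      # does state s allow action a?

def step(s: State, act_name: str, act: Action,
         constraints: List[Constraint]) -> State:
    """Advance the institutional state only for compliant, defined actions."""
    if not all(c(s, act_name) for c in constraints):
        return s                               # non-compliant: state unchanged
    s_next = act(s)
    return s_next if s_next is not None else s

# Toy usage: two domains (auction phase, highest bid), one constraint.
s0: State = ("open", "0")
bid_10: Action = lambda s: ("open", "10") if s[0] == "open" else None
only_when_open: Constraint = lambda s, a: s[0] == "open"
s1 = step(s0, "bid", bid_10, [only_when_open])   # -> ("open", "10")
```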
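For Note 6, a sketch of the derivation of goals from preferences. Here a value's preference relation \(P_v\) is encoded, for simplicity, as a score over the finite set of institutional states, and its goals \(G_v\) are the maximal states under that score; the encoding and names are assumptions.

```python
from typing import Callable, Iterable, List, Set, Tuple

State = Tuple[str, ...]

def goals_for_value(states: Iterable[State],
                    prefer: Callable[[State], float]) -> Set[State]:
    """G_v: the most-preferred states under the preference P_v of value v."""
    pool: List[State] = list(states)
    best = max(prefer(s) for s in pool)
    return {s for s in pool if prefer(s) == best}

# Toy usage: a 'transparency' value prefers states where bids are visible.
space: List[State] = [("open", "hidden"), ("open", "visible"),
                      ("closed", "visible")]
transparency = lambda s: 1.0 if s[1] == "visible" else 0.0
G_t = goals_for_value(space, transparency)
# Every state in G_t is still ranked by any other value's preference,
# even though it need not be a goal for that value (the note's fourth point).
```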
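Finally, for Note 9, a sketch of an institutional agent performing discretionary norm enforcement: it sanctions a detected violation only when its own policy deems the sanction warranted. All names here are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Violation = Tuple[str, str]   # (offending agent, violated action)

def discretionary_enforcer(
        violations: List[Violation],
        worth_sanctioning: Callable[[Violation], bool]) -> List[Violation]:
    """Return only the violations this institutional agent chooses to sanction."""
    return [v for v in violations if worth_sanctioning(v)]

# Toy usage: sanction repeat offenders only.
history: Dict[str, int] = {"alice": 1, "bob": 3}
policy = lambda v: history.get(v[0], 0) >= 2
sanctioned = discretionary_enforcer([("alice", "bid"), ("bob", "bid")], policy)
# -> [("bob", "bid")]
```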

References

  1. Aldewereld, H., Boissier, O., Dignum, V., Noriega, P., Padget, J.: Introduction, pp. 3–9. Springer (2016). https://doi.org/10.1007/978-3-319-33570-4_1

  2. Alexander, C.: A Pattern Language: Towns, Buildings, Construction. OUP, Oxford (1977)


  3. Andrighetto, G., Governatori, G., Noriega, P., van der Torre, L.W.N. (eds.): Normative Multi-Agent Systems, vol. 4. Dagstuhl Publishing, Saarbrucken (2013)


  4. Argente, E., Boissier, O., Carrascosa, C., Fornara, N., Mcburney, P., Noriega, P., Ricci, A., Sabater-Mir, J., Schumacher, M.I., Tampitsikas, C., Taveter, K., Vizzari, G., Vouros, G.A.: The role of the environment in agreement technologies. Artif. Intell. Rev. 39, 21–38 (2013)


  5. Christiaanse, R., Ghose, A.K., Noriega, P., Singh, M.P.: Characterizing artificial socio-cognitive technical systems. In: Herzig, A., Lorini, E. (eds.) Proceedings of the European Conference on Social Intelligence (ECSI-2014), pp. 336–446. CeUR (2014). www.ceur-ws.org/Vol-1283/

  6. High-Level Expert Group on AI (AI HLEG): Ethics guidelines for trustworthy AI (2019). www.ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai

  7. Noriega, P., Padget, J., Verhagen, H., d’Inverno, M.: Towards a framework for socio-cognitive technical systems. In: Ghose, A., Oren, N., Telang, P., Thangarajah, J. (eds.) Coordination, Organizations, Institutions, and Norms in Agent Systems X, Lecture Notes in Computer Science, vol. 9372, pp. 164–181. Springer International Publishing (2015). https://doi.org/10.1007/978-3-319-25420-3_11

  8. Noriega, P., Padget, J., Verhagen, H., d’Inverno, M.: Anchoring online institutions. In: Casanovas, P., Moreso, J.J. (eds.) Anchoring Institutions: Democracy and Regulations in a Global and Semi-automated World. Springer (in press)

  9. Noriega, P., Verhagen, H., d’Inverno, M., Padget, J.A.: A Manifesto for Conscientious Design of Hybrid Online Social Systems. In: Cranefield, S., Mahmoud, S., Padget, J.A., Rocha, A.P. (eds.) COIN@AAMAS, Singapore, May 2016, COIN@ECAI, The Hague, The Netherlands, August 2016, Revised Selected Papers. LNCS, vol. 10315, pp. 60–78. Springer (2016)


  10. Noriega, P., Verhagen, H., Padget, J., d’Inverno, M.: Design Heuristics for Ethical Online Institutions. In: Ajmeri, N., Morris Martin, A., Savarimuthu, B.T.R. (eds.) Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XV, pp. 213–230. Springer International Publishing, Cham (2022)


  11. Noriega, P., Verhagen, H., Padget, J., d’Inverno, M.: Ethical online AI systems through conscientious design. IEEE Internet Comput. 25(6), 58–64 (2021)


  12. North, D.: Institutions, Institutional Change and Economic Performance. CUP, Cambridge (1991)

  13. Ostrom, E.: Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge University Press, Cambridge (1990)

  14. van de Poel, I.: Embedding values in artificial intelligence (AI) systems. Mind. Mach. 30(3), 385–409 (2020)


  15. Russell, S.: Of Myths and Moonshine. A conversation with Jaron Lanier, 14–11-14. The Edge (2014). www.edge.org/conversation/the-myth-of-ai#26015. Accessed 12 Dec 2022

  16. Schwartz, S.H.: An overview of the Schwartz theory of basic values. Psychol. Cult. (Online readings) 2(1), 11 (2012)


  17. Searle, J.R.: The Construction of Social Reality. Allen Lane, The Penguin Press (1995)

  18. Simon, H.A.: The Sciences of the Artificial, 3rd edn. MIT Press, Cambridge (1996)


  19. Simon, H.A.: Fact and value in decision-making. In: Administrative Behavior: A Study of Decision-making Processes in Administrative Organization, 4th edn. The Free Press (1997)


  20. The IEEE Global Initiative on Ethics of Autonomous and Intelligent System: Ethically aligned design: A vision for prioritizing human well-being with autonomous and intelligent systems, first edition (2019), www.standards.ieee.org/content/dam/ieee-standards/standards/web/documents/other/ead1e.pdf

  21. Verhagen, H., Noriega, P., d’Inverno, M.: Towards a design framework for controlled hybrid social games. In: Social Coordination: Principles, Artefacts and Theories, SOCIAL.PATH 2013 - AISB Convention 2013, pp. 83–87 (2013)



Acknowledgements

Research for this paper is supported by the EU Project VALAWAI 101070930 (funded by HORIZON-EIC-2021-PATHFINDERCHALLENGES-01), project VAE (grant TED2021-131295B-C31 funded by MCIN/AEI/10.13039/501100011033 and by the European Union’s NextGenerationEU/PRTR), and CSIC’s project DESAFIA2030 (BILTC22005, funded by the Bilateral Collaboration Initiative i-LINK-TEC).

Author information

Correspondence to Pablo Noriega.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Noriega, P., Verhagen, H., Padget, J., d’Inverno, M. (2023). Addressing the Value Alignment Problem Through Online Institutions. In: Fornara, N., Cheriyan, J., Mertzani, A. (eds.) Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XVI. COINE 2023. Lecture Notes in Computer Science, vol. 14002. Springer, Cham. https://doi.org/10.1007/978-3-031-49133-7_5


  • DOI: https://doi.org/10.1007/978-3-031-49133-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-49132-0

  • Online ISBN: 978-3-031-49133-7

  • eBook Packages: Computer Science (R0)
