Abstract
The popularity of some recent applications of AI has given rise to concerns in society about the risks of AI. One response to these concerns has been the orientation of scientific and technological efforts towards a responsible development of AI. This stance has been articulated from different perspectives. One is to focus on the risks associated with the autonomy of artificial entities, and one way of making this focus operational is the “value-alignment problem” (VAP); namely, to study how this autonomy may become provably aligned with certain moral values. With this purpose in mind, we advocate the characterisation of a problem archetype to study how values may be imbued in autonomous artificially intelligent entities. The motivation is twofold: on the one hand, to decompose a complex problem in order to study simpler elements and, on the other, the successful precedents of this artifice in analogous contexts (e.g. chess for cognitive AI, RoboCup for intelligent robotics). We propose to use agent-based modelling of policy-making for this purpose because policy-making (i) constitutes a problem domain that is rich, accessible and evocative, (ii) one may claim that it is an essentially value-driven process and, (iii) it allows for a crisp differentiation of two complementary views of VAP: imbuing values in agents and imbuing values in the social system in order to foster value-aligned behaviour of the agents that act within the system. In this paper we elaborate this argument, propose a characterisation of the archetype and identify research lines that may be systematically studied with this archetype.
Notes
- 1.
The model, its implementation and the resulting simulated world constitute a socio-cognitive technical system (see Appendix). The metamodel of Working Definition 7 characterises the sub-class of value-driven policy-making simulators.
References
Alarcon, B., Aguado, A., Manga, R., Josa, A.: A value function for assessing sustainability: application to industrial buildings. Sustainability 3(1), 35–50 (2011)
Aldewereld, H., Boissier, O., Dignum, V., Noriega, P., Padget, J.: Social Coordination Frameworks for Social Technical Systems. Law, Governance and Technology Series, vol. 30. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-33570-4
Allen, C., Varner, G., Zinser, J.: Prolegomena to any future artificial moral agent. J. Exp. Theor. Artif. Intell. 12, 251–261 (2000)
Andrighetto, G., Governatori, G., Noriega, P., van der Torre, L.W.N. (eds.): Normative Multi-Agent Systems, vol. 4. Dagstuhl Publishing, Saarbrücken (2013)
Aodha, L., Edmonds, B.: Some pitfalls to beware when applying models to issues of policy relevance. In: Edmonds, B., Meyer, R. (eds.) Simulating Social Complexity. UCS, pp. 801–822. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66948-9_29
Awad, E., et al.: The moral machine experiment. Nature 563, 59 (2018)
Banks, J.: Handbook of Simulation. Wiley, Hoboken (1998)
Botterill, L.C., Fenna, A.: Interrogating Public Policy Theory. Edward Elgar Publishing, Cheltenham (2019)
Cairney, P.: The Politics of Evidence-Based Policy Making. Palgrave Macmillan, Basingstoke (2016)
Campbell, J.L.: Institutional analysis and the role of ideas in political economy. Theory Soc. 27(3), 377–409 (1998)
Collingridge, D.: The Social Control of Technology. Palgrave Macmillan, Basingstoke (1981)
Asilomar Conference: Asilomar AI principles (2017). https://futureoflife.org/ai-principles/. Accessed 13 2019
Dente, B.: Understanding Policy Decisions. SpringerBriefs in Applied Sciences and Technology. Springer, Cham (2013)
Edmonds, B.: Different modelling purposes. In: Edmonds, B., Meyer, R. (eds.) Simulating Social Complexity. UCS, pp. 39–58. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66948-9_4
European Commission: Better Regulation Toolbox. https://ec.europa.eu/info/better-regulation-toolbox_en. Accessed 20 Mar 2019
Floridi, L. (ed.): The Onlife Manifesto, pp. 7–13. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-04093-6_2
Floridi, L., et al.: AI4People - an ethical framework for a good AI society: opportunities, risks, principles, and recommendations. Minds Mach. 28(4), 689–707 (2018). https://doi.org/10.1007/s11023-018-9482-5
Gilbert, G.N., Conte, R.: Artificial Societies: The Computer Simulation of Social Life. UCL Press, London (1995)
Gilbert, N., Ahrweiler, P., Barbrook-Johnson, P., Narasimhan, K.P., Wilkinson, H.: Computational modelling of public policy: reflections on practice. J. Artif. Soc. Soc. Simul. 21(1), 14 (2018)
Hitlin, S., Pinkston, K.: Values, attitudes, and ideologies: explicit and implicit constructs shaping perception and action. In: DeLamater, J., Ward, A. (eds.) Handbook of Social Psychology. Handbooks of Sociology and Social Research, pp. 319–339. Springer, Netherlands (2013). https://doi.org/10.1007/978-94-007-6772-0_11
Hoppe, R.: Heuristics for practitioners of policy design: rules-of-thumb for structuring unstructured problems. Public Policy Adm. 33(4), 384–408 (2018)
IEEE: Ethically aligned design, version 2 (2017). https://ethicsinaction.ieee.org/. Accessed 13 2019
Jasanoff, S., Wynne, B.: Science and decision making. In: Rayner, S., Malone, E.L. (eds.) Human Choice and Climate Change, pp. 1–87. Battelle Press, Columbus (1998)
Lakoff, G.: Don’t Think of an Elephant! Chelsea Green Publishing, Hartford (2004)
Miceli, M., Castelfranchi, C.: A cognitive approach to values. J. Theory Soc. Behav. 19(2), 169–193 (1989)
Noriega, P., Padget, J., Verhagen, H., d’Inverno, M.: Towards a framework for socio-cognitive technical systems. In: Ghose, A., Oren, N., Telang, P., Thangarajah, J. (eds.) COIN 2014. LNCS (LNAI), vol. 9372, pp. 164–181. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25420-3_11
Noriega, P., Sabater-Mir, J., Verhagen, H., Padget, J., d’Inverno, M.: Identifying affordances for modelling second-order emergent phenomena with the \(\cal{WIT}\) framework. In: Sukthankar, G., Rodriguez-Aguilar, J.A. (eds.) AAMAS 2017. LNCS (LNAI), vol. 10643, pp. 208–227. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71679-4_14
Parks, L., Guay, R.P.: Personality, values, and motivation. Pers. Individ. Differ. 47(7), 675–684 (2009)
Perello-Moragues, A., Noriega, P.: Using agent-based simulation to understand the role of values in policy-making. In: Advances in Social Simulation – Looking in the Mirror (in Press)
Perello-Moragues, A., Noriega, P., Poch, M.: Modelling contingent technology adoption in farming irrigation communities. J. Artif. Soc. Soc. Simul. (in Press)
Perello-Moragues, A., Noriega, P., Popartan, A., Poch, M.: Modelling policy shift advocacy. In: Proceedings of the Multi-Agent-Based Simulation Workshop in AAMAS 2019 (in Press)
Perry, C.: ABCDE+F: a framework for thinking about water resources management. Water Int. 38(1), 95–107 (2013)
Van de Poel, I.: Values in engineering design. In: Meijers, A.W.M. (ed.) Handbook of the Philosophy of Science, pp. 973–1006. Elsevier (2009)
Van de Poel, I.: Translating values into design requirements. In: Michelfelder, D.P., McCarthy, N., Goldberg, D.E. (eds.) Philosophy and Engineering: Reflections on Practice, Principles and Process. PET, vol. 15, pp. 253–266. Springer, Dordrecht (2013). https://doi.org/10.1007/978-94-007-7762-0_20
Rokeach, M.: The Nature of Human Values. Free Press, New York (1973)
Russell, S.: Provably beneficial artificial intelligence. Exponential Life, BBVA-Open Mind, The Next Step (2017)
Saaty, T.: The Analytic Hierarchy Process. McGraw-Hill, New York (1980)
Sabatier, P.A.: Theories of the Policy Process. Westview Press, Boulder (1999)
Schwartz, S.H.: Universals in the content and structure of values: theoretical advances and empirical tests in 20 countries. In: Zanna, M.P. (ed.) Advances in Experimental Social Psychology, vol. 25, pp. 1–65. Academic Press (1992)
Schwartz, S.H., Bilsky, W.: Toward a universal psychological structure of human values. J. Pers. Soc. Psychol. 53(3), 550–562 (1987)
Schwartz, S.H., Caprara, G.V., Vecchione, M.: Basic personal values, core political values, and voting: a longitudinal analysis. Polit. Psychol. 31(3), 421–452 (2010)
Simon, H.A.: Administrative Behavior: A Study of Decision-Making Processes in Administrative Organization. Macmillan, Oxford (1957)
Stewart, J.: Value conflict and policy change. In: Stewart, J. (ed.) Public Policy Values, pp. 33–46. Palgrave Macmillan, London (2009)
Susskind, J.: Future Politics: Living Together in a World Transformed by Tech. Oxford University Press, Oxford (2018)
Witesman, E., Walters, L.: Public service values: a new approach to the study of motivation in the public sphere. Public Adm. 92(2), 375–405 (2014)
Acknowledgements
This research has been supported by the CIMBVAL project (Spanish government, project # TIN2017-89758-R). The first author is supported with the industrial doctoral 2016DI043 grant of the Catalan Secretariat for Universities and Research (AGAUR), sponsored by FCC AQUALIA, IIIA-CSIC, and UAB.
A Appendix: Background
Values
Values are constructs that are grounded in universal human needs [40]. Presumably, they are socio-cognitive constructions that articulate these needs as principles or standards of preference. Indeed, values are involved in motivation and goal-setting [28], and it has been suggested that they are also involved in political cognition [8, 41], as they serve as moral intuitions for individuals [20]. In more practical terms, values play a role in the behaviour of agents [25, 28].
Schwartz and Bilsky [40] provided an exhaustive definition of values: values are (a) concepts or beliefs, (b) about desirable end states or behaviours, (c) that transcend specific situations, (d) that guide the selection or evaluation of behaviour and events, and (e) that are ordered by relative importance. They derived this cognitive notion of values from universal human needs (i.e., the needs of the individual as a biological organism, of social agents and their interaction, and of the welfare of the community).
There is no consensus on the categories of values. Rokeach [35] developed the Rokeach Value Survey (on which Schwartz and Bilsky based their early work) to determine the value priorities of individuals. He distinguished between instrumental values (i.e., related to modes of behaviour) and terminal values (i.e., desirable end-states of existence); and also between individual values (i.e., related to satisfying individual needs and self-esteem) and societal values (i.e., related to societal demands, since supra-individual entities such as society and organisations “socialise the individual for the common good to internalise shared conceptions of the desirable”).
One of the most notable works on values, the Schwartz Theory of Basic Values, defines ten basic values, each directed at a particular objective or goal [39]: Power; Achievement; Hedonism; Stimulation; Self-direction; Universalism; Benevolence; Conformity; Tradition; and Security. This theory has been used to study political domains (e.g., voting behaviour [41]), and has even been studied with a view to enhancing its usability in public administration and policy studies (see, for instance, [45]).
Cognitive Function of Values. Values are largely stable social and internal cognitive constructs that represent individuals’ moral intuitions and guide social evaluation and action [20]. Accordingly, values play a role in perceiving the relevant fragment of the world, in evaluating its state, and in motivating a response. It has been suggested that values are essential for the socio-political cognition of individuals (regarding social outcomes and public affairs) [8].
Generally speaking, decision-making within a particular context poses ethical dilemmas that present trade-offs between multiple values, revealing desirable but opposing outcomes. Notably, any decision is value-laden because it reflects the hierarchy of values of the decision-maker.
When multiple values are involved in a decision situation, we say that values are made commensurable by means of value aggregation models. These decision-making components allow individuals to consider multiple values and resolve value trade-offs, eventually making a decision. With this in mind, there are at least two relevant components in value aggregation models: (i) the value system, which defines the types of values considered (e.g., Schwartz, Rokeach, etc.); and (ii) the aggregation model (e.g., satisficing combinations, aggregation functions, etc.).
Usually, such models are implemented as aggregation functions that reflect a multi-criteria decision analysis (MCDA) [33]. Sophisticated mathematical protocols have been developed to generate value functions (see [1]) and value hierarchies (see [37]).
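A minimal sketch of such an aggregation model may help fix ideas. The example below is our own illustration, not a construct from the paper: the value system is a set of named values, the aggregation function is a normalised weighted sum (a simple MCDA form), and all names, scores and weights are hypothetical.

```python
# Sketch of a value aggregation model: a value system fixes which values are
# considered, and an aggregation function makes them commensurable so that
# value trade-offs can be resolved into a single decision.

def aggregate(value_scores, weights):
    """Weighted-sum aggregation over a value system.

    value_scores: {value_name: score in [0, 1]} for one candidate action.
    weights: {value_name: relative importance}, i.e. the agent's value hierarchy.
    """
    total_weight = sum(weights.values())
    return sum(weights[v] * value_scores.get(v, 0.0) for v in weights) / total_weight

def choose(actions, weights):
    """Pick the action whose expected outcomes best satisfy the weighted values."""
    return max(actions, key=lambda a: aggregate(actions[a], weights))

# Hypothetical example: two Schwartz-style values in conflict.
actions = {
    "subsidise": {"universalism": 0.9, "security": 0.4},
    "regulate":  {"universalism": 0.5, "security": 0.8},
}
print(choose(actions, {"universalism": 2.0, "security": 1.0}))  # → subsidise
```

An agent that weighted security above universalism would pick "regulate" instead, which is exactly the sense in which the same situation yields different value-laden decisions.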
Working Assumptions on Values. We adopt the following assumptions about values to ground a working framework:
- Cognitive understanding of values. Values are constructs that serve as cognitive heuristics and moral intuitions of individuals, and therefore guide perception, evaluation and decision-making in any context [20, 40].
- Commensurability of values. Although one can argue that values are incommensurable and cannot be measured on a common scale [33], we rely on the fact that individuals do act and make decisions, which requires resolving ethical dilemmas and value trade-offs (for which we presume bounded rationality is crucial). Thus, values are, at least, cognitively commensurable.
- A consequentialist view of values. The focus of value-driven decisions is placed on their consequences, rather than on their nature and definition. In other words, the discussion is not about what a particular value is but, given a definition of that value, whether actions promote it or not.
Values in Norms and in Actions. Values are related to norms. Values serve as guiding and evaluative principles that capture what is right and wrong, while norms are rules that prescribe behaviours and particular courses of action. Accordingly, norms are an “implementation” of what values express (either as a personal norm in the cognition of the individual or as an institutional norm in the social space).
According to our working framework, an action A may promote a value \(\alpha \) and demote a value \(\beta \), depending on its consequences and on how values \(\alpha \) and \(\beta \) are understood. Alternatively, an action A is aligned with a value \(\alpha \) if its outcome improves the state of the world with respect to how that value is understood. Following this approach, a norm N is aligned with a value \(\alpha \) when it prescribes actions that promote \(\alpha \) and prohibits actions that demote \(\alpha \).
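The promote/demote relation above can be made concrete with a small sketch of our own (the domain, names and numbers are hypothetical, not from the paper): each value is understood through a measure over world states, and an action promotes a value exactly when its outcome improves that measure.

```python
# Sketch of the promote/demote relation: an action promotes a value when its
# outcome improves the state of the world with respect to how that value is
# understood (here, a numeric measure over states).

def promotes(action, state, value_measure):
    """True iff applying `action` improves the state w.r.t. the value measure."""
    return value_measure(action(state)) > value_measure(state)

def demotes(action, state, value_measure):
    """True iff applying `action` worsens the state w.r.t. the value measure."""
    return value_measure(action(state)) < value_measure(state)

# Hypothetical domain: a value "equality" measured as the negative income spread.
def equality(state):
    incomes = state["incomes"]
    return -(max(incomes) - min(incomes))

def rebate_to_poorest(state):
    # Hypothetical action: transfer a fixed amount to the poorest agent.
    incomes = sorted(state["incomes"])
    incomes[0] += 10
    return {"incomes": incomes}

state = {"incomes": [20, 50, 80]}
print(promotes(rebate_to_poorest, state, equality))  # → True
```

On this reading, checking whether a norm N is aligned with a value amounts to checking that every action it prescribes satisfies `promotes` and every action it prohibits satisfies `demotes` for that value's measure.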
Policy-Making
Public policies are plans of action that address what has been defined as a collective problem [13] in order to produce a desirable society-level outcome. Values play a role in policy decisions, as they are involved in defining public issues, desirable states of the world, and the courses of action worth considering [8, 43].
The policy-making cycle is often idealised as a linear process that includes agenda-setting, design, implementation, application, evaluation and revision (i.e., maintaining, redesigning, or terminating the policy). In practice, policy-making is far more complex and uncertain than a linear process [8, 9, 38]. Notably, policy decisions are usually made without enough information (they rest not only on scientific evidence, but also on habits and intuitions [9]), in a space where multiple stakeholders are involved (stakeholders who hold competing values and interests, and mobilise diverse resources [13]), and they may still have substantial impact (with consequences that are often not fully foreseen [42]).
Policy Domains. A policy domain is an abstraction of reality that serves to draw the boundaries of the relevant fragment of the world to be considered when addressing public issues. In simple terms, it consists of going from a messy problematic situation to a structured, well-defined problem, which makes it possible to conceive policies to tackle it [21].
Paradigms are taken-for-granted descriptions and theoretical analyses that constrain the range of alternative policy options [23]. Paraphrasing Campbell [10], paradigms act as “cognitive background assumptions that constrain action by limiting the range of alternatives that policy-making elites are likely to perceive as useful and worth considering” when addressing public issues. These paradigms are supported by language and discourse, contributing to form “mental structures that shape the way we see the world” [24].
Policy Ends and Indicators, and Policy Means and Instruments. In simple terms, public policies are a set of values (i.e., what is valued by the society at large), ends (i.e., what state of the world reflects them), and means (i.e., how that state is going to be achieved).
Following this view, ends must be described clearly as objectives that the intervention is meant to achieve. The assessment of the degree of success may rely on indexes and indicators (either quantitative or qualitative) that stand for those end states and are computed from variables of the relevant world.
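As a small illustration of an indicator computed from variables of the relevant world, consider the following sketch. It is our own hypothetical example (the policy end, the 3% threshold and the data are assumptions, not taken from the paper):

```python
# Sketch of a policy end operationalised as an indicator: the end "affordable
# water" is measured as the share of households spending less than an assumed
# 3% of their income on water, computed from observable world variables.

def affordability_index(world):
    """Fraction of households below the (assumed) 3% affordability threshold."""
    households = world["households"]
    ok = sum(1 for h in households if h["water_bill"] / h["income"] < 0.03)
    return ok / len(households)

world = {"households": [
    {"income": 1000, "water_bill": 25},
    {"income": 800,  "water_bill": 30},
    {"income": 1500, "water_bill": 40},
]}
print(affordability_index(world))  # 2 of 3 households below the threshold
```

A policy end would then be stated as a target on this indicator (e.g., raising it above some agreed level), which is what makes the degree of success of an intervention assessable.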
In the same vein, means aim to produce a change in the relevant world (typically, a behavioural change in target groups) so as to drive the system towards a desirable state of the world. They may be implemented with diverse instruments (e.g., financial, economic, regulatory, etc.).
Policy Assessment Practices. It is common to assess policies prior to their enactment (ex ante assessment). For instance, the European Commission refers to this process as Impact Assessment (IA), and considers it necessary when the expected economic, environmental or social impacts of interventions are likely to be significant (see [15]). The main steps of the process consist of analysing (i) the definition of the problem and the boundaries and scales of the system; (ii) the policy ends and how they are going to be measured; (iii) the policy means and how they are going to be implemented; and finally (iv) the policy evaluation on which to base enactment, redesign, or termination decisions.
We distinguish between effective policies and good policies [32]. The former are those policies whose social outcome is consistent with the policy’s declared objectives. In contrast, the latter are those policies whose social outcome is “good” according to the values held by stakeholders.
Agent-Based Simulation (ABS) for Policy-Making
Simulation is the imitation of a real-world process or system over time; it can contribute to policy assessment without disturbing the real social system or committing resources [7], as well as by identifying counter-intuitive situations. ABS uses computational models that are able to explicitly simulate the actions and social interactions of groups of individuals within an artificial environment, thus generating “artificial societies” [18].
With this in mind, agent-based simulation (ABS) has been acknowledged as a useful tool to support policy-making and ex ante policy assessment [19]. ABS contributes to reliably anticipating data that is not currently known [14], and can be combined with other ICTs that enhance its potential (e.g., data analysis and statistics, output visualisation, etc.). Although ABS is promising, several concerns have been raised, as it can backfire if used without proper precaution [5].
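A toy ABS in the spirit described above can be written in a few lines. The sketch below is a hypothetical illustration of ours (the agents, the "frugality" value, the adoption rule and the subsidy instrument are all assumptions): heterogeneous value-driven agents decide whether to adopt a policy-promoted behaviour, and the simulated outcome lets us compare two policy designs ex ante without touching the real system.

```python
import random

# Minimal agent-based policy simulation: agents whose value weights differ
# decide whether to adopt a promoted behaviour, so aggregate adoption emerges
# from individual value-driven decisions under a given policy instrument.

class Agent:
    def __init__(self, frugality_weight):
        self.frugality = frugality_weight  # importance the agent gives to frugality
        self.adopted = False

    def step(self, subsidy):
        # Adopt when the valued benefit plus the policy instrument outweighs
        # an (assumed) unit adoption cost.
        if not self.adopted and self.frugality + subsidy > 1.0:
            self.adopted = True

def simulate(n_agents=100, subsidy=0.3, steps=10, seed=42):
    """Return the fraction of agents that adopt under the given subsidy."""
    rng = random.Random(seed)  # fixed seed for a reproducible artificial society
    agents = [Agent(rng.random()) for _ in range(n_agents)]
    for _ in range(steps):
        for a in agents:
            a.step(subsidy)
    return sum(a.adopted for a in agents) / n_agents

# Ex ante comparison of two policy designs on the same artificial society:
print(simulate(subsidy=0.1), simulate(subsidy=0.5))
```

Even this toy shows the typical use: the same population is run under alternative instruments and the resulting outcomes are compared before any real enactment.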
Socio-cognitive Technical Systems
Socio-cognitive technical systems (SCTS) are social coordination systems [2] that articulate on-line interactions of autonomous agents that are socially rational [26]. They are composed of two first-class entities: a social space where all interactions take place, and the agents that interact within that space. One presumes that the social space has a fixed ontology (the domain ontology) and that at any given time it is in a state, which is an instance of the Cartesian product of a finite number of domains whose union is a subset of the domain ontology. The state of the system changes only as a result of an action that complies with the system regulations at the moment it is attempted, or because an event that is compatible with those regulations takes place.
SCTS can be decomposed into three “views”: \(\mathcal {W}\), the fragment of the world that is relevant for the system; \(\mathcal {I}\), an institutional representation of the conventions that define the system; and \(\mathcal {T}\), the implementation of \(\mathcal {I}\) that creates the on-line version of \(\mathcal {W}\). The views are interrelated in such a way that an attempted action modifies the state of the system if and only if that action is admitted by the system interface, which in turn should happen if and only if the attempted action complies with the conventions established in \(\mathcal {I}\) (and those conventions are properly implemented in \(\mathcal {T}\)). An admitted action changes the state of the world according to the conventions in \(\mathcal {I}\) that specify the way the input is processed in \(\mathcal {T}\). In the case of value-driven policy simulators, these three views correspond to the simulated world, the (abstract) model of the world and the implementation of the model.
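The state-transition discipline of the three views can be sketched in code. This is our own illustration, not the authors' implementation (the class, the "vote" action and the polls convention are hypothetical): an attempted action changes the state of \(\mathcal {W}\) only if the interface (\(\mathcal {T}\)) admits it, which requires compliance with the conventions in \(\mathcal {I}\).

```python
# Sketch of an SCTS state transition: attempted actions are gated by the
# institutional conventions (I); only admitted actions change the state of
# the relevant world (W), via the implementation (T).

class SCTS:
    def __init__(self, initial_state, conventions):
        self.state = initial_state      # current state of W
        self.conventions = conventions  # I: maps action names to admissibility checks

    def attempt(self, action_name, effect):
        """T: admit the action iff it complies with I, then apply its effect."""
        check = self.conventions.get(action_name)
        if check is None or not check(self.state):
            return False                # not admitted; the state is unchanged
        self.state = effect(self.state)
        return True

# Hypothetical convention: a "vote" action is admissible only while polls are open.
scts = SCTS({"polls_open": True, "votes": 0},
            {"vote": lambda s: s["polls_open"]})
scts.attempt("vote", lambda s: {**s, "votes": s["votes"] + 1})   # admitted
scts.attempt("close", lambda s: {**s, "polls_open": False})      # no convention: rejected
print(scts.state["votes"])  # → 1
```

The key design point the sketch captures is that the state can never change behind the institution's back: every change goes through `attempt`, mirroring the "if and only if" coupling of the three views.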
In practice, the institutional specification (\(\mathcal {I}\)) is achieved by instantiating a metamodel that includes ad-hoc languages and data structures to represent key distinctive features (affordances) of a family of SCTS (e.g., crowd-based systems, electronic institutions [2], normative multiagent systems [4], second-order simulation [27]).
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Perello-Moragues, A., Noriega, P. (2019). A Playground for the Value Alignment Problem. In: Martínez-Villaseñor, L., Batyrshin, I., Marín-Hernández, A. (eds) Advances in Soft Computing. MICAI 2019. Lecture Notes in Computer Science, vol. 11835. Springer, Cham. https://doi.org/10.1007/978-3-030-33749-0_33
Print ISBN: 978-3-030-33748-3
Online ISBN: 978-3-030-33749-0