Regulating social exchanges in open MAS: The problem of reciprocal conversions between POMDPs and HMMs
Introduction
Systems of social relationships have often been seen as systems of social exchanges, as extensively discussed and analyzed in the literature (see, e.g., [4], [14], [21], [30]). Based on this, theories of social exchanges (e.g., the one introduced by Piaget [30]) have been frequently adopted for the modeling of agent social interactions in different contexts (see, e.g., [11], [16], [19], [29], [34]).
One such line of work is due to Rodrigues in cooperation with different partners [33], [34]. In [33], Rodrigues, Costa and Bordini introduced an initial model of social exchange-based interactions in agent societies, including a social-reasoning mechanism and structures for storing and manipulating exchange values, and presented an example of a political process of lobbying through campaign contributions. Rodrigues and Luck [34] introduced an approach based on the Theory of Social Exchanges for the modeling of interactions in open multiagent systems, presenting a system for analyzing and evaluating partner selection and cooperative interactions in the Bioinformatics domain, which is characterized by frequent, extensive and dynamic exchanges of services. Grimaldo et al. [19] presented an application of Piaget’s Theory of Social Exchanges to the coordination of intelligent virtual agents and sociability in a virtual university bar scenario, as a market-based social model, where groups of different types of waiters (e.g., coordinated, social, egalitarian) and customers (e.g., social, lazy) interact with both the objects in the scene and the other virtual agents. In [20], they introduced a multi-modal agent decision making model (MADeM), in order to provide virtual agents with socially acceptable decisions and coordinated social behaviors (e.g., task passing or planned meetings), based on the evaluation of the social exchanges. Franco et al. [16] applied social exchange values in order to support arguments about the assessment of exchanges. Together with the power-to-influence social relationship, those arguments were also used to help the agents decide whether to continue or interrupt ongoing interactions.
Analyzing the works mentioned in the previous paragraph, it is possible to observe that a central problem in systems of social exchanges is that of the regulation of the exchanges, toward producing social equilibrium as formalized by Piaget [30] in his Theory of Social Exchanges. The social exchange regulation problem is an important issue in MAS-based simulation of social management in environments rich in non-economic service exchanges, e.g., the urban ecosystems [7], [8], the ecosystem markets [15], the service exchange networks [3], the social currency networks [31], and all kinds of open cooperative systems rich in services [34].
In our previous work [9], [10], [11], different models for the social exchange regulation problem were developed. In particular, Pereira et al. [29] introduced a regulation model based on BDI (Beliefs, Desires, Intentions) [32] agents, implemented in Jason [5], whose plans were derived from optimal policies of POMDPs (Partially Observable Markov Decision Processes) [22] describing the agents’ social exchange behaviors for decision making. However, the regulation of social exchanges becomes a difficult task when the social system is open, that is, when the agents can enter and leave freely. Notice that, in this case, there is no fixed number of POMDP models. For each new agent joining the society, it is necessary to construct a new POMDP model that is able to provide the optimal policy for dealing with that agent’s exchange behavior.
The objective of this paper is to introduce a general and formal approach for the problem of recognizing and learning models of social exchange strategies for the regulation of social interactions in open agent societies. The regulation model is defined as a combination of the POMDP structuring the regulator agent decision process (as first proposed by Pereira et al. [29]) with a Hidden Markov Model (HMM) [25] that structures the exchange strategy learning process. In this way, the social exchange regulation problem can be stated as a problem of formally establishing the reciprocal conversion procedures between the POMDP and HMM models. The problem arises from the fact that the POMDPs have state transition and observation functions based on the actions performed by the agents in each state, whereas, in the HMMs, the state transition and observation functions are not explicitly related to action performances. The solution we have found for the reciprocal conversion problem builds on the particular structures of the POMDPs and HMMs that arise in the context of the regulation of social exchanges: the HMMs involved in such problems are structured on the basis of certain “extended” states, which allows for the establishment of an isomorphism between the sets of states of the POMDPs and the sets of “extended” states of the HMMs. This isomorphism then forms the foundation for the definition of the mappings that provide the reciprocal conversions between the POMDPs and the HMMs.
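The structural mismatch described above can be illustrated with a small sketch. It assumes, purely for illustration, that an extended state is a pair (last action, current state) and that actions follow a fixed stochastic policy; the names `pomdp_to_extended_hmm`, `T`, `O` and `policy`, as well as the numerical values, are hypothetical, and the extended states actually used in Section 6 may be defined differently.

```python
import numpy as np

def pomdp_to_extended_hmm(T, O, policy):
    """Build an action-free HMM over (action, state) pairs.

    T[a, s, s2]  : P(s2 | s, a), action-indexed transition function
    O[a, s2, o]  : P(o | a, s2), action-indexed observation function
    policy[s, a] : P(a | s), a fixed stochastic policy (assumption of this sketch)
    """
    n_actions, n_states, _ = T.shape
    n_obs = O.shape[2]
    n_ext = n_actions * n_states  # extended state index = a * n_states + s

    A = np.zeros((n_ext, n_ext))  # HMM transition matrix, no action index
    B = np.zeros((n_ext, n_obs))  # HMM observation matrix, no action index
    for a in range(n_actions):
        for s in range(n_states):
            i = a * n_states + s
            # The action is part of the extended state, so the emission
            # distribution no longer needs a separate action argument.
            B[i] = O[a, s]
            for a2 in range(n_actions):
                for s2 in range(n_states):
                    j = a2 * n_states + s2
                    # Choose a2 in state s, then move from s to s2 under a2.
                    A[i, j] = policy[s, a2] * T[a2, s, s2]
    return A, B

# Hypothetical 2-action, 2-state, 2-observation model.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.3, 0.7]]])
O = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.6, 0.4], [0.3, 0.7]]])
policy = np.array([[0.5, 0.5],
                   [1.0, 0.0]])
A, B = pomdp_to_extended_hmm(T, O, policy)
```

Under these assumptions, every row of `A` and `B` is a probability distribution, so the pair can be handled by standard HMM algorithms even though the original functions were indexed by actions.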
The paper is organized as follows. Sections 2 and 3 contextualize the core problem treated in this paper, namely, the formalization of the POMDP-HMM conversion procedures, and show its specific contribution to the state of the art on models for the regulation of social exchanges in open MAS: in Section 2, we discuss the role of social exchanges in MAS, and, in Section 3, we briefly describe the models of regulation of social exchanges in (open) MAS introduced in [29] and [10], on which this work is based. Section 4 introduces, in a formal and more general way, the main concepts necessary for the development of the work, defining the POMDP model for the strategy regulation problem. The HMM for the strategy learning problem is discussed in Section 5. In Section 6, we establish the formal relationship between POMDPs and HMMs, in terms of commutative diagrams, which allow the conversions between the models. In Section 7, we discuss the application of the proposed strategy regulation/learning model to the particular problem of regulating social exchanges in open MAS, giving some examples to clarify the formalism. In Section 8, we discuss related work, focusing on existing models for the problem of decision making of autonomous agents acting in the presence of other agents in uncertain environments. Section 9 is the conclusion.
The role of social exchanges in MAS
We take the viewpoint that the social relationships established between the agents that participate in a society can be analyzed in terms of the exchanges that those agents perform among themselves. In particular, we take the viewpoint of Piaget [30], that social relationships can be analyzed in terms of the exchange of services performed by the agents. From such perspective, the society is seen as a complex
The BDI–POMDP–HMM approach to the problem of regulation of social exchanges in open MAS
Pereira et al. [29] introduced a BDI–POMDP hybrid agent model for the regulation of strategy-based social exchanges in MAS, in which the regulation mechanism is internal to the agent architecture, in order to obtain a non-centralized social equilibrium control.
The POMDP for the strategy regulation problem
Consider two agents, a regulator agent α and a strategy-based agent β. Each agent has its own model of the states of the world. The sets of the states of the world, according to the points of view of the agents α and β, are given, respectively, by:
In order to combine those two points of view, the states of the world for α’s regulation process are modeled as ordered pairs with and so the set of the states of the world is given by
The HMM for the strategy learning problem
If the regulator agent α is not able to recognize β’s strategy (see Section 3), then α uses a mechanism based on HMMs in order to discover the state transition and observation functions that best fit the sequence of observations about β’s reactions/responses to its proposals.
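One standard way to fit the state transition and observation functions of an HMM to a sequence of observations is Baum-Welch re-estimation. The following is a minimal sketch for a single discrete observation sequence, not the implementation used in this work; the function name `baum_welch`, the integer coding of β's responses, and all numerical values are assumptions made for illustration.

```python
import numpy as np

def baum_welch(obs, A, B, pi, n_iter=20):
    """Re-estimate HMM parameters from one sequence of observation indices.

    obs : 1-D integer array of observation symbols
    A   : (N, N) initial state transition matrix
    B   : (N, M) initial observation matrix
    pi  : (N,)   initial state distribution
    """
    obs = np.asarray(obs)
    A, B, pi = A.copy(), B.copy(), pi.copy()
    n_steps, n_states = len(obs), A.shape[0]
    for _ in range(n_iter):
        # Scaled forward pass; scale[t] normalizes alpha[t] to sum to 1.
        alpha = np.zeros((n_steps, n_states))
        scale = np.zeros(n_steps)
        alpha[0] = pi * B[:, obs[0]]
        scale[0] = alpha[0].sum()
        alpha[0] /= scale[0]
        for t in range(1, n_steps):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
            scale[t] = alpha[t].sum()
            alpha[t] /= scale[t]
        # Scaled backward pass, reusing the forward scaling factors.
        beta = np.ones((n_steps, n_states))
        for t in range(n_steps - 2, -1, -1):
            beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
        # Posterior state probabilities and transition probabilities.
        gamma = alpha * beta
        xi = (alpha[:-1, :, None] * A[None, :, :]
              * (B[:, obs[1:]].T * beta[1:])[:, None, :]
              / scale[1:, None, None])
        # Re-estimation of pi, A and B from the posteriors.
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(B.shape[1]):
            B[:, k] = gamma[obs == k].sum(axis=0) / gamma.sum(axis=0)
    # Log-likelihood of obs under the parameters used in the last forward pass.
    log_likelihood = np.log(scale).sum()
    return A, B, pi, log_likelihood

# Hypothetical example: 2 hidden states, 3 observable responses coded 0, 1, 2.
obs = np.array([0, 1, 1, 0, 2, 2, 1, 0, 0, 2, 1, 1])
A0 = np.array([[0.6, 0.4], [0.3, 0.7]])
B0 = np.array([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])
pi0 = np.array([0.5, 0.5])
A_fit, B_fit, pi_fit, ll = baum_welch(obs, A0, B0, pi0)
```

The scaling of the forward and backward variables is a common choice to avoid numerical underflow on long observation sequences; the result depends on the arbitrary initial model, since the procedure only converges to a local maximum of the likelihood.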
Denote by HMMαβ any HMM of α’s strategy learning mechanism related to an agent β. Given a sequence of observations on β’s reactions/responses and an arbitrary initial HMMαβ, it is possible to apply the well-known Baum-Welch
The conversion procedures
Let α be the agent playing the role of a regulator agent and POMDP be the model related to a strategy-based agent β, for a given . The following result, although almost immediate, is the key to the conversion processes between POMDP and HMM models, since it establishes the isomorphism between the domains of the transition functions of the POMDP and HMM models:
Proposition 6.1 For all the set is isomorphic to the set of extended states. Proof See Appendix A. □
In the following,
Regulating service exchanges in open MAS
In this section, we discuss the application of the POMDP/HMM conversion model to the particular problem of regulating social exchanges in open multiagent systems, discussed in Sections 2 and 3. Here, we adopt a simplified model of social exchanges between two agents α and β (see Sections 2–4).
Related work
Coordination of agent activities is an important problem in multiagent systems, especially when considering stochastic environments. In the literature, there exist some MDP-based models for the problem of decision making of autonomous agents acting in the presence of other agents in uncertain environments. In general, theoretical work on coordination problems considers that a simple repeated game is being played, studying methods for attaining equilibrium in the stage game. One extension of
Conclusion
System openness implies that the set of possible exchange strategies adopted by the agents may vary widely, so that agents aiming to regulate their interaction have to learn the social exchange strategies adopted by their partner agents at any time. The essential problem of an agent aiming to regulate its exchanges with another agent is, then, to discover the exchange strategy POMDP model that it should apply in order to lead to equilibrium the exchanges of services that it performs with its
Acknowledgments
This work was partially supported by the following Brazilian funding agencies: CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), under the Proc. 481283/2013-7, 306970/2013-9 and 232827/2014-1, and FAPERGS (RS-SOC Project).
References (39)
- et al. Action selection and task sequence learning for hybrid dynamical cognitive agents. Rob. Auton. Syst. (2010)
- et al. Social exchange and common agency in organizations. J. Socio-Econ. (2010)
- et al. Concurrent Markov decision processes for robot team learning. Eng. Appl. Artif. Intell. (2015)
- et al. Planning and acting in partially observable stochastic domains. Artif. Intell. (1998)
- Markov games as a framework for multi-agent reinforcement learning. Proceedings of the 11th International Conference on Machine Learning, Rutgers University, New Brunswick (1994)
- et al. Social exchanges as motivators of hotel employees’ organizational citizenship behavior: the proposition and application of a new three-dimensional framework. Int. J. Hosp. Manage. (2011)
- et al. Understanding Regulation: Theory, Strategy and Practice (2012)
- K. Banks, A. Brennan, A. Kidd, 2015, MoE – Means of Exchange website,...
- Exchange and Power in Social Life (1964)
- et al. Programming Multi-agent Systems in AgentSpeak Using Jason, Wiley Series in Agent Technology (2007)
- Sequential optimality and coordination in multiagent systems
- Comunidades en transición: hacia otras prácticas sostenibles en los ecosistemas urbanos. Cidades Comunidades e Territórios
- La comunidad como escala de trabajo en los ecosistemas urbanos. Rev. Cien. Tecnol.
- Centralized regulation of social exchanges between personality-based agents. Coordination, Organizations, Institutions, and Norms in Agent Systems II
- Recognizing and learning models of social exchange strategies for the regulation of social interactions in open agent societies. J. Braz. Comput. Soc.
- Systems of exchange values as tools for multi-agent organizations. J. Braz. Comput. Soc.
- Solving decentralized POMDP problems using genetic algorithms. Auton. Agents Multi-Agent Syst.
- Social exchange theory