Information Sciences

Volume 323, 1 December 2015, Pages 16-33
Regulating social exchanges in open MAS: The problem of reciprocal conversions between POMDPs and HMMs

https://doi.org/10.1016/j.ins.2015.06.023

Abstract

An important problem in open multiagent systems is that of the regulation of social exchanges, toward producing social equilibrium. This problem may be generalized to the regulation of autonomous agents’ interactions when cooperating or competing in order to achieve their individual or collective objectives. In this paper, we take an abstract and generalizing approach to this issue. The problem is formalized as a regulation model for the sequential decision making of an agent, acting in an open, partially observable stochastic environment, with the aim of inducing another autonomous agent to interact in a certain way, so as to lead both agents toward a target exchange state configuration. The regulation model is defined as a combination of a partially observable Markov decision process (POMDP), to structure the regulator agent’s decision process, with a Hidden Markov Model (HMM), to structure its exchange strategy learning process. The main challenge we face is the reciprocal conversion between POMDPs and HMMs. The solution we have found builds on the particular structures of the POMDPs and HMMs that arise in the context of the regulation of social exchanges, which allow for the establishment of a kind of isomorphism between the two models. This paper formally develops these ideas, stating and proving the conversion theorems, and shows their application to an example of regulation of social exchanges.

Introduction

Systems of social relationships have often been seen as systems of social exchanges, as extensively discussed and analyzed in the literature (see, e.g., [4], [14], [21], [30]). Based on this, theories of social exchanges (e.g., the one introduced by Piaget [30]) have been frequently adopted for the modeling of agent social interactions in different contexts (see, e.g., [11], [16], [19], [29], [34]).

One such line of work is due to Rodrigues in cooperation with different partners [33], [34]. In [33], Rodrigues, Costa and Bordini introduced an initial model of social exchange-based interactions in agent societies, including a social-reasoning mechanism and structures for storing and manipulating exchange values, presenting an example of a political process of lobbying through campaign contributions. Rodrigues and Luck [34] introduced an approach based on the Theory of Social Exchanges for the modeling of interactions in open multiagent systems, presenting a system for analyzing and evaluating partner selection and cooperative interactions in the Bioinformatics domain, which is characterized by frequent, extensive and dynamic exchanges of services. Grimaldo et al. [19] presented an application of Piaget’s Theory of Social Exchanges to the coordination of intelligent virtual agents and sociability in a virtual university bar scenario, as a market-based social model, where groups of different types of waiters (e.g., coordinated, social, egalitarian) and customers (e.g., social, lazy) interact with both the objects in the scene and the other virtual agents. In [20], they introduced a multi-modal agent decision making model (MADeM), in order to provide virtual agents with socially acceptable decisions and coordinated social behaviors (e.g., task passing or planned meetings), based on the evaluation of the social exchanges. Franco et al. [16] applied social exchange values in order to support arguments about the assessment of exchanges. Together with the power-to-influence social relationship, those arguments were also used to help the agents decide whether to continue or interrupt ongoing interactions.

Analyzing the works mentioned in the previous paragraph, we observe that a central problem in systems of social exchanges is that of the regulation of the exchanges, toward producing social equilibrium as formalized by Piaget [30] in his Theory of Social Exchanges. The social exchange regulation problem is an important issue in MAS-based simulation of social management in environments rich in non-economic service exchanges, e.g., urban ecosystems [7], [8], ecosystem markets [15], service exchange networks [3], social currency networks [31], and all kinds of open cooperative systems rich in services [34].

In our previous works [9], [10], [11], different models for the social exchange regulation problem were developed. In particular, Pereira et al. [29] introduced a regulation model based on BDI (Beliefs, Desires, Intentions) [32] agents, implemented in Jason [5], whose plans were derived from optimal policies of POMDPs (Partially Observable Markov Decision Processes) [22] describing the agents’ social exchange behaviors for decision making. However, the regulation of social exchanges becomes a difficult task when the social system is open, that is, when the agents can enter and leave freely. Notice that, in this case, there is no fixed number of POMDP models. For each new agent joining the society, it is necessary to construct a new POMDP model that is able to provide the optimal policy for dealing with that new agent’s exchange behavior.

The objective of this paper is to introduce a general and formal approach for the problem of recognizing and learning models of social exchange strategies for the regulation of social interactions in open agent societies. The regulation model is defined as a combination of the POMDP structuring the regulator agent’s decision process (as first proposed by Pereira et al. [29]) with a Hidden Markov Model (HMM) [25] to structure the exchange strategy learning process. In this way, the social exchange regulation problem can be stated as a problem of formally establishing the reciprocal conversion procedures between the POMDP and HMM models. The problem arises from the fact that the POMDPs have state transition and observation functions based on the actions performed by the agents in each state, whereas, in the HMMs, the state transition and observation functions are not explicitly related to action performances. The solution we have found for the reciprocal conversion problem builds on the particular structures of the POMDPs and HMMs that arise in the context of the regulation of social exchanges: the HMMs involved in such problems are structured on the basis of certain “extended” states, which allows for the establishment of an isomorphism between the sets of states of the POMDPs and the sets of “extended” states of the HMMs. This isomorphism then forms the foundation for the definition of the mappings that provide the reciprocal conversions between the POMDPs and the HMMs.
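The mismatch between the two kinds of model can be made concrete by writing their functions side by side. The block below uses generic textbook notation; the symbols $S$ (states), $A$ (actions), $\Omega$ (observations), $S'$ (HMM states), and the function names $T$, $O$, $a$, $b$ are assumptions made for illustration and are not necessarily the notation adopted in this paper.

```latex
\begin{align*}
\text{POMDP:} \quad & T : S \times A \times S \to [0,1], \qquad O : S \times A \times \Omega \to [0,1] \\
\text{HMM:}   \quad & a : S' \times S' \to [0,1], \qquad\qquad\; b : S' \times \Omega \to [0,1]
\end{align*}
```

Under this reading, any conversion between the two has to account for the action argument that appears in the POMDP functions but has no counterpart in the HMM functions.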

The paper is organized as follows. A contextualization of the core problem treated in this paper, namely, the formalization of the POMDP-HMM conversion procedures, is presented in Sections 2 and 3, showing its specific contribution to the state of the art on models for the regulation of social exchanges in open MAS. Specifically, in Section 2, we present a discussion about the role of social exchanges in MAS, and, in Section 3, we briefly describe the models of regulation of social exchanges in (open) MAS introduced in [29] and [10], on which this work is based. Section 4 introduces, in a formal and more generalized approach, the main concepts necessary for the development of the work, defining the POMDP model for the strategy regulation problem. The HMM for the strategy learning problem is discussed in Section 5. In Section 6, we establish the formal relationship between POMDPs and HMMs, in terms of commutative diagrams, which allow the conversions between the models. In Section 7, we discuss the application of the proposed strategy regulation/learning model to the particular problem of regulating social exchanges in open MAS, giving some examples to clarify the formalism. In Section 8, we present a discussion of related work, mainly on existing models for the problem of decision making of autonomous agents acting in the presence of other agents in uncertain environments. Section 9 is the conclusion.

Section snippets

The role of social exchanges in MAS

We take the viewpoint that the social relationships established between the agents that participate in a society can be analyzed in terms of the exchanges that those agents perform among them. In particular, we take the viewpoint of Piaget [30], that social relationships can be analyzed in terms of the exchange of services performed by the agents. From such perspective, the society is seen as a complex

The BDI–POMDP–HMM approach to the problem of regulation of social exchanges in open MAS

Pereira et al. [29] introduced a BDI–POMDP hybrid agent model for the regulation of strategy-based social exchanges in MAS, in which the regulation mechanism is internal to the agent architecture, in order to obtain a non-centralized social equilibrium control.

The POMDP for the strategy regulation problem

Consider two agents, a regulator agent α and a strategy-based agent β. Each agent has its own model of the states of the world. The sets of the states of the world, according to the points of view of the agents α and β, are given, respectively, by: $S_\alpha = \{S_\alpha^{1}, \ldots, S_\alpha^{k}\}$ and $S_\beta = \{S_\beta^{1}, \ldots, S_\beta^{l}\}$.

In order to combine those two points of view, the states of the world for α’s regulation process are modeled as ordered pairs $(S_\alpha^{*}, S_\beta)$, with $S_\alpha^{*} \in S_\alpha$ and $S_\beta \in S_\beta$, so the set of the states of the world is given by $S_{\alpha\beta} = S_\alpha \times S$

The HMM for the strategy learning problem

If the regulator agent α is not able to recognize β’s strategy (see Section 3), then α uses a mechanism based on HMMs in order to discover the state transition and observation functions that best fit the sequence of observations about β’s reactions/responses to its proposals.
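One standard way to fit an HMM's transition and observation functions to an observation sequence is Baum-Welch re-estimation. The following is a minimal pure-Python sketch of that general technique, not the implementation used in this paper. It assumes that β's reactions have been encoded as integer symbols `0..m-1`; the function names (`forward_backward`, `baum_welch_step`, `fit_hmm`) and the random initialization are hypothetical choices made for illustration.

```python
import math
import random


def forward_backward(A, B, pi, obs):
    """Scaled forward-backward pass; returns alpha, beta and per-step scale factors."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    scale = [0.0] * T
    for t in range(T):
        for j in range(n):
            prior = pi[j] if t == 0 else sum(alpha[t - 1][i] * A[i][j] for i in range(n))
            alpha[t][j] = prior * B[j][obs[t]]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scale[t + 1]
    return alpha, beta, scale


def baum_welch_step(A, B, pi, obs):
    """One EM update; also returns the log-likelihood of obs under the input parameters."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta, scale = forward_backward(A, B, pi, obs)
    # gamma[t][i]: posterior probability of being in state i at time t.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    new_A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        denom = sum(gamma[t][i] for t in range(T - 1))
        for j in range(n):
            # Expected number of i -> j transitions, given the observations.
            num = sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / scale[t + 1]
                      for t in range(T - 1))
            new_A[i][j] = num / denom
    new_B = [[0.0] * m for _ in range(n)]
    for i in range(n):
        denom = sum(gamma[t][i] for t in range(T))
        for k in range(m):
            new_B[i][k] = sum(gamma[t][i] for t in range(T) if obs[t] == k) / denom
    log_lik = sum(math.log(c) for c in scale)
    return new_A, new_B, gamma[0], log_lik


def random_stochastic_rows(rng, rows, cols):
    """Random strictly positive matrix whose rows each sum to 1."""
    out = []
    for _ in range(rows):
        weights = [rng.random() + 0.1 for _ in range(cols)]
        total = sum(weights)
        out.append([w / total for w in weights])
    return out


def fit_hmm(obs, n_states, n_symbols, n_iter=20, seed=0):
    """Run a fixed number of Baum-Welch updates from an arbitrary initial HMM."""
    rng = random.Random(seed)
    A = random_stochastic_rows(rng, n_states, n_states)
    B = random_stochastic_rows(rng, n_states, n_symbols)
    pi = random_stochastic_rows(rng, 1, n_states)[0]
    history = []
    for _ in range(n_iter):
        A, B, pi, log_lik = baum_welch_step(A, B, pi, obs)
        history.append(log_lik)
    return A, B, pi, history
```

Calling `fit_hmm(obs, n_states=2, n_symbols=3)` on a list of encoded reactions returns the re-estimated matrices together with the log-likelihood recorded at each iteration, which should not decrease from one iteration to the next. The number of hidden states is a modeling choice that this sketch leaves to the caller.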

Denote by $\mathrm{HMM}_{\alpha\beta}$ any HMM of α’s strategy learning mechanism related to an agent β. Given a sequence of observations on β’s reactions/responses and an arbitrary initial $\mathrm{HMM}_{\alpha\beta}$, it is possible to apply the well-known Baum-Welch

The conversion procedures

Let α be the agent playing the role of a regulator agent and $\mathrm{POMDP}_{\alpha\beta}^{*}$ be the model related to a strategy-based agent β, for a given $* \in \{1, \ldots, k\}$. The following result, although almost immediate, is the key to the conversion processes between the $\mathrm{POMDP}_{\alpha\beta}^{*}$ and $\mathrm{HMM}_{\alpha\beta}^{*}$ models, since it establishes the isomorphism between the domains of the transition functions of the $\mathrm{POMDP}_{\alpha\beta}^{*}$ and $\mathrm{HMM}_{\alpha\beta}^{*}$ models:

Proposition 6.1

For all $* \in \{1, \ldots, k\}$, the set $S_{\alpha\beta}^{*} \times P$ is isomorphic to the set of extended states $SX_{\alpha\beta}^{*}$.

Proof

See Appendix A. 
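To illustrate what such an isomorphism provides in practice, the sketch below builds an explicit bijection between pairs and indexed extended states. The definitions of $P$ and of the extended states belong to parts of the paper not reproduced here, so this sketch assumes that $P$ is a finite set (read here as α's proposals) and that an extended state can be identified with a (state, proposal) pair. The function name `build_extended_states` and the sample labels are hypothetical.

```python
from itertools import product


def build_extended_states(states, proposals):
    """Return both directions of a bijection between (state, proposal) pairs and indices.

    Assumption: each extended state is identified with exactly one pair, so an
    action-free matrix indexed by extended states can still be read back in
    terms of the original states and proposals.
    """
    to_extended = {pair: idx for idx, pair in enumerate(product(states, proposals))}
    from_extended = {idx: pair for pair, idx in to_extended.items()}
    return to_extended, from_extended


# Hypothetical labels, used only to exercise the mapping.
to_ext, from_ext = build_extended_states(["s0", "s1", "s2"], ["p0", "p1"])
round_trip_ok = all(from_ext[to_ext[pair]] == pair for pair in to_ext)
```

With such a mapping, a function whose domain involves both a state and a proposal can be re-indexed over extended states alone, and the inverse mapping recovers the original arguments.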

In the following,

Regulating service exchanges in open MAS

In this section, we discuss the application of the POMDP/HMM conversion model to the particular problem of regulating social exchanges in open multiagent systems, discussed in Sections 2 and 3. Here, we adopt a simplified model of social exchanges between two agents α and β (see Sections 2–4).

Related work

Coordination of agent activities is an important problem in multiagent systems, especially when considering stochastic environments. In the literature, there exist some MDP-based models for the problem of decision making of autonomous agents acting in the presence of other agents in uncertain environments. In general, theoretical work on coordination problems considers that a simple repeated game is being played, studying methods for attaining equilibrium in the stage game. One extension of

Conclusion

System openness implies that the set of possible exchange strategies adopted by the agents may vary widely, so that agents aiming to regulate their interaction have to learn the social exchange strategies adopted by their partner agents at any time. The essential problem of an agent aiming to regulate its exchanges with another agent is, then, to discover the exchange strategy POMDP model that it should apply in order to lead to equilibrium the exchanges of services that it performs with its

Acknowledgments

This work was partially supported by the following Brazilian funding agencies: CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), under the Proc. 481283/2013-7, 306970/2013-9 and 232827/2014-1, and FAPERGS (RS-SOC Project).

References (39)

  • C. Boutilier, Sequential optimality and coordination in multiagent systems
  • G. Dimuro et al., Comunidades en transición: hacia otras prácticas sostenibles en los ecosistemas urbanos, Cidades Comunidades e Territórios (2010)
  • G. Dimuro et al., La comunidad como escala de trabajo en los ecosistemas urbanos, Rev. Cien. Tecnol. (2011)
  • G.P. Dimuro et al., Centralized regulation of social exchanges between personality-based agents, Coordination, Organizations, Institutions, and Norms in Agent Systems II (2007)
  • G.P. Dimuro et al., Recognizing and learning models of social exchange strategies for the regulation of social interactions in open agent societies, J. Braz. Comput. Soc. (2011)
  • G.P. Dimuro et al., Systems of exchange values as tools for multi-agent organizations, J. Braz. Comput. Soc. (2005)
  • B. Eker et al., Solving decentralized POMDP problems using genetic algorithms, Auton. Agents Multi-Agent Syst. (2013)
  • R. Emerson, Social exchange theory
  • Forest Trends, Ecosystem Marketplace, 2015,...