Pricing strategies in mobile crowdsensing: an enhanced MAPPO approach using a behavior network

Zhao, Shengsheng; Yu, Yantao; Huang, Tiancong; Liu, Guojin; Wu, Yucheng

doi:10.1007/s11227-025-06957-w

Published: 02 February 2025

Volume 81, article number 457, (2025)

The Journal of Supercomputing

Abstract

Mobile crowdsensing involves assigning multiple tasks around points of interest to mobile users (MUs) for execution. Developing an optimal task allocation strategy is crucial for the entire system, as it directly impacts the benefits of stakeholders. Leveraging recent advancements in multi-agent reinforcement learning (MARL), which have demonstrated unique advantages in simulating complex interactions among multiple agents, we propose a pricing strategy based on an improved behavior network and multi-agent proximal policy optimization (MAPPO) algorithm. Specifically, we formulate the problem as a multi-leader multi-follower Stackelberg game, and then apply MAPPO, a MARL technique which employs centralized training and decentralized execution, to solve this game. To better capture complex sequential input information and achieve superior behavior strategies, we integrate an attention mechanism with a gated recurrent unit (GRU) network into the actor network, forming a MARL algorithm with an improved behavior network, termed GRU-and-Attention-based MAPPO (GA-MAPPO). Simulation results demonstrate that the proposed GA-MAPPO algorithm is effective compared with baseline approaches. It can learn an optimal pricing strategy that maximizes the benefits of Task Initiators (TIs) and guides TIs in pricing MUs effectively.

Author information

Authors and Affiliations

School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
Shengsheng Zhao, Yantao Yu, Tiancong Huang, Guojin Liu & Yucheng Wu

Corresponding author

Correspondence to Tiancong Huang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Proof of theorem 1

The corresponding Lagrangian form of Problem 1 is

$$\begin{aligned} L_n=\sum _m\left( p_m^nt_m^n-a_m^n\left( t_m^n\right) ^2-b_m^nt_m^n\right) -\lambda _{n0}\left( \sum _mt_m^n-\kappa _n\right) +\sum _m\lambda _{nm}t_m^n, \end{aligned}$$

(A1)

where ${\lambda _{n0}}$ and ${\lambda _{nm}}$ are the Lagrangian multipliers.

The Karush-Kuhn-Tucker (KKT) conditions are given as follows.

$$\begin{aligned} & \frac{{\partial {L_n}}}{{\partial t_m^n}} = 0,\forall m \in \mathcal{M}, \end{aligned}$$

(A2)

$$\begin{aligned} & {\lambda _{n0}}(\mathop \sum \limits _m t_m^n - {\kappa _n}) = 0,{\lambda _{nm}}t_m^n = 0, \end{aligned}$$

(A3)

$$\begin{aligned} & \lambda _{n0},\lambda _{nm}\ge 0,t_m^n\ge 0,\sum _mt_m^n\le \kappa _n. \end{aligned}$$

(A4)

Without loss of generality, we consider $t_m^n> 0$; the complementary slackness condition ${\lambda _{nm}}t_m^n = 0$ in Eqn. (A3) then gives ${\lambda _{nm}} = 0$. Eqn. (A2) can be written as

$$\begin{aligned} p_m^n - 2a_m^nt_m^n - b_m^n - {\lambda _{n0}} + {\lambda _{nm}} = 0,\forall m \in \mathcal{M}. \end{aligned}$$

(A5)

Then, Eqn. (A5) can be decomposed into two cases as follows.

(1) Case I: ${\lambda _{n0}} = 0$. According to Eqn. (A5), we have

$$\begin{aligned} t_m^n = \frac{p_m^n - b_m^n}{2a_m^n}. \end{aligned}$$

(A6)

(2) Case II: ${\lambda _{n0}}> 0$. According to Eqn. (A5), we have $t_m^n = \frac{p_m^n - b_m^n - \lambda _{n0}}{2a_m^n}$. Substituting $t_m^n$ into Eqn. (A3), we have

$$\begin{aligned} {\lambda _{n0}} = \frac{\left( \sum \nolimits _m \left( p_m^n - b_m^n \right) \left( \prod \nolimits _{m1 \ne m} a_{m1}^n \right) \right) - 2{\kappa _n}\left( \prod \nolimits _m a_m^n \right) }{\sum \nolimits _m \left( \prod \nolimits _{m1 \ne m} a_{m1}^n \right) }, \end{aligned}$$

and therefore $t_m^n$ can be obtained as follows.

$$\begin{aligned} t_m^n = \begin{cases} \frac{p_m^n - b_m^n}{2a_m^n}, & \text{if } \sum \nolimits _{m = 1}^{M} \frac{p_m^n - b_m^n}{2a_m^n} \le \kappa _n, \\ \frac{p_m^n - b_m^n - \lambda _{n0}}{2a_m^n}, & \text{else if } \lambda _{n0} > 0, \\ 0, & \text{otherwise.} \end{cases} \end{aligned}$$

(A7)
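
The closed form for ${\lambda _{n0}}$ in Case II can be checked by substituting $t_m^n$ into the active budget constraint $\sum _mt_m^n = \kappa _n$ implied by Eqn. (A3). The following is a short derivation sketch, assuming $a_m^n > 0$ for all $m$ (an assumption made here so that the divisions are valid):

$$\begin{aligned} \sum _m \frac{p_m^n - b_m^n - \lambda _{n0}}{2a_m^n} = \kappa _n \;\Longrightarrow \; \lambda _{n0} = \frac{\sum _m \frac{p_m^n - b_m^n}{a_m^n} - 2\kappa _n}{\sum _m \frac{1}{a_m^n}}. \end{aligned}$$

Multiplying the numerator and the denominator by $\prod _m a_m^n$ gives the product form of ${\lambda _{n0}}$ stated above.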

However, the expressions above do not by themselves guarantee $t_m^n\ge 0$, which is required by Eqn. (A4), so we supplement the above equation. Considering that $\sum _mt_m^n\le \kappa _n$, we define ${C_1} = \sum \nolimits _m {\min \left( {0,t_m^n} \right) }$ and ${C_2} = \sum \nolimits _m {\max \left( {0,t_m^n} \right) }$, which are the cumulative sums of the negative and the positive sensing times, respectively. We set each $t_m^n$ less than 0 to 0 and scale each $t_m^n$ greater than 0 by a factor determined by the ratio ${{{C_1}} / {{C_2}}}$, so that the deficit from the negative entries is distributed across the positive entries in proportion to their size.
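
As a small worked illustration with made-up numbers (not taken from the paper), consider three candidate sensing times $t = (3, -1, 2)$ and scale the positive entries by $1 + C_1/C_2$, the factor that appears in Eqn. (A8):

$$\begin{aligned} C_1&= -1,\quad C_2 = 5,\quad 1 + \frac{C_1}{C_2} = 0.8, \\ t^{*}&= (3 \times 0.8,\; 0,\; 2 \times 0.8) = (2.4,\; 0,\; 1.6). \end{aligned}$$

The rescaled values sum to 4, the same as $3 - 1 + 2$, so in this example the total sensing time is preserved while every entry becomes non-negative.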

In summary, the optimal solution to Problem 1 is given as follows.

$$\begin{aligned} \left( t_m^n \right) ^{*} = \begin{cases} t_m^n \left( 1 - \frac{\left( -C_1 \right) }{C_2} \right) , & \text{if } C_1 < 0 \text{ and } t_m^n > 0, \\ 0, & \text{else if } C_1 < 0 \text{ and } t_m^n < 0, \\ t_m^n, & \text{otherwise,} \end{cases} \end{aligned}$$

(A8)

where,

$$\begin{aligned} t_m^n = \begin{cases} \frac{p_m^n - b_m^n}{2a_m^n}, & \text{if } \sum \nolimits _{m = 1}^{M} \frac{p_m^n - b_m^n}{2a_m^n} \le \kappa _n, \\ \frac{p_m^n - b_m^n - \lambda _{n0}}{2a_m^n}, & \text{else if } \lambda _{n0} > 0, \\ 0, & \text{otherwise,} \end{cases} \end{aligned}$$

(A9)

$$\begin{aligned} {\lambda _{n0}} = \frac{\left( {\sum \nolimits _m {\left( {p_m^n - b_m^n} \right) \left( {\prod \nolimits _{m1 \ne m} {a_{m1}^n} } \right) } } \right) - 2{\kappa _n}\left( {\prod \nolimits _m {a_m^n} } \right) }{\sum \nolimits _m {\left( {\prod \nolimits _{m1 \ne m} {a_{m1}^n} } \right) } }, \end{aligned}$$

(A10)

$$\begin{aligned} {C_1} = \sum \nolimits _m {\min \left( {0,t_m^n} \right) }, \end{aligned}$$

(A11)

$$\begin{aligned} {C_2} = \sum \nolimits _m {\max \left( {0,t_m^n} \right) }. \end{aligned}$$

(A12)

$\square$
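
The closed-form solution in Eqns. (A8)-(A12) can be sketched in a few lines of Python. This is an illustrative reading of the formulas, not the authors' implementation. The function name `optimal_sensing_times` and the argument names are hypothetical. The code assumes $a_m^n > 0$ and uses the simplified form of ${\lambda _{n0}}$ obtained by dividing Eqn. (A10) through by $\prod _m a_m^n$. It also adds two guards that the paper does not state: when $C_2 = 0$ all times are set to 0, and the scaling factor is clamped at 0.

```python
from typing import List, Sequence


def optimal_sensing_times(
    p: Sequence[float],  # prices p_m^n offered to this user, one per task initiator
    a: Sequence[float],  # quadratic cost coefficients a_m^n (assumed > 0)
    b: Sequence[float],  # linear cost coefficients b_m^n
    kappa: float,        # total sensing-time budget kappa_n
) -> List[float]:
    """Sketch of Eqns. (A8)-(A12) for a single mobile user."""
    m_count = len(p)

    # Case I (first branch of Eqn. A9): budget constraint inactive.
    t = [(p[m] - b[m]) / (2.0 * a[m]) for m in range(m_count)]

    if sum(t) > kappa:
        # Case II: budget constraint active. Eqn. (A10) divided by prod(a).
        lam = (sum((p[m] - b[m]) / a[m] for m in range(m_count)) - 2.0 * kappa) / sum(
            1.0 / a[m] for m in range(m_count)
        )
        if lam > 0:
            t = [(p[m] - b[m] - lam) / (2.0 * a[m]) for m in range(m_count)]
        else:
            # "otherwise" branch of Eqn. (A9); not reached when every a[m] > 0.
            t = [0.0] * m_count

    # Eqns. (A11) and (A12): sums of the negative and the positive entries.
    c1 = sum(min(0.0, x) for x in t)
    c2 = sum(max(0.0, x) for x in t)

    if c1 < 0:
        # Eqn. (A8): scale positives by 1 - (-C1)/C2 and zero out negatives.
        # Assumption (not in the paper): scale is 0 if C2 == 0, and never negative.
        scale = max(0.0, 1.0 + c1 / c2) if c2 > 0 else 0.0
        t = [x * scale if x > 0 else 0.0 for x in t]

    return t


if __name__ == "__main__":
    # Budget is loose, so Case I applies.
    print(optimal_sensing_times([4.0, 6.0], [1.0, 1.0], [0.0, 0.0], kappa=10.0))
    # Budget is tight, so Case II applies and the times sum to kappa.
    print(optimal_sensing_times([4.0, 6.0], [1.0, 1.0], [0.0, 0.0], kappa=3.0))
```

With the inputs in the usage lines, the first call falls under Case I and the second under Case II, where the returned times add up to the budget `kappa`.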

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, S., Yu, Y., Huang, T. et al. Pricing strategies in mobile crowdsensing: an enhanced MAPPO approach using a behavior network. J Supercomput 81, 457 (2025). https://doi.org/10.1007/s11227-025-06957-w

Accepted: 15 January 2025
Published: 02 February 2025
DOI: https://doi.org/10.1007/s11227-025-06957-w
