Abstract
Deep learning improves the realistic expression of virtual simulations, particularly for multi-criteria decision-making problems, which generally rely on high-performance artificial intelligence. This study was inspired by motivation theory and observations of natural life. Recently, motivation-based control has been actively studied for realistic expression, but it presents several problems: for instance, it is hard to define the relations among multiple motivations and to select goals based on them, and behaviors must generally be planned to account for both motivations and goals. This paper proposes a deep Q-network (DQN)-based multi-criteria decision-making framework that enables virtual agents in simulation environments to automatically select goals based on motivations in real time and to plan the behaviors needed to achieve those goals. All motivations are classified according to the five levels of Maslow's hierarchy of needs; the virtual agents train a double DQN on big social data, select optimal goals depending on their motivations, and perform behaviors using predefined hierarchical task networks (HTNs). Compared with the state-of-the-art method, the proposed framework reduced the average loss from 0.1239 to 0.0491 and increased accuracy from 63.24% to 80.15%. For behavior planning with predefined HTNs, the number of methods increased from 35 in the Q-network baseline to 1511 in the proposed framework, while the computation time for 10,000 behavior plans decreased from 0.118 to 0.1079 s.
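The core mechanism named in the abstract, double DQN goal selection, can be illustrated with a minimal sketch. This is not the paper's actual network: the 5-dimensional motivation state (one value per Maslow level), the four candidate goals, and the linear Q-function are all illustrative assumptions. It shows only the double Q-learning update rule, in which the online network chooses the best next goal and the target network evaluates it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 5-dimensional motivation state (one value per
# Maslow level) and a small discrete set of candidate goals.
N_MOTIVATIONS, N_GOALS, GAMMA = 5, 4, 0.9

def q_values(weights, state):
    # Linear Q-function stand-in for the paper's deep network.
    return state @ weights

# Separate online and target parameter sets, as in double Q-learning.
w_online = rng.normal(size=(N_MOTIVATIONS, N_GOALS))
w_target = rng.normal(size=(N_MOTIVATIONS, N_GOALS))

state = rng.random(N_MOTIVATIONS)        # current motivation levels
next_state = rng.random(N_MOTIVATIONS)   # motivation levels after acting
reward = 1.0                             # placeholder reward signal

# Double DQN target: the ONLINE network selects the best next goal,
# but the TARGET network evaluates it, reducing overestimation bias.
best_next_goal = int(np.argmax(q_values(w_online, next_state)))
td_target = reward + GAMMA * q_values(w_target, next_state)[best_next_goal]

# Goal selection itself is a greedy argmax over the current Q-values.
selected_goal = int(np.argmax(q_values(w_online, state)))
```

In a full implementation the temporal-difference target `td_target` would drive a gradient step on the online network, with the target weights periodically copied from the online weights.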
Acknowledgements
This research was supported by a grant from Defense Acquisition Program Administration and Agency for Defense Development, under contract #UE171095RD, and this work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (2018R1A2B2007934).
Ethics declarations
Conflict of interest
We certify that there is no actual or potential conflict of interest in relation to this article.
Appendix
The virtual simulation environment for this study, shown in Fig. 15, was implemented using the commercial game engine Unity 2017 (Unity Technologies ApS, San Francisco, CA, USA). The virtual environment depicts a small city, in which we simulate the actions of humans and animals in the road areas. The city contains eight junctions: six T-junctions and two cross-junctions. Each T-junction has nine traffic lights and each cross-junction has twelve, so a total of 78 traffic lights are used in the simulator.
Table 19 presents screenshots of behaviors together with the corresponding motivations, goals, and behavior outcomes when the virtual human agent was run in the simulation after learning with the proposed framework.
For instance, Fig. 16a shows virtual human agents commuting to offices or schools in the morning. Figure 16b–d illustrate the virtual human agents having meals at noon, returning home in the evening, and going to sleep later in the evening, respectively.
Table 20 lists the behaviors, goals, and top-priority motivations of the virtual animal during simulation after learning using the proposed framework.
Figure 17a, b illustrate that multiple virtual animal agents tend to move in groups using the proposed framework. An interaction between virtual animal and human agents is shown in Fig. 17c, and the behavior of only one virtual animal agent is illustrated in Fig. 17d.
About this article
Cite this article
Jang, H., Hao, S., Chu, P.M. et al. Deep Q-network-based multi-criteria decision-making framework for virtual simulation environment. Neural Comput & Applic 33, 10657–10671 (2021). https://doi.org/10.1007/s00521-020-04918-3