Simulation and validation of a reinforcement learning agent-based model for multi-stakeholder forest management
Introduction
Forest management requires integrating numerous objectives in order to satisfy the goals of different stakeholders (Bettinger et al., 2003; Kangas et al., 2005; Shao et al., 2005). Forest companies, for example, are typically driven by economic incentives that involve harvesting high-quality timber and minimizing the construction of logging roads, while conservation-minded groups are interested in preserving the long-term ecological functions of the forest. In addition, government agencies are motivated by the need to spur economic growth while avoiding the exhaustion of available timber.
A broad range of spatial optimization procedures exists for integrating multiple, often conflicting objectives that can exist across spatial scales. Heuristic modeling techniques represent the most recognized collection of approaches because of their ability to produce feasible solutions to large-scale spatial problems (Baskent & Keles, 2005). Such methods include simulated annealing (Baskent & Jordan, 2002; Ohman & Lamas, 2003), tabu search (Caro et al., 2003; Richards & Gunn, 2003) and genetic algorithms (Ducheyne et al., 2006; Venema et al., 2005), all of which evaluate sets of spatial patterns with the aim of improving forest harvesting strategies to meet different objectives. However, implementing spatial optimization procedures can prove challenging when the patterns resulting from harvesting processes are largely dictated by complex interactions between stakeholders and various system components (Cerda & Mitchell, 2004). Economic markets, ecological processes, and political and social dynamics introduce uncertainty into our ability to implement strategic plans (Nelson, 2003). As such, the dynamics that shape the forest harvesting process can be in direct conflict with spatially optimal patterns derived from heuristic modeling techniques.
In order to investigate the dynamics of forest management, agent-based modeling (ABM) offers a simulation approach in which computer agents represent the decision-making behaviors of individual entities that influence their surrounding environment (Brown et al., 2004; Parker et al., 2003). Agents can possess different strategies for responding to the actions of other agents and to the dynamic components of the landscape. Positive feedbacks form as agent actions cause a system to become entrenched along a specific trajectory, while negative feedbacks can exist due to constraints that prevent a system from entering certain states (Manson, 2006a). As a consequence, the results of an agent-based model are viewed as patterns that emerge from the various dynamics of the system (Li, Brimicombe, & Li, 2008). The ability to simulate the behaviors of and complex interactions between humans has led to the widespread use of ABM for modeling a variety of spatial phenomena, including urban dynamics (Brown et al., 2008; Guzy et al., 2008; Li & Liu, 2007; Maoh & Kanaroglou, 2007; Xie et al., 2007; Yin & Muller, 2007), agricultural land use transition (Acosta-Michlik & Espaldon, 2008; Bakker & van Doorn, 2009; Millington et al., 2008) and human mobility (Batty, 2001). Such applications focus on determining how specific human behaviors react to and contend with their surrounding spatial structures and produce different landscape patterns over time. In addition, ABM has been employed to examine how various policies dictate human behavior and the consequent patterns that arise. This includes implementing ABM for evaluating sustainable land use planning strategies (Li & Liu, 2008; Zellner et al., 2008), zoning policies (Zellner et al., 2009), water use regulations (Smajgl, Morris, & Heckbert, 2009) and conservation programs (Hartig & Drechsler, 2009; Janssen et al., 2000; Sengupta et al., 2005).
From a forest management perspective, ABM is a useful approach for simulating the behavior of stakeholders such as forest companies, conservation groups and government agencies in order to evaluate how the interactions amongst these interest groups lead to the emergence of different forest harvesting patterns (Purnomo, Mendoza, Prabhu, & Yasmi, 2005). The utility of ABM is evident in the number of applications for simulating forest-related processes that have surfaced in the literature in recent years. Examples include the use of ABM for simulating the emerging patterns of forest–agriculture transition (Bithell & Brasington, 2009; Castella et al., 2005; Deadman et al., 2004; Evans & Kelley, 2008) and multi-stakeholder management of tropical forests (Purnomo & Guizol, 2006; Purnomo et al., 2005). Yet challenges remain in translating the information gained from an agent-based model into practical forest management strategies, owing to the perception that ABM is largely a mechanism for exploring system dynamics rather than providing predictive results (Brown, Aspinall, & Bennett, 2006). This perception stems from the fact that simulating system complexity often involves various stochastic components that can lead a single model to produce numerous results, all of which are plausible outcomes of a system process (Batty & Torrens, 2005). It is difficult to determine which outcome is most representative of the process given what the agents are trying to achieve, and how, if at all, the generated harvesting patterns satisfy the objectives of the different agents. As such, a paradox exists in the attempt to develop computational models for assisting forest management, because of our conflicting desires to generate optimal spatial patterns while acknowledging the spatial and temporal complexities of the system.
The objective of this study is to address this optimization-complexity paradox through the development and validation of a model for multi-stakeholder forest management that integrates ABM and reinforcement learning (RL). RL is a computational approach stemming from the machine learning and artificial intelligence literature that improves model outcomes by providing numeric reinforcing rewards to those actions in a system that lead towards the achievement of a set of defined objectives (Barto et al., 1981; Sutton, 1988). In this study, RL provides a means to incorporate optimization procedures into an agent-based model that allows agents to interact with each other and their environment while learning how to improve their decision-making behavior. RL algorithms evaluate landscape patterns and relay information to the agents that describes where and when forest harvesting strategies should take place in order for them to achieve their objectives. Furthermore, the RL agent-based model is parameterized as a multi-objective optimization model, which facilitates the use of traditional multi-objective evaluation methods for validating the ability of the model to produce optimal results given the complexity of the system. While agent-based models have previously been integrated with artificial intelligence for spatial applications such as optimal site selection problems (Li, He, & Liu, 2009), simulating animal migration within natural landscapes (Bennett & Tang, 2006) and modeling agricultural land use decision-making (Manson, 2006b), this study offers a novel approach for bridging complexity and optimization by explicitly and independently representing the knowledge acquisition of each agent in order to simulate the interactions and learning of different stakeholders in a forest management context.
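As a concrete illustration of the reinforcement learning idea described above, the following sketch shows a tabular, epsilon-greedy value update in which an agent learns which forest stand yields the highest reward. All stand values, parameters and function names are hypothetical assumptions for illustration; this is not the authors' algorithm.

```python
import random

# Minimal tabular RL sketch (hypothetical values and parameters): an agent
# repeatedly selects a forest stand, receives a noisy numeric reward, and
# incrementally updates its value estimate for that stand.
def run_episodes(n_stands=5, n_episodes=500, alpha=0.1, epsilon=0.2, seed=42):
    rng = random.Random(seed)
    # Illustrative expected reward per stand (e.g., timber value net of costs).
    true_reward = [0.1, 0.5, 0.2, 0.9, 0.3]
    q = [0.0] * n_stands  # learned value estimate for harvesting each stand
    for _ in range(n_episodes):
        # Epsilon-greedy: mostly exploit the best-known stand, sometimes explore.
        if rng.random() < epsilon:
            stand = rng.randrange(n_stands)
        else:
            stand = max(range(n_stands), key=lambda s: q[s])
        reward = true_reward[stand] + rng.gauss(0, 0.05)  # noisy feedback
        q[stand] += alpha * (reward - q[stand])           # incremental update
    return q

q = run_episodes()
best_stand = max(range(len(q)), key=lambda s: q[s])
```

Over repeated episodes the value estimates converge towards the more rewarding stands; this reward-driven updating is the mechanism by which RL steers agents towards harvesting strategies that serve their objectives.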
Section snippets
The RL–ABM multi-stakeholder framework
The framework for integrating RL and ABM for multi-stakeholder forest management is presented in Fig. 1. Agents representing forest companies (F) harvest trees in the landscape based on the availability and price of timber for a specified number of time steps. The period from the first to the final time step is referred to as an episode; the forest harvesting pattern resulting from an episode represents a single harvesting solution. For the first episode, the forest company agents have no
Methods
The model developed in this study consists of three types of stakeholder agents that interact in a forest landscape. Forest dynamics are represented by tree growth and fluctuating timber prices. The RL algorithms ensure that the stakeholder agents learn to contend with forest changes and the actions of other stakeholders when attempting to achieve their objectives.
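The forest dynamics outlined above, tree growth, fluctuating timber prices and a harvesting agent, can be sketched as a simple episode loop. The class names, growth rate, price model and regeneration rule below are illustrative assumptions, not the authors' implementation.

```python
import random

class Stand:
    """One forest stand with a standing timber volume (arbitrary units)."""
    def __init__(self, volume):
        self.volume = volume

    def grow(self, rate=0.02):
        self.volume *= 1 + rate  # simple exponential growth per time step

def run_episode(n_stands=10, n_steps=20, seed=1):
    """One episode: each time step the price fluctuates, a forest-company
    agent harvests the most valuable stand, and all stands grow."""
    rng = random.Random(seed)
    stands = [Stand(rng.uniform(50, 150)) for _ in range(n_stands)]
    price = 1.0
    harvested = []  # the episode's harvesting solution: (stand, revenue) pairs
    for _ in range(n_steps):
        price *= rng.uniform(0.95, 1.05)  # fluctuating timber price
        target = max(range(n_stands), key=lambda i: stands[i].volume)
        harvested.append((target, stands[target].volume * price))
        stands[target].volume = rng.uniform(5.0, 10.0)  # clear-cut, then regrowth
        for stand in stands:
            stand.grow()
    return harvested

solution = run_episode()
```

The list of (stand, revenue) pairs returned by one episode corresponds to a single harvesting solution in the framework's terminology; RL rewards would then be computed from such a solution.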
Model implementation
The model is implemented through the simulation of forest harvesting in the Chilliwack Forest District in southwestern British Columbia. The area provides opportunities for harvesting due to the availability of desired timber and proximity to timber processing and shipping locations. However, the area also lies within the habitat of the Northern Spotted Owl, a species that has been placed on Canada’s Endangered Species list due to declining populations as a result of habitat loss.
The study
Results
The model was run for 10,000 episodes, as this led to each stand being selected at least 50 times, which was found to provide a sufficient level of sampling while avoiding the simulation of additional episodes that produced no significant change in the results.
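The stopping criterion reported above (run until every stand has been selected at least 50 times) can be expressed as a simple coverage check. The uniform-random selection below is a stand-in for the agents' actual learned choices, and all parameter values are illustrative.

```python
import random

def episodes_until_coverage(n_stands, min_count, seed=0, max_episodes=1_000_000):
    """Run episodes until every stand has been selected at least min_count
    times; selection is uniform-random purely for illustration."""
    rng = random.Random(seed)
    counts = [0] * n_stands
    for episode in range(1, max_episodes + 1):
        counts[rng.randrange(n_stands)] += 1  # one stand selected per episode
        if min(counts) >= min_count:
            return episode
    return max_episodes

n_episodes = episodes_until_coverage(n_stands=100, min_count=50)
```

Because the least-visited stand governs the stopping rule, the episode count required always exceeds n_stands × min_count, which is why a fixed budget well above that floor (here, the study's 10,000 episodes) is a practical choice.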
Discussion
The results from this study reveal that agent behavior influences the ability of the agents to learn forest harvesting patterns that are beneficial for achieving their objectives. However, it can be safely concluded that specific changes in agent behavior are not directly manifested in the results, as the forest companies’ willingness to cooperate does not always lead to an improved outcome for the conservationist. Information extracted from the comparison of solutions, the outcomes
Conclusion
The model developed in this study provides three outputs of practical use to forest management. The first is the harvesting solution generated in the final episode of the model, which depicts the optimal decision-making behavior of the forest company agents given their interactions with the conservationist and the government agents. Decision makers can utilize such information to determine whether the resulting spatial patterns conflict with management policies that dictate the
Acknowledgements
The authors would like to thank the Natural Sciences and Engineering Research Council of Canada (NSERC) for full support of this study under the Canadian Graduate Scholarship awarded to the first author and the Discovery Grant awarded to the second author. Acknowledgement is also given to the Government of British Columbia for providing the British Columbia Forest Cover Data.
References (49)
- et al. Assessing vulnerability of selected farming communities in the Philippines based on a behavioural model of agent’s adaptation to global environmental change. Global Environmental Change-Human and Policy Dimensions (2008)
- et al. Farmer-specific relationships between land use change and landscape factors: Introducing agents in empirical land use modelling. Land Use Policy (2009)
- et al. Forest landscape management modeling using simulated annealing. Forest Ecology and Management (2002)
- et al. Spatial forest planning: A review. Ecological Modelling (2005)
- et al. Modelling and prediction in a complex world. Futures (2005)
- et al. Spatial forest plan development with ecological and economic goals. Ecological Modelling (2003)
- et al. Coupling agent-based models of subsistence farming with individual-based forest models and dynamic models of water distribution. Environmental Modelling and Software (2009)
- et al. Agent-based and analytical modeling to evaluate the effectiveness of greenbelts. Environmental Modelling and Software (2004)
- et al. Exurbia from the bottom-up: Confronting empirical challenges to characterizing a complex system. Geoforum (2008)
- et al. Agrarian transition and lowland-upland interactions in mountain areas in northern Vietnam: Application of a multi-agent simulation model. Agricultural Systems (2005)
- Assessing the transition from deforestation to forest regrowth with an agent-based model of land cover change for south-central Indiana (USA). Geoforum
- Smart spatial incentives for market-based conservation. Biological Conservation
- An adaptive agent model for analysing co-evolution of management and policies in a complex rangeland system. Ecological Modelling
- Socioecological landscape planning approach and multicriteria acceptability analysis in multiple-purpose forest management. Forest Policy and Economics
- Agent-based services for the validation and calibration of multi-agent models. Computers, Environment and Urban Systems
- Defining agents’ behaviors to simulate complex residential development using multicriteria evaluation. Journal of Environmental Management
- Land use in the southern Yucatan peninsular region of Mexico: Scenarios of population and institutional change. Computers, Environment and Urban Systems
- Clustering of harvest activities in multi-objective long-term forest planning. Forest Ecology and Management
- Simulating forest plantation co-management with a multi-agent system. Mathematical and Computer Modelling
- Integrating stand and landscape decisions for multi-purposes of forest harvesting. Forest Ecology and Management
- Forest optimization using evolutionary programming and landscape ecology metrics. European Journal of Operational Research
- Interactive evolutionary approaches to multiobjective spatial decision making: A synthetic review. Computers, Environment and Urban Systems
- The emergence of zoning policy games in exurban jurisdictions: Informing collective action theory. Land Use Policy
- A new framework for urban sustainability assessments: Linking complexity, information and policy. Computers, Environment and Urban Systems