Innovative Applications of O.R.
Multi-period orienteering with uncertain adoption likelihood and waiting at customers
Introduction
In many industries, the challenges of higher costs, shrinking margins, and increased customer expectations require manufacturers to adopt service-based strategies that can provide a differentiated source of value to customers (Bourne, 2016). As a result, many companies are incorporating service operations into product offerings more than in the past, with businesses shifting from providing only products to providing product-service bundles (Dibble, 2016). One way that companies pair service with their products is through a knowledgeable sales force that helps match customers with the right products and trains customers in the products’ use. Such a business model requires companies to rely heavily on salespeople in developing customer relationships, increasing the importance of sales force management.
In this study, we consider the problem in which a salesperson must determine which customers to visit and when to visit the customers over a multi-period horizon to influence the likelihood of product adoption. Each meeting between the salesperson and a customer may increase the customer’s adoption likelihood by increasing the customer’s awareness and knowledge of the product. Thus, a salesperson may meet with a customer in multiple periods in an attempt to develop a stronger relationship that could lead to a greater chance of adoption. This problem is applicable to the sales of pharmaceuticals, textbooks, insurance, and investments. For example, to promote medications, pharmaceutical representatives may try to develop long-term relationships with physicians. To do so, the representatives visit physicians’ offices and offer complimentary drug samples, leaflets, and books (Schramm et al., 2007). Although skeptical physicians are persuaded more by evidence than charm, a better relationship between the representative and the physicians usually leads to more sales (Fugh-Berman & Ahari, 2007). Similarly, a textbook salesperson will visit faculty members on campus to build strong relationships in order to gain textbook adoptions. In the finance industry, both insurance sales representatives and the salespeople in investment banks carefully manage client relationships to secure clients.
We assume that the likelihood that a customer will adopt the salesperson’s product or service stochastically evolves from period to period and may be influenced by whether the salesperson meets with the customer. We base this assumption on the observation that, while the customer is collecting more information on the product or service from the salesperson during the decision-making process, the salesperson’s competitors may approach the customer and offer similar products or services. Thus, the customer is subject to the uncertain effects of competing sales forces (the quality of competitors’ offers and the persuasiveness of competitors’ marketing efforts). Additionally, the customer’s decision may be influenced by peers. For instance, when deciding whether to adopt a new textbook, a faculty member may consider suggestions from colleagues who have previously taught the same course. A physician’s decision on prescribing a drug may be changed by a conversation with another physician or a case presented at a recent conference.
We also assume that each customer’s current adoption probability is perfectly observed by the salesperson. For experienced sales representatives, accurate observations of how likely customers are to adopt the product or service can be based on communications with the customers. At the end of the problem horizon, each customer makes their product adoption decision and selects the salesperson’s product with a likelihood equal to the terminal value of the customer’s adoption probability.
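The terminal adoption rule can be illustrated with a short sketch. It assumes, beyond what is stated above, that customers decide independently and that the quantity of interest is the expected number of adoptions; the function names `expected_adoptions` and `simulate_adoptions` are hypothetical.

```python
import random
from typing import List


def expected_adoptions(terminal_probs: List[float]) -> float:
    """Expected number of adoptions.

    Each customer adopts with probability equal to their terminal adoption
    probability, so by linearity of expectation the expected count is the sum.
    """
    return sum(terminal_probs)


def simulate_adoptions(terminal_probs: List[float], rng: random.Random) -> int:
    """Draw one realization of the end-of-horizon adoption decisions.

    Assumption (not from the paper): customers decide independently.
    """
    return sum(1 for p in terminal_probs if rng.random() < p)


if __name__ == "__main__":
    probs = [0.5, 0.25, 0.25]  # hypothetical terminal adoption probabilities
    print(expected_adoptions(probs))
    print(simulate_adoptions(probs, random.Random(0)))
```

Under this reading, a meeting has value only through its effect on the terminal probabilities, which is why the timing of meetings across periods matters.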
To meet with a customer in a period, the salesperson must arrive within the customer’s time window. However, simply visiting a customer and arriving within a customer’s time window does not guarantee that the salesperson will be able to meet the customer. Upon arrival at a customer, the salesperson may observe a queue, in which case she has to wait if she wants to meet the customer. The queue, whose length is unknown to the salesperson before her arrival, may consist of competing salespeople (as is often the case in pharmaceutical sales) or may include the customer’s other constituents (such as students or co-workers, as is often the case in textbook sales). Upon arriving and observing a queue, the salesperson needs to decide whether to join the queue or to balk. After joining a queue, the salesperson needs to decide whether to continue waiting in the queue or to renege and go to another customer location. Due to the uncertain wait time in a customer’s queue, the time window may end before the salesperson meets with the customer. In this case, the salesperson makes no impact on the customer’s adoption probability as there is no meeting. The salesperson then must decide which customer to visit next. Thus, we make a distinction between visiting and meeting with customers. A visit occurs when a salesperson travels to a customer’s location. A meeting occurs only if the salesperson finds herself at the front of the queue and enters service before the end of the customer’s time window.
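The distinction between a visit and a meeting can be sketched as a simple check. The function and parameter names below are hypothetical. The sketch also treats the queue wait as a known number, whereas in the problem the wait is uncertain and only partially revealed on arrival, and it counts entering service exactly at the window's end as a meeting, which is an assumption.

```python
from typing import Optional


def meeting_start(arrival: float, window_open: float, window_close: float,
                  queue_wait: float) -> Optional[float]:
    """Return the time service begins, or None if the visit yields no meeting.

    A visit becomes a meeting only if the salesperson arrives within the
    time window and reaches the front of the queue before the window ends.
    """
    # Arriving outside the time window: a visit, but no meeting is possible.
    if not (window_open <= arrival <= window_close):
        return None
    start = arrival + queue_wait
    # The window closes while the salesperson is still waiting in the queue.
    if start > window_close:
        return None
    return start


if __name__ == "__main__":
    print(meeting_start(9.0, 8.0, 12.0, 1.5))   # meeting begins at 10.5
    print(meeting_start(11.0, 8.0, 12.0, 2.0))  # None: window ends during the wait
```

In the second call the salesperson has visited the customer but made no impact on the adoption probability, which is the situation that makes balking and reneging decisions worthwhile.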
Obtaining a solution to the problem considered in this paper is complicated by a large state space, a challenge that arises in many sequential decision making problems with uncertainty. However, the complexity of our problem is further compounded by a large action space, as the set of actions available at each state corresponds to a set of solutions to a single-period stochastic orienteering problem. To overcome this challenge, we develop a two-stage decomposition heuristic. In the first stage, we solve an assignment problem to determine the subset of customers to schedule in the current period. Although the assignment problem is a relaxation of an orienteering problem as it only specifies which customers to visit and not the order in which to visit them, we demonstrate that solving this assignment formulation complemented with additional constraints capably performs customer subset selection. These additional constraints leverage the customer locations and time windows to identify promising subsets of customers. Further, through these constraints, the first stage of the heuristic explicitly accounts for the evolution of customer acceptance probabilities. In the second stage, we complete the specification of the action for the current period by solving a routing problem to determine the order in which to visit the customers selected by the assignment problem.
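The select-then-sequence structure of the decomposition can be illustrated with a toy sketch. This is not the authors' formulation: the first stage below is a greedy top-k selection standing in for the constrained assignment problem, and the second stage is a nearest-neighbor ordering standing in for the routing problem. The `scores` (a hypothetical per-customer estimate of the benefit of a meeting), the `capacity` limit, and all function names are assumptions made for illustration, and time windows and queues are ignored.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def select_customers(scores: Dict[str, float], capacity: int) -> List[str]:
    """Stage 1 stand-in: keep the `capacity` highest-scoring customers.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:capacity]


def order_customers(origin: Point, locations: Dict[str, Point],
                    selected: List[str]) -> List[str]:
    """Stage 2 stand-in: nearest-neighbor visiting order from the origin."""
    route: List[str] = []
    current = origin
    remaining = set(selected)
    while remaining:
        # Move to the closest unvisited customer (Euclidean distance).
        nxt = min(remaining,
                  key=lambda c: (math.dist(current, locations[c]), c))
        route.append(nxt)
        remaining.remove(nxt)
        current = locations[nxt]
    return route


def plan_period(origin: Point, locations: Dict[str, Point],
                scores: Dict[str, float], capacity: int) -> List[str]:
    """Compose the two stages: choose a subset, then sequence it."""
    return order_customers(origin, locations,
                           select_customers(scores, capacity))


if __name__ == "__main__":
    locations = {"A": (5.0, 0.0), "B": (1.0, 0.0),
                 "C": (2.0, 0.0), "D": (9.0, 9.0)}
    scores = {"A": 0.3, "B": 0.2, "C": 0.05, "D": 0.25}
    print(plan_period((0.0, 0.0), locations, scores, capacity=3))
```

The point of the sketch is only the division of labor: the first stage never looks at visit order, so any information about geography and time windows has to reach it through additional constraints, as the paragraph above describes.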
Our study makes multiple contributions to the research literature. First, we introduce a new multi-period, stochastic orienteering problem motivated by industry practice. To the best of our knowledge, this is the first study in the literature to address a stochastic multi-period orienteering problem, particularly one in which the periods are linked by decisions made in previous periods. Second, we develop a tractable methodology to effectively solve the problem by considering a relaxed form of the Bellman equation. Our computational results suggest that our two-stage decomposition approach complemented with additional first-stage constraints is a viable way of overcoming the curse of dimensionality. The results also demonstrate the value of explicitly considering the evolution of customer adoption probabilities when scheduling customers.
In Section 2, we review the related literature to position our contributions. In Section 3, we present our MDP model. In Section 4, we propose a heuristic approach, which consists of an assignment procedure and a routing process. We present the heuristic algorithm in Section 4.1 and two alternatives for the assignment procedure in Section 4.2. In Section 4.3, we present an a priori routing policy for routing the scheduled customers. In Section 5, we compare the two approaches for assigning customers via our computational experiments. We conclude the paper in Section 6 with a summary and discussion of future work.
Related literature
This paper considers a variant of the orienteering problem (OP). First introduced by Tsiligirides (1984), the OP is a routing problem in which the traveler selects a subset of known customers to visit. The goal of the problem is to select and visit customers such that the sum of the accumulated rewards from the tour is maximized. The OP with time windows (OPTW) introduces the requirement that the traveler must arrive within each customer’s time window to collect the customer’s reward. Most
Model formulation
In our problem, a salesperson must determine which customers to visit and the sequence in which to visit them for each period of a T-period discrete-time horizon. We consider a set of geographically dispersed customers described by a complete graph whose node set consists of the salesperson’s origin and n potential customers that the salesperson may visit and whose edge set consists of edges associated with each pair of nodes. We assume the travel time on edge (i, j) is
Solution approach: a two-stage heuristic
The classical approach for finding an optimal policy to an MDP is to use backward dynamic programming to solve the Bellman equation associated with the MDP. However, as described in Section 3, the size of the state space in our MDP model is prohibitively large. A common way in the approximate dynamic programming literature to handle such a problem is to focus the solution method on stepping forward in time and using some approximation of the future values (see Powell, 2011). The
Computational experiments
To test the effectiveness of the two-stage heuristic and gain insight into our multi-period orienteering problem, we present computational experiments. We describe problem instances generated from existing benchmark data sets in Section 5.1. We implement Algorithms 1 and 2 in C++ and execute the experiments on a high-performance computing system with 2.6 GHz Intel Xeon processor cores. We use Gurobi 5.6.3 and its C++ API to solve the assignment problem in the lookahead approach. In Section 5.2,
Conclusion and future work
This study introduces a multi-period stochastic orienteering problem in which a salesperson must decide in which period(s) to visit customers and in what sequence to visit them in an effort to obtain customer adoptions. We propose an MDP model in which the state includes each customer’s adoption likelihood. To address the intractable state and action spaces, we propose a two-stage heuristic that first solves an assignment problem to select customers for a period, and then solves an orienteering
Acknowledgement
This work is partially supported by the National Natural Science Foundation of China (Nos. 71971032 and 71601024), the Humanity and Social Science Foundation of Chinese Ministry of Education (No. 16YJC630169), and the Fundamental Research Funds for the Central Universities (Nos. 2019CDSKXYJG0037, 2018CDXYJG0040 and 2018CDQYJSJ0024).