Abstract
In recent years, mobile crowdsourcing has become part of people’s daily lives. A variety of mobile crowdsourcing services, such as Gigwalk, Foursquare, and Uber, have emerged and are widely used. Because task distribution and workers’ trajectories are uncertain, and workers’ interests and capabilities are diverse, it is crucial to predict mobile workers’ trajectories effectively, so that workers are willing to travel to a task’s location and perform the task with as little travel and time cost as possible. In this paper, we propose a context-sensitive approach for predicting workers’ moving paths in mobile crowdsourcing services. When assigning spatial tasks on a crowdsourcing platform, we predict a worker’s upcoming location from movement rules, real-time perception of the worker’s moving path, and context, and then push a task to the workers who will enter its region before the task’s deadline. Our location prediction method helps workers avoid extra costs, such as time and travel charges, when performing tasks. Analysis and simulation experiments on real data sets show that the method can effectively predict a worker’s location and achieves better results in task assignment and completion.
1 Introduction
Jeff Howe defines “crowdsourcing” as the practice in which a company or organization outsources tasks that were formerly performed by employees to a non-specific public on the network, on a voluntary basis [3]. Mobile crowdsourcing services extend the traditional crowdsourcing pattern to mobile space: workers are no longer required to perform tasks on a fixed web platform, but additional constraints of time and location apply. The core issue is task assignment. Mobile crowdsourcing task assignment aims at assigning spatial tasks (i.e., tasks related to location and time) to a set of workers [4], who complete them individually or cooperatively while meeting the time, location, and other constraints of the task [1, 2, 5].
With the rapid development of computer networks and the growing number of smart mobile terminals, a mobile crowdsourcing platform can receive tasks from publishers, and requests from workers to perform tasks, anytime and anywhere. This places high demands on the platform’s ability to allocate tasks promptly and to adapt dynamically. Today, the majority of task assignment strategies simply assign tasks to workers near the task, without paying attention to how the workers’ trajectories and locations change. If workers are moving away from the location of the task, they will probably refuse to accept it. Because such workers will end up far from the task’s location, the platform needs to pay extra to encourage them to perform it. This not only reduces the success rate of task assignment, but also increases the time cost of workers and the additional cost of the platform.
Diverse data from different sources and of different types hide much valuable information, and a user’s historical track data are no exception. By analyzing these historical data, we can learn about users’ behaviors, interests, and preferences. We predict the region a user will be in from the user’s current location and information, and then allocate tasks in that region to the user.
When predicting, this paper is concerned not only with the binary “user-location” relation, but also with the context information of the user (such as time, task location, and weather), forming a “context-user-location” relationship. This enables us to discover and use context information automatically when predicting, and to satisfy users’ personalized needs as they change with the context. For example, after work, users are more willing to exercise first and then eat than to have dinner first. Compared with workdays, users are more willing to go to an entertainment plaza on weekends. In this paper, time is divided into workday and weekend; it is regarded as temporal context information and is integrated into the user’s location prediction process. Fusing the context information into the location prediction algorithm in this way accords with the practical meaning of contextual information and, at the same time, substantially helps location prediction and improves its accuracy.
The main contributions of this paper are as follows:
(1) Based on the discrete historical data, we mine the context dependent user movement pattern.
(2) Based on the context dependent user movement pattern, we propose a location prediction algorithm for mobile crowdsourcing workers, which can provide support for spatial task assignment.
(3) Based on the experiments on real data sets, we verify the validity and accuracy of the proposed mobile user location prediction algorithm.
The rest of this paper is organized as follows. Sect. 2 discusses related work; Sect. 3 gives the definitions; Sect. 4 proposes the method and describes the specific algorithms with examples; Sect. 5 shows the experimental results; and Sect. 6 concludes the paper.
2 Related Work
There are usually two approaches to location prediction. The first predicts the user’s location from the user’s last access point by calculating transition probabilities. The Markov and hidden Markov algorithms in [10, 12] are used for location prediction in combination with user and time matching; they depend only on the transition probability from the previous location to the current one. Paper [7] uses the ramble algorithm and the Markov algorithm together for prediction, with the user’s access path and the time interval as additional influencing factors. Paper [6] predicts a user’s future cell based on the user’s current cell. Although the user’s current location is the most important information for predicting the future location, prediction performance can improve greatly if the locations the user visited in the preceding period are also taken into account.
The second approach collects historical location points to predict the location. Paper [13] models the locations of historical activities and takes the moving trend of the user as an important factor in location prediction. The SPM (Sampled Pattern Matching) algorithm [8] and the PPM (Prediction by Partial Matching) algorithm [9] are also based on the Markov model for trajectory prediction. These algorithms extend Markov-based trajectory prediction and improve either the prediction accuracy or the time and space complexity. However, some problems remain; for example, simple Markov models use too little historical information, which limits their prediction accuracy.
Li et al. [14] used the historical trajectories of workers to recommend a route that contains as many tasks as possible. Their approach addresses the problem of dynamic path planning and can update the route promptly when new tasks arrive. However, it does not take into account the influence of contextual information on workers, so workers may refuse to accept the recommended route.
In this paper, we improve the location prediction method proposed in [13]. We add the influence of context to the prediction of movement patterns, taking into account the differences in user movement patterns between weekends/holidays and workdays, and extract movement rules from context-sensitive movement patterns to improve the accuracy and adaptability of location prediction.
3 Problem Definition
To aid understanding, this section introduces the definitions used by the method in this paper (Table 1).
Definition 1
Workers’ Movement Patterns (WMPs). A context-dependent movement pattern (WMP) is a sequence of region numbers that indicates the regions a worker has visited one after another in a day, expressed as \(W^{mp}(w)\) = (\(<(r_{1}, t_{1}), (r_{2},t_{2}), \dots ,(r_{n},t_{n})>\),C,supp). Movement patterns describe the trajectories of workers in daily life. C is the context information; this paper mainly considers the time context, and C is either workday or weekend. supp is the support, which measures how likely a route is to appear in the user’s historical trajectories, \(supp\) \(\ge \)0. We follow the Apriori algorithm for the calculation and threshold setting. In this paper, the support threshold is set to 1.33.
Definition 2
Workers’ Movement Rules (WMRs). A movement rule (WMR) describes the transfer relationship between the regions that workers arrive at, expressed as \(W^{mr}(w)\) = \(<(r_{1}, t_{1}), (r_{2},t_{2}),\dots , (r_{k-1},t_{k-1})>\) \(\rightarrow \) \(<(r_{k},t_{k})>\). The rule head \(<(r_{1}, t_{1}), (r_{2},t_{2}), \dots , (r_{k-1},t_{k-1})>\) represents the worker’s current trajectory, and the rule tail \(<(r_{k},t_{k})>\) represents the region where the worker will arrive with the greatest probability. Movement rules are derived from movement patterns. The following table gives an example of a set of movement rules:
4 Location Prediction Based on the Mining of Movement Rules
The location prediction process is shown in Fig. 1.
4.1 Generate Regions
The method proposed in this paper is based on regional prediction. All discrete location points in the history log of the mobile crowdsourcing platform are first clustered into regions, so that transfers between locations in a worker’s historical trajectory become transfers between regions. We assume that the locations of all tasks in this paper also fall within these regions. The location points are aggregated into regions with the K-Means algorithm, which realizes the transfer from points to regions.
We choose the K-Means algorithm because the location points are discrete and relatively sparse. The algorithm is simple, is efficient on large data sets, and has low time and space complexity.
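The clustering step can be sketched as follows. This is a minimal illustration of Lloyd’s K-Means iteration, not the implementation used in the paper; the function name `kmeans`, the random initialisation, and the use of plain Euclidean distance on (latitude, longitude) pairs are all assumptions made for brevity.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster (lat, lon) points into k regions with Lloyd's algorithm.

    Returns (centroids, labels); labels[i] is the region number of points[i].
    Euclidean distance on raw coordinates is an assumed simplification.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return centroids, labels
```

After clustering, each check-in in a worker’s history can be replaced by its region number `labels[i]`, which turns a trajectory of points into a trajectory of regions.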
4.2 Mining Workers’ Movement Patterns
In this section, we detail how to mine workers’ movement patterns, following the Apriori algorithm. Given the actual routes of multiple workers, the first step is to determine the time context; the workday and weekend movement patterns are then mined separately. First, we obtain a candidate pattern set \(C_1\) of length 1, calculate the support of each candidate, and add it to the movement pattern set \(L_1\) of length 1 if its support is greater than the threshold of 1.33 set in this paper. Next, we observe which regions can be reached directly from each region in \(L_1\) and append their region numbers to form a candidate pattern set \(C_2\) of length 2. We then calculate the support and add the candidates whose support is greater than the threshold to the movement pattern set \(L_2\) of length 2. We continue to generate movement pattern sets in this way until no candidate remains. Finally, the pattern sets are combined. Table 2 gives an example of workers’ context-dependent movement patterns (WD represents workday, and PD represents weekend).
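The level-wise procedure above can be sketched as follows. This is an illustrative reading rather than the paper’s implementation: trajectories are reduced to region numbers (timestamps are dropped), and support is assumed to be the number of daily trajectories that contain the pattern as a contiguous run, so the 1.33 threshold behaves like “at least two occurrences”. The function names are hypothetical.

```python
from collections import defaultdict


def contains(trajectory, pattern):
    """True if `pattern` occurs as a contiguous run inside `trajectory`."""
    n, m = len(trajectory), len(pattern)
    return any(tuple(trajectory[i:i + m]) == pattern for i in range(n - m + 1))


def mine_patterns(trajectories, min_supp=1.33):
    """Apriori-style mining for one context; returns {pattern: support}."""
    def support(pattern):
        # Assumed support: number of trajectories containing the pattern.
        return sum(contains(t, pattern) for t in trajectories)

    # Regions directly reachable from each region in the observed data.
    successors = defaultdict(set)
    for t in trajectories:
        for a, b in zip(t, t[1:]):
            successors[a].add(b)

    candidates = {(r,) for t in trajectories for r in t}  # C_1
    frequent = {}
    while candidates:
        level = {}  # L_k: candidates whose support exceeds the threshold
        for pattern in candidates:
            s = support(pattern)
            if s > min_supp:
                level[pattern] = s
        frequent.update(level)
        # C_{k+1}: extend each pattern in L_k by a directly reachable region.
        candidates = {p + (r,) for p in level for r in successors[p[-1]]}
    return frequent


def mine_by_context(labelled_trajectories, min_supp=1.33):
    """Mine patterns separately per context from (context, trajectory) pairs."""
    groups = defaultdict(list)
    for context, trajectory in labelled_trajectories:
        groups[context].append(trajectory)
    return {c: mine_patterns(ts, min_supp) for c, ts in groups.items()}
```

The loop terminates because a candidate longer than every trajectory has support 0 and is discarded.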
4.3 Generate Movement Rules
For a rule R:\(<(r_{1},t_{1}),(r_{2},t_{2}),\dots ,(r_{k-1},t_{k-1})>\) \(\rightarrow \) \(<(r_{k},t_{k})>\), confidence is defined using the following formula:
If the confidence of a rule is higher than the pre-set confidence threshold (\(coff_{min}\)), it will be selected for the next regional prediction phase. Since the movement pattern is extracted based on different contexts (workday/weekend), each movement rule also needs a contextual label to indicate a specific context.
Assuming a confidence threshold of 50, the resulting set of movement rules is shown in Table 3 (WD represents workday, and PD represents weekend).
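Rule generation can be sketched as follows. The confidence formula is not reproduced in this text, so the sketch assumes the usual association-rule form expressed as a percentage, \(conf(R) = supp(<r_{1},\dots ,r_{k}>) / supp(<r_{1},\dots ,r_{k-1}>) \times 100\), which is consistent with a threshold of 50 but is not confirmed by the source. The function name and the dictionary layout of a rule are hypothetical.

```python
def generate_rules(patterns, context, min_conf=50.0):
    """Turn mined patterns into movement rules for one context.

    patterns: {tuple of region numbers: support}.
    Returns a list of dicts with keys head, tail, confidence, context.
    """
    rules = []
    for pattern, supp in patterns.items():
        if len(pattern) < 2:
            continue  # a rule needs a non-empty head and a tail
        head, tail = pattern[:-1], pattern[-1]
        head_supp = patterns.get(head)
        if not head_supp:
            continue  # head was not mined, so confidence is undefined
        # Assumed confidence: support of the whole pattern over support
        # of its head, as a percentage.
        confidence = 100.0 * supp / head_supp
        if confidence > min_conf:
            rules.append({
                "head": head,
                "tail": tail,
                "confidence": confidence,
                "context": context,  # label so prediction can filter by context
            })
    return rules
```

For example, with supports of 3 for the pattern (1, 2) and 2 for (1, 2, 3), the rule (1, 2) → 3 would receive a confidence of about 66.7 under this assumption.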
4.4 Predict Workers’ Regions
Prediction of regions is the last stage, and the pseudocode of the algorithm is described below:
While scanning the rules, if a rule’s context label does not match the current context, the rule is skipped and the next rule is scanned, which improves the efficiency of the algorithm. The matching rules are then sorted first by the length of the rule head and then by confidence. This ensures that the prediction is based on the longest matching sequence whenever possible, which improves the accuracy of the prediction.
After the location of a worker is predicted, the tasks in the predicted region are recommended to the worker. The purpose of task assignment is to achieve local optimization by allocating the maximum number of tasks within a period of time. By taking changes in the worker’s movement trajectory into account before allocating a task, we avoid extra time and travel cost for the worker. This increases the success rate of task assignment, maximizes the number of assigned tasks, and reduces the cost of the platform.
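The scan-and-sort procedure described above can be sketched as follows. This is a hedged illustration, not the paper’s pseudocode: it assumes a rule matches when its head equals the most recently visited regions of the worker’s current trajectory, and it reuses a hypothetical rule layout with the keys `head`, `tail`, `confidence`, and `context`.

```python
def predict_region(current_trajectory, context, rules):
    """Return the predicted next region, or None if no rule matches.

    current_trajectory: region numbers visited so far, oldest first.
    rules: list of dicts with keys head (non-empty tuple), tail,
    confidence, and context.
    """
    current = tuple(current_trajectory)
    matches = []
    for rule in rules:
        if rule["context"] != context:
            continue  # context mismatch: skip without comparing the head
        head = rule["head"]
        # Assumed matching criterion: the head equals the latest regions.
        if len(head) <= len(current) and current[-len(head):] == head:
            matches.append(rule)
    if not matches:
        return None
    # Longest head first, then highest confidence.
    matches.sort(key=lambda r: (len(r["head"]), r["confidence"]), reverse=True)
    return matches[0]["tail"]
```

Sorting by head length before confidence means a longer, more specific rule wins over a shorter rule even when the shorter rule has higher confidence.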
5 Experiment
5.1 Experimental Design
To test the proposed method in a real-world setting, we use the data set of Gowalla, a location-based social network on which users check in at different locations. Each record includes the user, time, latitude, longitude, and location ID. More than 6.44 million records from 2009 to October 2010 were collected; we selected the first 1 million records, with user numbers from 0 to 4806, as our data set, which contains more than 4,000 users and 45,000 distinct locations.
In the experiment, the locations and users of the data set represent the spatial crowdsourcing tasks and the locations of workers. Whenever a worker arrives at the designated place and checks in, the crowdsourcing task is considered accepted and completed. Although the data set does not come directly from spatial crowdsourcing, it provides the distributions of workers and tasks. Since the algorithms studied in this paper rely on locations, we use this data set to draw reasonable conclusions about their relative performance.
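Preparing such check-in data for pattern mining can be sketched as follows. The sketch assumes that each record is a tab-separated line with the fields user, time, latitude, longitude, and location ID in that order, with the time written as an ISO 8601 UTC timestamp; this layout and the function names are assumptions, not details given in the paper. Only Saturdays and Sundays are labelled PD, so public holidays are ignored.

```python
from collections import defaultdict
from datetime import datetime


def parse_checkin(line):
    """Parse one check-in line into a dict and attach its time context."""
    user, timestamp, lat, lon, location_id = line.rstrip("\n").split("\t")
    when = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return {
        "user": int(user),
        "time": when,
        "lat": float(lat),
        "lon": float(lon),
        "location": location_id,
        # Saturday (5) and Sunday (6) count as weekend (PD), others as WD.
        "context": "PD" if when.weekday() >= 5 else "WD",
    }


def daily_trajectories(checkins):
    """Group check-ins per user and calendar day, ordered by time.

    Returns a list of (context, [location ids]) pairs.
    """
    days = defaultdict(list)
    for c in checkins:
        days[(c["user"], c["time"].date())].append(c)
    result = []
    for visits in days.values():
        visits.sort(key=lambda c: c["time"])
        result.append((visits[0]["context"], [c["location"] for c in visits]))
    return result
```

In a full pipeline, the location IDs in each daily trajectory would be replaced by the region numbers obtained from clustering before patterns are mined.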
5.2 Experimental Result
As shown in Fig. 2, after the data are divided into workdays and weekends, the success rate on workdays is significantly higher than when workdays are not distinguished. For weekends, our investigation and analysis point to two factors. First, a worker has weekend trajectories on only two days of the week, so the data volume is small. Second, workers have more choices of destination on weekends, although the impact of this factor is small. The main reason is therefore insufficient data. Distinguishing workdays not only improves the success rate but also reduces the time cost of the algorithm: because the context labels tell us directly whether a rule applies to workdays, less time is spent scanning rules, and regions can be predicted and tasks assigned efficiently.
5.3 Experimental Evaluation
Next, we compare the WMP-method of this paper with the UMP-method proposed by Yavas et al. [13] in terms of the accuracy of the region numbers predicted on the test set, as shown in Fig. 3.
The accuracy is defined as follows:
\(Accuracy\#k\) represents the accuracy on condition that there are k-clusters; \(hit\#k\) represents the number of data items predicted successfully; |Total| represents the total number of the sign-in data items by k-clusters.
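The accuracy formula itself is not reproduced in this text. From the symbol definitions, it is presumably the ratio \(Accuracy\#k = hit\#k / |Total|\); the short sketch below computes that ratio under this assumption. The function name and the convention that `None` marks a missing prediction are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of test items whose predicted region equals the actual one.

    predicted: predicted region numbers, with None where no rule matched.
    actual: the regions the workers really checked in at.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0  # avoid dividing by zero on an empty test set
    # A missing prediction (None) never counts as a hit.
    hits = sum(p is not None and p == a for p, a in zip(predicted, actual))
    return hits / len(actual)
```

Counting unmatched items as misses keeps the denominator equal to the total number of test check-ins, as in the definition of |Total|.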
It is clear that the accuracy of our method is higher than that of the UMP-method in [13]. We consider the sequential order of workers’ historical tracks. The probability of a user going to the first 100 locations is 0.5 greater than that of the following locations, which indicates that there are potential connections between locations and that prediction should not consider only the last region [11]. Taking into account the regions that workers have visited in the past greatly improves the accuracy and further increases the probability of successful task assignment.
6 Conclusion
In mobile crowdsourcing services, it is crucial to predict mobile workers’ trajectories effectively, so that workers are willing to travel to a task’s location and perform the task with as little travel and time cost as possible. We propose a context-sensitive approach for predicting workers’ moving paths in mobile crowdsourcing services. With this approach, when spatial tasks are assigned on a crowdsourcing service platform, a task can be pushed to the workers who will enter its region before the task’s deadline. Our approach helps workers avoid extra time and travel cost in performing spatial tasks; as a result, it is expected to increase the probability that a task is accepted and completed, and ultimately to improve the success rate of task assignment.
References
1. Dang, H., Nguyen, T., To, H.: Maximum complex task assignment: towards tasks correlation in spatial crowdsourcing. In: ACM International Conference Proceeding Series, pp. 77–81 (2013)
2. Deng, D., Shahabi, C., Demiryurek, U.: Maximizing the number of worker’s self-selected tasks in spatial crowdsourcing. In: ACM Sigspatial International Conference on Advances in Geographic Information Systems, pp. 324–333 (2013)
3. Howe, J.: The rise of crowdsourcing. Wired 14(14), 1–5 (2011)
4. Kazemi, L., Shahabi, C.: GeoCrowd: enabling query answering with spatial crowdsourcing. In: International Conference on Advances in Geographic Information Systems, pp. 189–198 (2012)
5. Kazemi, L., Shahabi, C., Chen, L.: GeoTruCrowd: trustworthy query answering with spatial crowdsourcing. In: ACM Sigspatial International Conference on Advances in Geographic Information Systems, pp. 314–323 (2013)
6. Laasonen, K.: Clustering and prediction of mobile user routes from cellular data. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, pp. 569–576. Springer, Heidelberg (2005). https://doi.org/10.1007/11564126_59
7. Li, W., Xia, S.X., Liu, F., Zhang, L.: Hybrid Markov location prediction algorithm based on dynamic social ties. IEICE Trans. Inform. Syst. E98.D(8), 1456–1464 (2015)
8. Meng, W., Fang, R., Li, C., Yu, Q.: Soft network coding design in two-way relay channel. In: Global Communications Conference, pp. 4441–4446 (2013)
9. Pang, L., Zhang, Y., Li, J., Ma, Y., Wang, J.: Power allocation and relay selection for two-way relaying systems by exploiting physical-layer network coding. IEEE Trans. Veh. Technol. 63(6), 2723–2730 (2014)
10. Robards, M.W., Sunehag, P.: Semi-Markov kmeans clustering and activity recognition from body-worn sensors. In: IEEE International Conference on Data Mining, pp. 438–446 (2009)
11. Wang, W., Yin, H., Sadiq, S., Chen, L., Xie, M., Zhou, X.: SPORE: a sequential personalized spatial item recommender system. In: IEEE International Conference on Data Engineering, pp. 954–965 (2016)
12. Yang, Y., Wang, Z., Zhang, Q., Yang, Y.: A time based Markov model for automatic position-dependent services in smart home. In: Chinese Control and Decision Conference, pp. 2771–2776 (2010)
13. Yavaş, G., Katsaros, D., Ulusoy, Ö., Manolopoulos, Y.: A data mining approach for location prediction in mobile environments. Data Knowl. Eng. 54(2), 121–146 (2005)
14. Li, Y., Yiu, M.L., Xu, W.: Oriented online route recommendation for spatial crowdsourcing task workers. In: Claramunt, C., et al. (eds.) SSTD 2015. LNCS, vol. 9239, pp. 137–156. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-22363-6_8
Acknowledgements
This work is supported by the Natural Science Foundation of Shandong Province under Grants No. ZR2018MF014 and No. ZR2017MF065.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Jiang, Y., He, W., Cui, L., Yang, Q. (2018). User Location Prediction in Mobile Crowdsourcing Services. In: Pahl, C., Vukovic, M., Yin, J., Yu, Q. (eds) Service-Oriented Computing. ICSOC 2018. Lecture Notes in Computer Science(), vol 11236. Springer, Cham. https://doi.org/10.1007/978-3-030-03596-9_37
DOI: https://doi.org/10.1007/978-3-030-03596-9_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03595-2
Online ISBN: 978-3-030-03596-9
eBook Packages: Computer Science, Computer Science (R0)