Generation of Synthetic Trajectory Microdata from Language Models

  • Conference paper
Privacy in Statistical Databases (PSD 2022)

Abstract

Releasing and sharing mobility data, and specifically trajectories, is necessary for many applications, from infrastructure planning to epidemiology. Yet, trajectories are highly sensitive data, because the points visited by an individual can be identifying and also confidential. Hence, trajectories must be anonymized before releasing or sharing them. While most contributions to the trajectory anonymization literature take statistical approaches, deep learning is increasingly being used. We observe that natural language sentences and trajectories share a sequential nature that can be exploited in similar ways. In this paper, we present preliminary work on generating synthetic trajectories using machine learning models typically used for natural language processing. Our empirical results attest to the quality of the generated synthetic trajectories. Furthermore, our methods allow discovering natural neighborhoods based on trajectories.
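The analogy between sentences and trajectories can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name the models used, so a simple bigram (first-order Markov) language model stands in for them here. The grid discretization, the `cell_size` parameter, and the function names `to_tokens`, `fit_bigram`, and `sample` are all assumptions introduced for illustration. Each trajectory is turned into a "sentence" of grid-cell tokens, next-token counts are learned, and a synthetic trajectory is sampled token by token.

```python
import math
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # sentence boundary tokens (illustrative choice)


def to_tokens(points, cell_size=0.01):
    """Map (lat, lon) points to grid-cell tokens, dropping consecutive repeats."""
    tokens = []
    for lat, lon in points:
        token = f"{math.floor(lat / cell_size)}_{math.floor(lon / cell_size)}"
        if not tokens or tokens[-1] != token:
            tokens.append(token)
    return tokens


def fit_bigram(sentences):
    """Count next-token frequencies, as a bigram language model would."""
    counts = defaultdict(Counter)
    for sent in sentences:
        padded = [START] + sent + [END]
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev][nxt] += 1
    return counts


def sample(counts, rng, max_len=50):
    """Sample one synthetic token sequence from the learned counts."""
    out, prev = [], START
    while len(out) < max_len:
        choices = counts[prev]
        nxt = rng.choices(list(choices), weights=list(choices.values()))[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return out


# Toy input: two made-up trajectories on a 1-degree grid (not real data).
raw = [
    [(0.5, 0.5), (0.6, 0.7), (1.5, 0.5), (2.5, 0.5)],
    [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5)],
]
sentences = [to_tokens(t, cell_size=1.0) for t in raw]
model = fit_bigram(sentences)
synthetic = sample(model, random.Random(0))
print(synthetic)
```

A bigram model only reproduces transitions it has seen, so it mainly shows the data flow from points to tokens to samples. A recurrent or other neural language model could replace `fit_bigram` and `sample` without changing the tokenization step.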

Acknowledgements

This research was funded by the European Commission (projects H2020-871042 “SoBigData++” and H2020-101006879 “MobiDataLab”), and the Government of Catalonia (ICREA Acadèmia Prize to J. Domingo-Ferrer, FI grant to N. Jebreel). The authors are with the UNESCO Chair in Data Privacy, but the views in this paper are their own and are not necessarily shared by UNESCO.

Author information

Corresponding author

Correspondence to Alberto Blanco-Justicia.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Blanco-Justicia, A., Jebreel, N.M., Manjón, J.A., Domingo-Ferrer, J. (2022). Generation of Synthetic Trajectory Microdata from Language Models. In: Domingo-Ferrer, J., Laurent, M. (eds) Privacy in Statistical Databases. PSD 2022. Lecture Notes in Computer Science, vol 13463. Springer, Cham. https://doi.org/10.1007/978-3-031-13945-1_13

  • DOI: https://doi.org/10.1007/978-3-031-13945-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13944-4

  • Online ISBN: 978-3-031-13945-1

  • eBook Packages: Computer Science, Computer Science (R0)
