Estimating the Biasing Effect of Behavioural Patterns on Mobile Fitness App Data by Density-Based Clustering

  • Conference paper
Geospatial Data in a Changing World

Abstract

Crowd-sourced data of high spatial and temporal resolution can provide a new basis for mobility analyses, provided that the various biases that distort the results are identified and adequately handled. In this paper, we examine trajectory patterns that can affect the validity of mobile fitness app data, using cycling trajectories (n = 50,524) from the Helsinki Metropolitan Area, Finland. In addition to mass events and group journeys, we evaluate the biasing effect of routes that the same application user has recorded repeatedly. The results suggest that repeatedly recorded commuting routes may skew fitness application data more than group patterns do. Many of the changes in the frequencies and length distributions at different temporal granularities, measured before and after extracting the ‘bias patterns’, were statistically significant. In addition, the skewed distribution of tracks among users (i.e. contribution inequality) became more even. The biases induced by behavioural patterns ought to be considered both when evaluating the validity of fitness app data in analyses of general mobility behaviour and when designing value-added applications based on the data. Given the trade-off between privacy and data accuracy in the dissemination of sensitive crowd-sourced movement data, the findings emphasise the importance of preserving the ability to detect individual-level phenomena in order to produce valid analysis results.
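To make the idea of extracting repeatedly recorded routes concrete, the following is a minimal pure-Python sketch of density-based clustering applied to one user's tracks at a time. It is an illustration only and not the pipeline used in the paper: the distance measure (`route_distance`), the thresholds (`eps`, `min_samples`), the policy of keeping one representative track per cluster, the use of the Gini coefficient as the inequality measure, and the toy data are all assumptions.

```python
from math import hypot


def route_distance(a, b):
    """Mean distance (m) between corresponding points of two tracks.

    Hypothetical measure: assumes both tracks are resampled to the
    same number of points in a projected (metric) coordinate system.
    """
    return sum(hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(a, b)) / len(a)


def dbscan(items, eps, min_samples, dist):
    """Minimal DBSCAN. Returns one label per item; -1 marks noise."""
    n = len(items)
    # Each neighbourhood includes the item itself (distance 0).
    neighbours = [
        [j for j in range(n) if dist(items[i], items[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # core point: expand cluster
    return labels


def drop_repeats(tracks_by_user, eps=50.0, min_samples=3):
    """Keep noise tracks plus one representative per cluster (assumed policy)."""
    kept = {}
    for user, tracks in tracks_by_user.items():
        labels = dbscan(tracks, eps, min_samples, route_distance)
        seen, kept[user] = set(), []
        for track, label in zip(tracks, labels):
            if label == -1 or label not in seen:
                kept[user].append(track)
            if label != -1:
                seen.add(label)
    return kept


def gini(counts):
    """Gini coefficient of per-user track counts (0 = perfectly even)."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    return sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs)) / (n * total)


def shifted(dy):
    """Toy straight-line track, offset dy metres to the north."""
    return [(0.0, dy), (100.0, dy), (200.0, dy)]


# Toy data: user "a" records one commute four times plus one other ride.
tracks_by_user = {
    "a": [shifted(0), shifted(5), shifted(10), shifted(3), shifted(1000)],
    "b": [shifted(400)],
}
kept = drop_repeats(tracks_by_user)
before = gini([len(t) for t in tracks_by_user.values()])
after = gini([len(t) for t in kept.values()])
print(before, after)
```

In this toy case the four near-identical commutes collapse to a single representative, so the per-user track counts become more even and the Gini coefficient drops. Note that the clustering runs per user, which is only possible when tracks can still be linked to an individual contributor.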

Acknowledgments

We gratefully thank Sports Tracking Technologies Ltd. (currently Amer Sports Digital Services Ltd.) for providing us with the workout tracking data. This work was carried out as part of the projects MyGeoTrust and SUPRA (Revolution of Location-Based Services: Embedded data refinement in Service Processes from Massive Geospatial Datasets), funded by Tekes, the Finnish Funding Agency for Technology and Innovation (grants 40302/14 and 40261/12).

Author information

Corresponding author

Correspondence to Cecilia Bergman.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Bergman, C., Oksanen, J. (2016). Estimating the Biasing Effect of Behavioural Patterns on Mobile Fitness App Data by Density-Based Clustering. In: Sarjakoski, T., Santos, M., Sarjakoski, L. (eds) Geospatial Data in a Changing World. Lecture Notes in Geoinformation and Cartography. Springer, Cham. https://doi.org/10.1007/978-3-319-33783-8_12
