Abstract
Longitudinal linked data are RDF descriptions of observations from related sampling frames or sensors at multiple points in time, e.g., patient medical records or climate sensor data. Observations are expressed as measurements whose values can be repeated several times in a sampling frame, resulting in a considerable increase in data volume. We devise a factorized, compact representation of longitudinal linked data that reduces repetition of the same measurements, and propose algorithms to generate collections of factorized longitudinal linked data that can be managed by existing RDF triple stores. We empirically study the effectiveness of the proposed factorized representation on linked observation data. We show that it reduces the total data volume by more than 30% on average without loss of information, and that it improves the compression ratio of state-of-the-art compression techniques.
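The idea of factorizing repeated measurements can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each observation carries a property, a value, and a unit, and it uses a hypothetical `ex:` vocabulary (`ex:property`, `ex:value`, `ex:unit`, `ex:hasMeasurement`) rather than the paper's ontology. Observations that share the same measurement are linked to a single shared measurement node instead of repeating its description.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    obs_id: str
    prop: str
    value: float
    unit: str


def plain_triples(observations):
    """Describe every observation in full: three triples per observation."""
    triples = []
    for o in observations:
        triples.append((o.obs_id, "ex:property", o.prop))
        triples.append((o.obs_id, "ex:value", str(o.value)))
        triples.append((o.obs_id, "ex:unit", o.unit))
    return triples


def factorized_triples(observations):
    """Describe each distinct measurement once and link observations to it."""
    measurement_ids = {}  # (prop, value, unit) -> hypothetical measurement node
    triples = []
    for o in observations:
        key = (o.prop, o.value, o.unit)
        if key not in measurement_ids:
            m_id = f"ex:m{len(measurement_ids)}"
            measurement_ids[key] = m_id
            triples.append((m_id, "ex:property", o.prop))
            triples.append((m_id, "ex:value", str(o.value)))
            triples.append((m_id, "ex:unit", o.unit))
        triples.append((o.obs_id, "ex:hasMeasurement", measurement_ids[key]))
    return triples


def expand(triples):
    """Rebuild the plain triples from the factorized ones (no information lost)."""
    attrs, links = {}, []
    for s, p, o in triples:
        if p == "ex:hasMeasurement":
            links.append((s, o))
        else:
            attrs.setdefault(s, []).append((p, o))
    return [(obs, p, o) for obs, m in links for p, o in attrs[m]]


# Illustrative data only: three observations repeat the same measurement.
observations = [
    Observation("ex:obs1", "temperature", 21.5, "celsius"),
    Observation("ex:obs2", "temperature", 21.5, "celsius"),
    Observation("ex:obs3", "temperature", 21.5, "celsius"),
    Observation("ex:obs4", "humidity", 40.0, "percent"),
]
plain = plain_triples(observations)
compact = factorized_triples(observations)
```

In this toy input the plain form needs three triples per observation, while the factorized form needs one link per observation plus three triples per distinct measurement, so the saving grows with the amount of repetition. When few measurements repeat, the extra link triples can outweigh the gain.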
Notes
- 1.
Prefixes are used as in https://www.w3.org/wiki/SRBench; additionally, we consider the prefix ld: <http://longitudinallinkeddata.com/ontology#>.
- 2.
Available at: http://wiki.knoesis.org/index.php/LinkedSensorData.
- 3.
Acknowledgements
This work is supported in part by the EU H2020 programme for the project BigDataEurope (GA 644564). Farah Karim is supported by a scholarship from the German Academic Exchange Service (DAAD).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Karim, F., Vidal, ME., Auer, S. (2016). Factorization Techniques for Longitudinal Linked Data (Short Paper). In: Debruyne, C., et al. On the Move to Meaningful Internet Systems: OTM 2016 Conferences. OTM 2016. Lecture Notes in Computer Science, vol 10033. Springer, Cham. https://doi.org/10.1007/978-3-319-48472-3_42
DOI: https://doi.org/10.1007/978-3-319-48472-3_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48471-6
Online ISBN: 978-3-319-48472-3
eBook Packages: Computer Science, Computer Science (R0)