Theta Architecture: Preserving the Quality of Analytics in Data-Driven Systems

Theodorou, Vasileios; Gerostathopoulos, Ilias; Amini, Sasan; Scandariato, Riccardo; Prehofer, Christian; Staron, Miroslaw

doi:10.1007/978-3-319-67162-8_19

Part of the book series: Communications in Computer and Information Science (CCIS, volume 767)

Included in the following conference series:

European Conference on Advances in Databases and Information Systems

Abstract

With recent advances in Big Data storage and processing, data-driven software systems, i.e., systems that analyze large amounts of data to inform their runtime decisions, have real potential. However, for these decisions to be trustworthy and dependable, one needs to deal with the well-known challenges of the data analysis domain: data scarcity, low quality of the data available for analysis, low veracity of data and of the subsequent analysis results, and data privacy constraints that hinder the analysis. A promising solution is to introduce flexibility in the data analytics part of the system, enabling runtime optimization of the algorithms and data streams based on the combination of veracity, privacy and scarcity, in order to preserve the target level of quality of the data-driven decisions. In this paper, we investigate this solution by providing an adaptive reference architecture and illustrate its applicability with an example from the traffic management domain.
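
The idea of choosing algorithms and data streams at runtime to preserve a target quality level can be illustrated with a minimal sketch. It is not the reference architecture proposed in the paper: the class `AnalyticsOption`, the functions `expected_quality` and `select_option`, the 0-to-1 scoring scale, and the rule that quality is bounded by the weaker of veracity and coverage are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AnalyticsOption:
    """Hypothetical pairing of a data stream with an analysis algorithm."""
    name: str
    veracity: float      # 0..1, higher means more trustworthy data and results
    privacy_cost: float  # 0..1, higher means more privacy-sensitive data is used
    coverage: float      # 0..1, higher means less data scarcity


def expected_quality(option: AnalyticsOption) -> float:
    """Assumed estimate: quality is limited by the weaker of veracity and coverage."""
    return min(option.veracity, option.coverage)


def select_option(
    options: Sequence[AnalyticsOption],
    target_quality: float,
    privacy_budget: float,
) -> Optional[AnalyticsOption]:
    """Pick the least privacy-invasive option that still meets the quality target.

    Returns None when no option satisfies both constraints, which signals
    that the target quality cannot currently be preserved.
    """
    feasible = [
        o for o in options
        if o.privacy_cost <= privacy_budget
        and expected_quality(o) >= target_quality
    ]
    if not feasible:
        return None
    # Prefer lower privacy cost; break ties by higher expected quality.
    return min(feasible, key=lambda o: (o.privacy_cost, -expected_quality(o)))
```

In a traffic management setting, the options might be (again hypothetically) a sparse but reliable roadside sensor stream and a denser but more privacy-sensitive vehicle data stream. Re-running `select_option` whenever the scores change would switch between them as veracity, privacy constraints, or data availability shift.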

Notes

1. https://www.w3.org/TR/vocab-dqv/

Author information

Authors and Affiliations

Intracom SA Telecom Solutions, Athens, Greece
Vasileios Theodorou
Technische Universität München, Munich, Germany
Ilias Gerostathopoulos & Sasan Amini
University of Gothenburg, Gothenburg, Sweden
Riccardo Scandariato & Miroslaw Staron
Fortiss GmbH, Munich, Germany
Christian Prehofer

Corresponding author

Correspondence to Vasileios Theodorou.

Editor information

Editors and Affiliations

Riga Technical University, Riga, Latvia
Mārīte Kirikova
Norwegian University of Science and Technology, Trondheim, Norway
Kjetil Nørvåg
University of Cyprus, Nicosia, Cyprus
George A. Papadopoulos
Free University of Bozen-Bolzano, Bozen-Bolzano, Italy
Johann Gamper
Institute of Computing Science, Poznan University of Technology, Poznan, Poland
Robert Wrembel
Université Lumière Lyon 2, Lyon, France
Jérôme Darmont
University of Bologna, Bologna, Italy
Stefano Rizzi

About this paper

Cite this paper

Theodorou, V., Gerostathopoulos, I., Amini, S., Scandariato, R., Prehofer, C., Staron, M. (2017). Theta Architecture: Preserving the Quality of Analytics in Data-Driven Systems. In: Kirikova, M., et al. New Trends in Databases and Information Systems. ADBIS 2017. Communications in Computer and Information Science, vol 767. Springer, Cham. https://doi.org/10.1007/978-3-319-67162-8_19

DOI: https://doi.org/10.1007/978-3-319-67162-8_19
Published: 09 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67161-1
Online ISBN: 978-3-319-67162-8
eBook Packages: Computer Science, Computer Science (R0)
