Data Warehousing and Exploratory Analysis for Market Monitoring

Geiger, Melanie; Stockinger, Kurt

doi:10.1007/978-3-030-11821-1_18

Chapter
First Online: 14 June 2019

Abstract

With the growing trend of digitalization, many companies plan to use machine learning to improve their business processes or to provide new data-driven services. These companies often collect data from different locations with sometimes conflicting context. However, before machine learning can be applied, heterogeneous datasets often need to be integrated, harmonized, and cleaned. In other words, a data warehouse is often the foundation for subsequent analytics tasks.

In this chapter, we first provide an overview of best practices for building a data warehouse. In particular, we describe the advantages and disadvantages of the major types of data warehouse architectures based on Inmon and Kimball. Afterward, we describe a use case of building an e-commerce application whose users are provided with information about healthy products as well as sustainably produced products. Unlike traditional e-commerce applications, where users need to log into the system and thus leave personalized traces when they search for specific products or even buy them afterward, our application allows users to remain fully anonymous if they do not want to log in. However, analyzing anonymous user interactions is a much harder problem than analyzing named users. The idea is to apply modern data warehousing, big data technologies, and machine learning algorithms to discover patterns in user behavior and to make recommendations for designing new products.

References

Apache Mahout. Retrieved August 24, 2018, from http://mahout.apache.org/
Bernstein, P. A. (1976). Synthesizing third normal form relations from functional dependencies. ACM Transactions on Database Systems, 1(4), 277–298.
Casper, J., & Olukotun, K. (2014). Hardware acceleration of database operations. In Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (pp. 151–160). ACM.
Clifton, B. (2012). Advanced web metrics with Google analytics. Hoboken, NJ: Wiley.
Ehrenmann, M., Pieringer, R., & Stockinger, K. (2012). Is there a cure-all for business analytics? Case studies of exemplary businesses in banking, telecommunications, and retail. Business Intelligence Journal, 17(3). TDWI.
Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In International Conference on Knowledge Discovery and Data Mining, 1996.
Feuerstein, S., & Pribyl, B. (2005). Oracle PL/SQL programming. Newton, MA: O’Reilly Media.
Hultgren, H. (2012). Modeling the agile data warehouse with data vault. Denver, CO: New Hamilton.
Inmon, B. (1992). Building the data warehouse. Hoboken, NJ: Wiley.
Ioannidis, Y. E. (1996). Query optimization. ACM Computing Surveys (CSUR), 28(1), 121–123.
JasperSoft. Retrieved July 21, 2017, from https://www.jaspersoft.com/
Kimball, R. (2002). The data warehouse toolkit. Hoboken, NJ: Wiley.
Larson, P. Å., & Levandoski, J. (2016). Modern main-memory database systems. Proceedings of the VLDB Endowment, 9(13), 1609–1610.
Lawton, G. (2005). LAMP lights enterprise development efforts. Computer, 38(9), 18–20.
MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In Berkeley Symposium on Mathematical Statistics and Probability, University of California Press.
McCallum, A., Nigam, K., & Ungar, L. H. (2000). Efficient clustering of high dimensional data sets with application to reference matching. In SIGKDD International Conference on Knowledge Discovery and Data Mining.
Pentaho. Retrieved July 21, 2017, from http://www.pentaho.com/
Postgres. Retrieved July 21, 2017, from https://www.postgresql.org/
Talend. Retrieved July 21, 2017, from https://www.talend.com/
Wang, W., Zhang, M., Chen, G., Jagadish, H. V., Ooi, B. C., & Tan, K. L. (2016). Database meets deep learning: Challenges and opportunities. ACM SIGMOD Record, 45(2), 17–22.

Acknowledgment

The work was funded by the Swiss Commission for Technology and Innovation (CTI) under grant 16053.2.

Author information

Authors and Affiliations

ZHAW Zurich University of Applied Sciences, Winterthur, Switzerland
Melanie Geiger & Kurt Stockinger

Corresponding author

Correspondence to Kurt Stockinger.

Editor information

Editors and Affiliations

Inst. of Applied Information Technology, ZHAW Zurich University of Applied Sciences, Winterthur, Switzerland
Martin Braschler
Inst. of Applied Information Technology, ZHAW Zurich University of Applied Sciences, Winterthur, Switzerland
Thilo Stadelmann
Inst. of Applied Information Technology, ZHAW Zurich University of Applied Sciences, Winterthur, Switzerland
Kurt Stockinger

About this chapter

Cite this chapter

Geiger, M., Stockinger, K. (2019). Data Warehousing and Exploratory Analysis for Market Monitoring. In: Braschler, M., Stadelmann, T., Stockinger, K. (eds) Applied Data Science. Springer, Cham. https://doi.org/10.1007/978-3-030-11821-1_18

DOI: https://doi.org/10.1007/978-3-030-11821-1_18
Published: 14 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11820-4
Online ISBN: 978-3-030-11821-1
eBook Packages: Computer Science, Computer Science (R0)
