Warehousing and Mining Massive RFID Data Sets

Han, Jiawei; Gonzalez, Hector; Li, Xiaolei; Klabjan, Diego

doi:10.1007/11811305_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4093)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

Radio Frequency Identification (RFID) applications are set to play an essential role in object tracking and supply chain management systems. In the near future, it is expected that every major retailer will use RFID systems to track the movement of products from suppliers to warehouses, store backrooms, and eventually to points of sale. The volume of information generated by such systems can be enormous, as each individual item (a pallet, a case, or an SKU) will leave a trail of data as it moves through different locations. We propose two data models for the management of this data. The first is a path cube that preserves object transition information while allowing multi-dimensional analysis of path-dependent aggregates. The second is a workflow cube that summarizes the major patterns and significant exceptions in the flow of items through the system. The design of our models is based on the following observations: (1) items usually move together in large groups through early stages in the system (e.g., distribution centers) and only in later stages (e.g., stores) do they move in smaller groups; (2) although RFID data is registered at the primitive level, data analysis usually takes place at a higher abstraction level; (3) many items have similar flow patterns, and only a relatively small number of them truly deviate from the general trend; and (4) only non-redundant flow deviations with respect to previously recorded deviations are interesting. These observations facilitate the construction of highly compressed RFID data warehouses and the exploration of such data warehouses by scalable data mining. In this study, we give a general overview of the principles driving the design of our framework. We believe warehousing and mining RFID data presents an interesting application for advanced data mining.
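
As a rough illustration of observations (1) and (2), the following Python sketch collapses raw RFID readings into one location path per item and counts how many items share each path, first at the primitive level and then at a higher abstraction level. It is a minimal sketch under stated assumptions, not the authors' path cube or workflow cube: the reading format `(tag_id, location, timestamp)`, the sample data, the `LEVEL_UP` location hierarchy, and the function names `build_paths` and `path_counts` are all hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical raw readings: (tag_id, location, timestamp).
# A reader may report the same tag at the same location several times.
READINGS = [
    ("t1", "factory", 1), ("t1", "factory", 2), ("t1", "dc_east", 5), ("t1", "store_12", 9),
    ("t2", "factory", 1), ("t2", "dc_east", 5), ("t2", "dc_east", 6), ("t2", "store_12", 9),
    ("t3", "factory", 1), ("t3", "dc_east", 5), ("t3", "store_47", 10),
]

# Hypothetical concept hierarchy mapping primitive locations to a coarser level.
LEVEL_UP = {
    "factory": "factory",
    "dc_east": "distribution_center",
    "store_12": "store",
    "store_47": "store",
}


def build_paths(readings, abstraction=None):
    """Return {tag_id: tuple of locations visited, in time order}.

    Consecutive readings at the same (possibly abstracted) location
    are merged into a single stage of the path.
    """
    by_tag = defaultdict(list)
    for tag, loc, _ts in sorted(readings, key=lambda r: (r[0], r[2])):
        if abstraction is not None:
            loc = abstraction[loc]
        path = by_tag[tag]
        if not path or path[-1] != loc:
            path.append(loc)
    return {tag: tuple(path) for tag, path in by_tag.items()}


def path_counts(paths):
    """Count how many items followed each distinct path."""
    return Counter(paths.values())


primitive = path_counts(build_paths(READINGS))
abstracted = path_counts(build_paths(READINGS, abstraction=LEVEL_UP))

for path, n in primitive.items():
    print("primitive ", " -> ".join(path), n)
for path, n in abstracted.items():
    print("abstracted", " -> ".join(path), n)
```

In this toy data set, the three items share the factory and distribution-center stages and split only at the store level, and moving to the coarser location level merges their paths into a single group. A count per path is the simplest example of a path-dependent aggregate; the models described in the abstract go well beyond this.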

Author information

Authors and Affiliations

University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Jiawei Han, Hector Gonzalez, Xiaolei Li & Diego Klabjan

Editor information

Editors and Affiliations

School of Information Technology and Electronic Engineering, The University of Queensland, Queensland, Australia
Xue Li
University of Alberta, Canada
Osmar R. Zaïane
Northwest Polytechnical University, China
Zhanhuai Li

About this paper

Cite this paper

Han, J., Gonzalez, H., Li, X., Klabjan, D. (2006). Warehousing and Mining Massive RFID Data Sets. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_1

DOI: https://doi.org/10.1007/11811305_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science, Computer Science (R0)