Abstract
Radio Frequency Identification (RFID) applications are set to play an essential role in object tracking and supply chain management systems. In the near future, it is expected that every major retailer will use RFID systems to track the movement of products from suppliers to warehouses, store backrooms, and eventually to points of sale. The volume of information generated by such systems can be enormous, as each individual item (a pallet, a case, or an SKU) will leave a trail of data as it moves through different locations. We propose two data models for the management of this data. The first is a path cube that preserves object transition information while allowing multi-dimensional analysis of path-dependent aggregates. The second is a workflow cube that summarizes the major patterns and significant exceptions in the flow of items through the system. The design of our models is based on the following observations: (1) items usually move together in large groups through early stages in the system (e.g., distribution centers) and only in later stages (e.g., stores) do they move in smaller groups, (2) although RFID data is registered at the primitive level, data analysis usually takes place at a higher abstraction level, (3) many items have similar flow patterns and only a relatively small number of them truly deviate from the general trend, and (4) only flow deviations that are non-redundant with respect to previously recorded deviations are interesting. These observations facilitate the construction of highly compressed RFID data warehouses and the exploration of such data warehouses by scalable data mining. In this study, we give a general overview of the principles driving the design of our framework. We believe that warehousing and mining RFID data presents an interesting application for advanced data mining.
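Observation (1) can be illustrated with a minimal sketch, written here in Python. It is not the paper's implementation: the record layout `(epc, location, time)`, the function names `to_stays` and `group_stays`, and the sample readings are all assumptions made for illustration. The sketch first collapses consecutive readings of an item at one location into a single stay, then stores one record per group of items that share the same location and time interval.

```python
from collections import defaultdict
from itertools import groupby


def to_stays(reads):
    """Collapse raw (epc, location, time) reads into (epc, location, t_in, t_out) stays."""
    stays = []
    # Order by item, then by time, so consecutive reads of one item are adjacent.
    ordered = sorted(reads, key=lambda r: (r[0], r[2]))
    for epc, item_reads in groupby(ordered, key=lambda r: r[0]):
        current = None  # [location, t_in, t_out] of the stay being built
        for _, loc, t in item_reads:
            if current is not None and current[0] == loc:
                current[2] = t  # same location: extend the current stay
            else:
                if current is not None:
                    stays.append((epc, *current))
                current = [loc, t, t]  # new location: start a new stay
        if current is not None:
            stays.append((epc, *current))
    return stays


def group_stays(stays):
    """Store one record per (location, t_in, t_out), listing the items that share it."""
    groups = defaultdict(list)
    for epc, loc, t_in, t_out in stays:
        groups[(loc, t_in, t_out)].append(epc)
    return {key: sorted(epcs) for key, epcs in groups.items()}


# Hypothetical readings: three items travel together through a distribution
# center ("dc") and reach the store at slightly different times.
reads = [
    ("e1", "dc", 1), ("e1", "dc", 2), ("e1", "store", 5),
    ("e2", "dc", 1), ("e2", "dc", 2), ("e2", "store", 6),
    ("e3", "dc", 1), ("e3", "dc", 2), ("e3", "store", 5),
]

stays = to_stays(reads)
grouped = group_stays(stays)
for key, epcs in sorted(grouped.items()):
    print(key, epcs)
```

In this toy input, nine raw readings become six stays and then three grouped records. The saving grows with the size of the groups that move together, which is why the early, bulky stages of a supply chain compress best.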
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Han, J., Gonzalez, H., Li, X., Klabjan, D. (2006). Warehousing and Mining Massive RFID Data Sets. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_1
DOI: https://doi.org/10.1007/11811305_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science, Computer Science (R0)