Abstract
Stream mining has become an active research area because it helps address the problems of mining big data and distributed streams. This is especially relevant to the IoT and ubiquitous computing, where systems interact with people and physical objects in the real world through sensors. Such applications require quantitative guarantees on the precision of approximate answers, together with support for distributed processing of high-volume, high-velocity, and varied streams. Recent work on Top-k synopsis processing over data streams mines all the data between a chosen landmark and the current time. In practice, the landmark and the parameter k are two of the more important factors in obtaining high-quality approximate results. We therefore propose a Proper-Wavelet Function (PWF) algorithm that smooths the approximation in order to reduce the effect of k on the final approximate results. Finally, we demonstrate the effectiveness of the algorithm in achieving high-quality k-nearest-neighbor mining results over a wider range of suitable k values.
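To make the role of k concrete, the following is a minimal sketch of a generic top-k Haar wavelet synopsis over a fixed-size window. It is not the PWF algorithm described in the abstract, whose details are not given here. The function names, the power-of-two window length, and the sample values are all assumptions chosen for illustration.

```python
import math
from collections import deque


def haar_transform(values):
    """Unnormalized Haar decomposition (length must be a power of two).

    Returns [overall average, detail coefficients from coarsest to finest].
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    averages = list(values)
    details = []
    while len(averages) > 1:
        pairs = [(averages[i], averages[i + 1]) for i in range(0, len(averages), 2)]
        level_details = [(a - b) / 2 for a, b in pairs]
        averages = [(a + b) / 2 for a, b in pairs]
        details = level_details + details  # coarser levels go in front
    return averages + details


def haar_inverse(coeffs):
    """Rebuild the values from the output layout of haar_transform."""
    values = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(values)]
        values = [v for avg, d in zip(values, level) for v in (avg + d, avg - d)]
        pos += len(level)
    return values


def top_k_synopsis(coeffs, k):
    """Keep the k coefficients with the largest normalized magnitude, zero the rest."""
    n = len(coeffs)

    def weight(i):
        # Coefficients with wider support affect more values, so scale by sqrt(support).
        support = n if i == 0 else n >> (i.bit_length() - 1)
        return abs(coeffs[i]) * math.sqrt(support)

    keep = set(sorted(range(n), key=weight, reverse=True)[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def window_error(window, k):
    """Root-mean-square error of the top-k approximation of one window."""
    values = list(window)
    approx = haar_inverse(top_k_synopsis(haar_transform(values), k))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(values, approx)) / len(values))


if __name__ == "__main__":
    stream = [2, 2, 0, 2, 3, 5, 4, 4, 6, 1, 0, 3]  # hypothetical sensor readings
    window = deque(maxlen=8)  # keeps only the most recent 8 items
    for reading in stream:
        window.append(reading)
        if len(window) == window.maxlen:
            errors = [round(window_error(window, k), 3) for k in (1, 2, 4, 8)]
            print(list(window), errors)
```

Running the script prints, for each full window, the approximation error at several values of k. The error shrinks as k grows and reaches zero when every coefficient is kept, which shows how sensitive a plain top-k synopsis is to the choice of k.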
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, L., Qu, Z.Y., Zhou, T.H., Yu, X.M., Ryu, K.H. (2014). A PWF Smoothing Algorithm for K-Sensitive Stream Mining Technologies over Sliding Windows. In: Hwang, D., Jung, J.J., Nguyen, NT. (eds) Computational Collective Intelligence. Technologies and Applications. ICCCI 2014. Lecture Notes in Computer Science, vol 8733. Springer, Cham. https://doi.org/10.1007/978-3-319-11289-3_51
DOI: https://doi.org/10.1007/978-3-319-11289-3_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11288-6
Online ISBN: 978-3-319-11289-3
eBook Packages: Computer Science (R0)