An Efficient Approach for Query Processing of Incomplete High Dimensional Data Streams

Najib, Fatma M.; Ismail, Rasha M.; Badr, Nagwa L.; Gharib, Tarek F.

doi:10.1007/978-3-030-69717-4_57

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1339)

Included in the following conference series:

International Conference on Advanced Machine Learning Technologies and Applications

Abstract

Many modern applications, such as sensor networks, generate continuous data streams. Efficient query processing of such streams faces additional constraints: the data are uncertain in nature and must be processed quickly and in a timely manner. Traditional query processing techniques for static data process the whole dataset without partitioning it, an approach that is not applicable to data streams. Data clustering is therefore needed as a preprocessing step for data streams. In this paper, we propose the Incomplete High dimensional Data streams Query processing (IHDQ) algorithm for efficiently answering queries over data streams. The obtained results show that the proposed IHDQ is more efficient in both clustering and query processing than alternative state-of-the-art approaches.
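The general idea of clustering as a preprocessing step for query answering can be illustrated with a small sketch. This is not the IHDQ algorithm, whose details are not given in this abstract; it is a hypothetical example under stated assumptions: missing values are marked with `None`, points are grouped by a simple threshold-based online clustering using a partial distance over observed dimensions, and a one-dimensional range query skips clusters whose value bounds cannot overlap the query. All names (`partial_distance`, `Cluster`, `cluster_stream`, `range_query`, `radius`) are illustrative.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Sequence[Optional[float]]  # None marks a missing value (assumption)


def partial_distance(p: Point, centroid: Point) -> float:
    """Euclidean distance over dimensions observed in both arguments,
    rescaled to the full dimensionality."""
    pairs = [(x, c) for x, c in zip(p, centroid) if x is not None and c is not None]
    if not pairs:
        return math.inf  # nothing comparable
    sq = sum((x - c) ** 2 for x, c in pairs)
    return math.sqrt(sq * len(p) / len(pairs))


class Cluster:
    """Keeps member points plus per-dimension sums, counts, and bounds."""

    def __init__(self, p: Point) -> None:
        dims = len(p)
        self.points: List[Point] = []
        self.sums = [0.0] * dims
        self.counts = [0] * dims
        self.lo = [math.inf] * dims
        self.hi = [-math.inf] * dims
        self.add(p)

    def add(self, p: Point) -> None:
        self.points.append(p)
        for d, x in enumerate(p):
            if x is not None:
                self.sums[d] += x
                self.counts[d] += 1
                self.lo[d] = min(self.lo[d], x)
                self.hi[d] = max(self.hi[d], x)

    def centroid(self) -> List[Optional[float]]:
        # Mean of observed values per dimension; None if never observed.
        return [s / n if n else None for s, n in zip(self.sums, self.counts)]


def cluster_stream(stream: Sequence[Point], radius: float) -> List[Cluster]:
    """Assign each arriving point to the nearest cluster within `radius`,
    otherwise open a new cluster (single pass over the stream)."""
    clusters: List[Cluster] = []
    for p in stream:
        best, best_dist = None, math.inf
        for c in clusters:
            dist = partial_distance(p, c.centroid())
            if dist < best_dist:
                best, best_dist = c, dist
        if best is not None and best_dist <= radius:
            best.add(p)
        else:
            clusters.append(Cluster(p))
    return clusters


def range_query(clusters: Sequence[Cluster], dim: int,
                lo: float, hi: float) -> Tuple[List[Point], int]:
    """Return points with lo <= p[dim] <= hi and the number of points scanned.
    Clusters whose bounds on `dim` miss the range are skipped entirely."""
    hits: List[Point] = []
    scanned = 0
    for c in clusters:
        if c.hi[dim] < lo or c.lo[dim] > hi:
            continue  # pruned: no member can match
        for p in c.points:
            scanned += 1
            x = p[dim]
            if x is not None and lo <= x <= hi:
                hits.append(p)
    return hits, scanned


if __name__ == "__main__":
    stream = [(1.0, 1.1, None), (0.9, None, 1.0), (10.0, 10.2, 9.9),
              (None, 10.0, 10.1), (1.1, 0.9, 1.0)]
    clusters = cluster_stream(stream, radius=2.0)
    hits, scanned = range_query(clusters, dim=0, lo=0.0, hi=2.0)
    print(len(clusters), len(hits), scanned)
```

In this sketch a point with a missing value on the queried dimension never matches the range; other policies, such as imputing the value from the cluster centroid, are equally plausible and would change the result set.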

Author information

Authors and Affiliations

Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
Fatma M. Najib, Rasha M. Ismail, Nagwa L. Badr & Tarek F. Gharib

Corresponding author

Correspondence to Fatma M. Najib.

Editor information

Editors and Affiliations

Information Technology Department, Cairo University, Computer and Information Faculty, Giza, Egypt
Aboul-Ella Hassanien
College of Information Science and Engineering, Fujian University of Technology, Fuzhou, Fujian, China
Kuo-Chi Chang
International Center for Informatics Research, Beijing Jiaotong University, Beijing, China
Tang Mincong

About this paper

Cite this paper

Najib, F.M., Ismail, R.M., Badr, N.L., Gharib, T.F. (2021). An Efficient Approach for Query Processing of Incomplete High Dimensional Data Streams. In: Hassanien, AE., Chang, KC., Mincong, T. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2021. Advances in Intelligent Systems and Computing, vol 1339. Springer, Cham. https://doi.org/10.1007/978-3-030-69717-4_57

DOI: https://doi.org/10.1007/978-3-030-69717-4_57
Published: 05 March 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69716-7
Online ISBN: 978-3-030-69717-4
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)