Abstract
Multidimensional data appear in a wide range of applications, from state-of-the-art science and technology to specific social issues, and they are commonly analyzed with methods such as regression analysis and machine learning. However, such data are rarely complete; they usually contain some degree of bias and deficiency. In this study, we form a network from a multidimensional dataset and use its degree distribution to detect data sparsity. Although degree distributions have been used in model analysis for many years, sparsity detection has not previously been a target of degree distribution analysis. Furthermore, we attempt to increase the accuracy and precision of supervised learning by applying regressive weighting according to node grouping in the degree distribution spectrum. With this algorithm, together with other promising advances in complex networks, the range of uses for incomplete data can be expanded.
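The idea of reading sparsity from a degree distribution can be illustrated with a minimal sketch. It is not the authors' algorithm, since the abstract does not specify how the network is built or how the weighting is computed. The sketch assumes that two samples are linked when their Euclidean distance is within a chosen radius, that low-degree nodes mark sparse regions, and that a simple inverse-degree weight stands in for the regressive weighting mentioned above. All function names, the radius, and the degree threshold are hypothetical.

```python
import math
from collections import Counter


def build_threshold_network(points, radius):
    """Link two samples when their Euclidean distance is at most `radius`.

    Assumption: a distance-threshold graph; the paper may build its network differently.
    """
    n = len(points)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def degree_distribution(neighbors):
    """Count how many nodes have each degree."""
    return Counter(len(nb) for nb in neighbors.values())


def sparse_nodes(neighbors, max_degree):
    """Return indices of nodes whose degree is at most `max_degree` (hypothetical cutoff)."""
    return sorted(i for i, nb in neighbors.items() if len(nb) <= max_degree)


def inverse_degree_weights(neighbors):
    """Give low-degree samples larger weights, normalized to a mean of 1.

    Assumption: inverse-degree weighting is an illustrative stand-in only.
    """
    raw = [1.0 / (1 + len(neighbors[i])) for i in range(len(neighbors))]
    total = sum(raw)
    return [w * len(raw) / total for w in raw]


# Toy data: a dense cluster of four samples plus one isolated sample.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
network = build_threshold_network(points, radius=0.5)
distribution = degree_distribution(network)
isolated = sparse_nodes(network, max_degree=0)
weights = inverse_degree_weights(network)
```

In this toy case the isolated sample is the only node of degree zero, so it is flagged as lying in a sparse region and receives the largest weight. Such weights could then be passed to any supervised learner that accepts per-sample weights.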
Acknowledgements
The authors thank the members of in Checkers Co., Ltd., in particular Dr. K. Taguchi, for their useful comments. This work was partially supported by the Regional ICT Research Center of Human, Industry and Future at The University of Shiga Prefecture, by the Cabinet Office, Government of Japan, and by a Grant-in-Aid for Scientific Research from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT/JSPS KAKENHI), Grant No. 22K18704.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ueno, S., Sakai, O. (2023). Detection of Sparsity in Multidimensional Data Using Network Degree Distribution and Improved Supervised Learning with Correction of Data Weighting. In: Cherifi, H., Mantegna, R.N., Rocha, L.M., Cherifi, C., Miccichè, S. (eds) Complex Networks and Their Applications XI. COMPLEX NETWORKS 2022. Studies in Computational Intelligence, vol 1077. Springer, Cham. https://doi.org/10.1007/978-3-031-21127-0_32
DOI: https://doi.org/10.1007/978-3-031-21127-0_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21126-3
Online ISBN: 978-3-031-21127-0
eBook Packages: Engineering, Engineering (R0)