Abstract
Dimension reduction provides a powerful means of reducing the number of random variables under consideration. However, large datasets often contain many similar tuples. Removing some of these similar tuples before reducing the dimension of the dataset retains its main information while accelerating the dimension reduction. Accordingly, we propose a dimension reduction technique based on biased sampling, a new procedure that combines features of dimension reduction and biased sampling to obtain a computationally efficient means of reducing the number of random variables under consideration. In this paper, we choose Principal Component Analysis (PCA) as the main dimension reduction algorithm to study, and we show how this approach works.
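As an illustration only, the idea of thinning similar tuples before running PCA might be sketched as follows. The abstract does not specify the biasing rule, so the grid-cell rule here (keep one randomly chosen tuple per cell) is an assumption, as are the names `biased_sample`, `pca`, `reduce_with_sampling`, and the parameter `cell_width`; none of them come from the paper.

```python
import numpy as np


def biased_sample(X, cell_width=0.5, seed=0):
    """Keep one randomly chosen tuple per grid cell.

    Assumption: tuples falling into the same cell of width `cell_width`
    count as "similar". This is a stand-in rule, not the paper's.
    """
    rng = np.random.default_rng(seed)
    cells = np.floor(X / cell_width).astype(np.int64)
    # Map every tuple to the id of the grid cell it falls into.
    _, cell_id = np.unique(cells, axis=0, return_inverse=True)
    cell_id = np.asarray(cell_id).reshape(-1)
    # Shuffle the tuples, then keep the first one seen in each cell.
    order = rng.permutation(len(X))
    _, first = np.unique(cell_id[order], return_index=True)
    return X[np.sort(order[first])]


def pca(X, n_components):
    """Return the mean and the top principal directions of X (via SVD)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def reduce_with_sampling(X, n_components, cell_width=0.5, seed=0):
    """Fit PCA on the thinned sample, then project the full dataset."""
    sample = biased_sample(X, cell_width, seed)
    mean, components = pca(sample, n_components)
    return (X - mean) @ components.T, len(sample)
```

In this sketch the expensive step (the SVD) runs only on the retained tuples, while the projection is still applied to every tuple. How well the sampled components match those of the full dataset depends on the biasing rule and on `cell_width`, which the sketch does not evaluate.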
This work was supported by the National Key Research and Development Program of China (2020YFB1006104), the Opening Project of Intelligent Policing Key Laboratory of Sichuan Province (ZNJW2023KFZD004), Sichuan Police College (CJKY202001), and an NSFC grant (62232005).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, Z., Yang, D., Li, M., Guo, H., Ye, T., Wang, H. (2023). Dimension Reduction Based on Sampling. In: Yu, Z., et al. Data Science. ICPCSEE 2023. Communications in Computer and Information Science, vol 1879. Springer, Singapore. https://doi.org/10.1007/978-981-99-5968-6_15
DOI: https://doi.org/10.1007/978-981-99-5968-6_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-5967-9
Online ISBN: 978-981-99-5968-6
eBook Packages: Computer Science, Computer Science (R0)