A Novel Chinese Points of Interest Classification Method Based on Weighted Quadratic Surface Support Vector Machine

Luo, An; Yan, Xin; Luo, Jian

doi:10.1007/s11063-021-10725-1


Published: 10 January 2022

Volume 54, pages 2181–2200, (2022)

Neural Processing Letters


Abstract

Points of interest (POIs) are geographic entities or specific locations that a considerable group of people find useful or interesting. They underpin location-based applications such as navigation systems and recommendation systems, which in turn rely on accurate POI classification. In this paper, a novel classification method based on a weighted quadratic surface support vector machine (WQSSVM) is proposed to classify Chinese POIs from different websites. We first use a large number of Chinese POIs to build sparse feature vectors. Then, a weight function is designed to calculate the relative importance of each sample, which serves as an input to the WQSSVM model. Finally, the proposed WQSSVM model is trained on a small proportion of high-quality samples to obtain a suitable classifier, which then classifies the remaining large portion of POIs automatically. The WQSSVM model avoids the disadvantages induced by the kernel functions used in classic kernel-based support vector machine models. Numerical results on thirteen real-life Chinese POI datasets indicate that the WQSSVM model not only outperforms the QSSVM model, owing to the designed weight function, but also outperforms other state-of-the-art text classification models in terms of classification accuracy.
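The abstract does not give the WQSSVM objective or the weight function, so the following is only a minimal sketch of the general idea, under stated assumptions: a quadratic surface f(x) = 0.5 xᵀWx + bᵀx + c separates two classes directly in the input space (no kernel), and each sample's hinge loss is scaled by a per-sample weight. The function names, the weights, and the penalty parameter C are all hypothetical, and the regularization term that a full model would include is omitted.

```python
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def quadratic_decision(W: Matrix, b: Vector, c: float, x: Vector) -> float:
    """Evaluate f(x) = 0.5 * x^T W x + b^T x + c (W assumed symmetric)."""
    n = len(x)
    quad = sum(x[i] * W[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    return 0.5 * quad + lin + c


def weighted_hinge_penalty(
    W: Matrix,
    b: Vector,
    c: float,
    X: Sequence[Vector],
    y: Sequence[int],
    weights: Sequence[float],
    C: float = 1.0,
) -> float:
    """Return C * sum_i s_i * max(0, 1 - y_i * f(x_i)).

    The weights s_i are illustrative placeholders for "relative importance";
    the paper's actual weight function is not specified in the abstract.
    """
    total = 0.0
    for x_i, y_i, s_i in zip(X, y, weights):
        margin = y_i * quadratic_decision(W, b, c, x_i)
        total += s_i * max(0.0, 1.0 - margin)
    return C * total


# Toy example: the surface x1^2 + x2^2 - 1 = 0 (a unit circle).
W = [[2.0, 0.0], [0.0, 2.0]]
b = [0.0, 0.0]
c = -1.0
X = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0]]  # inside, outside, on the circle
y = [-1, 1, 1]
weights = [1.0, 1.0, 0.25]  # hypothetical: the third sample is trusted less

penalty = weighted_hinge_penalty(W, b, c, X, y, weights)
```

In this toy setup only the third sample violates the margin, and its low weight shrinks its contribution to the penalty, which is the intuition behind down-weighting less reliable samples.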



Author information

Authors and Affiliations

Chinese Academy of Surveying and Mapping, Beijing, 100360, China
An Luo
School of Statistics and Information, Shanghai University of International Business and Economics, Shanghai, 201620, China
Xin Yan
School of Management, Hainan University, Haikou, 570228, China
Jian Luo

Corresponding author

Correspondence to Xin Yan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was supported by the National Natural Science Foundation of China (No. 71901140) and the Humanities and Social Science Fund of the Ministry of Education of China (No. 18YJC630220).


About this article

Cite this article

Luo, A., Yan, X. & Luo, J. A Novel Chinese Points of Interest Classification Method Based on Weighted Quadratic Surface Support Vector Machine. Neural Process Lett 54, 2181–2200 (2022). https://doi.org/10.1007/s11063-021-10725-1


Accepted: 18 December 2021
Published: 10 January 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s11063-021-10725-1
