Using K-Means Algorithm for Description Analysis of Text in RSS News Format

Ariza-Colpas, Paola; Oviedo-Carrascal, Ana Isabel; De-la-hoz-Franco, Emiro

doi:10.1007/978-981-32-9563-6_17

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1071)

Included in the following conference series:

International Conference on Data Mining and Big Data

Abstract

This article presents several techniques for extracting information through text mining. Applying them to a dataset makes it possible to compare the performance of each technique and to recommend the most appropriate one for processing this type of data. The article then describes an implementation of the K-means algorithm that determines the location of news items described in RSS format, and reports the results of this grouping through a descriptive analysis of the resulting clusters.
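
As a rough illustration of the kind of grouping the abstract describes, the following minimal sketch clusters the description fields of RSS items with K-means. It is not the authors' implementation: the RSS snippet, the TF-IDF weighting, the farthest-point initialization, and all function names (`descriptions`, `tfidf_vectors`, `kmeans`) are assumptions made for this example, and only the Python standard library is used.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical RSS feed; the items are invented for illustration only.
RSS = """<rss version="2.0"><channel>
<item><description>Flooding closes roads in Barranquilla after heavy rain</description></item>
<item><description>Heavy rain and flooding hit Barranquilla neighborhoods</description></item>
<item><description>Medellin metro opens new station for commuters</description></item>
<item><description>New metro station eases travel for Medellin commuters</description></item>
</channel></rss>"""


def descriptions(rss_text):
    """Return the text of every <description> element found inside an <item>."""
    root = ET.fromstring(rss_text)
    return [item.findtext("description", default="") for item in root.iter("item")]


def tfidf_vectors(docs):
    """Turn documents into L2-normalized TF-IDF vectors over a shared vocabulary."""
    tokens = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    n = len(docs)
    df = Counter(t for doc in tokens for t in set(doc))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in tokens:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point that is farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


docs = descriptions(RSS)
labels = kmeans(tfidf_vectors(docs), k=2)
for label, text in zip(labels, docs):
    print(label, text)
```

In this toy feed the two Barranquilla items land in one cluster and the two Medellin items in the other, because items about the same place share most of their vocabulary. A real feed would likely need stop-word removal and a principled choice of k, neither of which the abstract specifies.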

Author information

Authors and Affiliations

Universidad de La Costa, CUC, Barranquilla, Colombia
Paola Ariza-Colpas & Emiro De-la-hoz-Franco
Universidad Pontificia Bolivariana, Medellín, Colombia
Paola Ariza-Colpas & Ana Isabel Oviedo-Carrascal

Corresponding author

Correspondence to Paola Ariza-Colpas.

Editor information

Editors and Affiliations

Department of Machine Intelligence, Peking University, Beijing, China
Ying Tan
Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China
Yuhui Shi

About this paper

Cite this paper

Ariza-Colpas, P., Oviedo-Carrascal, A.I., De-la-hoz-Franco, E. (2019). Using K-Means Algorithm for Description Analysis of Text in RSS News Format. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2019. Communications in Computer and Information Science, vol 1071. Springer, Singapore. https://doi.org/10.1007/978-981-32-9563-6_17

DOI: https://doi.org/10.1007/978-981-32-9563-6_17
Published: 26 July 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9562-9
Online ISBN: 978-981-32-9563-6
eBook Packages: Computer Science, Computer Science (R0)