Combining Machine Learning and Knowledge-Based Systems for Summarizing Interviews

Garrido, Angel Luis; Cardiel, Oscar; Aleyxendri, Andrea; Quilez, Ruben

doi:10.1007/978-3-319-60438-1_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10352)

Included in the following conference series:

International Symposium on Methodologies for Intelligent Systems

Abstract

Achieving optimal results in an automatic summarization process frequently depends on knowledge of the domain. The performance of general methods is always lower than what can be achieved by introducing custom modifications that take the context into account. Nevertheless, this type of custom adjustment demands hard work from experts and developers, which is not always feasible because of the high costs. In this work we aim to leverage the features of the documents in order to classify them using machine learning methods. Once the document type is identified, improvements are applied by a knowledge-based system that allows users to easily customize both the summarization process and the presentation to the final user. The proposed method has been applied with promising results to interviews in a real environment at a major Spanish media group.

Notes

1. http://www.grupoheraldo.com.
2. http://www.grupoheraldo.com.
3. http://www.heraldo.es.
4. E.g., http://dbpedia.org/page/Category:Spanish-language_surnames.
5. http://swesum.nada.kth.se/index-eng.html.
6. https://www.tools4noobs.com/summarize/.
7. http://autosummarizer.com/.
8. http://textsummarization.net/.
9. ROUGE-L is one of the five evaluation metrics available in ROUGE (a recall-based metric for fixed-length summaries), and it is based on finding the longest common subsequence.
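
The longest-common-subsequence idea behind ROUGE-L in note 9 can be illustrated with a minimal sketch. It computes only the recall side of the measure (shared subsequence length divided by reference length) and assumes lowercased whitespace tokenization. The function names `lcs_length` and `rouge_l_recall` are hypothetical and chosen for illustration; this is not the evaluation code used in the paper, and the official ROUGE toolkit also reports precision and an F-measure.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Return the length of the longest common subsequence of a and b."""
    # prev[j] is the LCS length of the part of `a` processed so far and b[:j].
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                # Matching tokens extend the best subsequence by one.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbours.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(reference: str, candidate: str) -> float:
    """Recall-oriented ROUGE-L: LCS length divided by reference length.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens:
        return 0.0
    return lcs_length(ref_tokens, cand_tokens) / len(ref_tokens)


if __name__ == "__main__":
    reference = "the minister announced a new budget"
    candidate = "the minister presented a budget"
    print(round(rouge_l_recall(reference, candidate), 3))
```

Because the subsequence does not need to be contiguous, a candidate summary is rewarded for keeping the reference words in the same order even when other words are inserted between them.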

Acknowledgments

This research work has been supported by the CICYT projects TIN2013-46238-C4-4-R and TIN2016-78011-C4-3-R (AEI/FEDER, UE), and by DGA/FEDER.

Author information

Authors and Affiliations

IIS Department, University of Zaragoza, Zaragoza, Spain
Angel Luis Garrido
Computer Department, Grupo Heraldo, Zaragoza, Spain
Oscar Cardiel, Andrea Aleyxendri & Ruben Quilez

Corresponding author

Correspondence to Angel Luis Garrido.

Editor information

Editors and Affiliations

Warsaw University of Technology, Warsaw, Poland
Marzena Kryszkiewicz
University of Bari Aldo Moro, Bari, Italy
Annalisa Appice
Institute of Informatics, University of Warsaw, Warsaw, Poland
Dominik Ślęzak
Faculty of Electronics & Information, Warsaw University of Technology, Warsaw, Poland
Henryk Rybinski
Institute of Mathematics, Warsaw University, Warsaw, Poland
Andrzej Skowron
Department of Computer Science, University of North Carolina at Charlotte, North Carolina, USA
Zbigniew W. Raś

About this paper

Cite this paper

Garrido, A.L., Cardiel, O., Aleyxendri, A., Quilez, R. (2017). Combining Machine Learning and Knowledge-Based Systems for Summarizing Interviews. In: Kryszkiewicz, M., Appice, A., Ślęzak, D., Rybinski, H., Skowron, A., Raś, Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham. https://doi.org/10.1007/978-3-319-60438-1_24

DOI: https://doi.org/10.1007/978-3-319-60438-1_24
Published: 14 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60437-4
Online ISBN: 978-3-319-60438-1
eBook Packages: Computer Science; Computer Science (R0)
