Abstract
In this paper, we propose an approach for displaying the results of a search engine query more effectively. Each web page retrieved by the search engine is summarized, and its important content is extracted. The system consists of four stages. First, the hierarchical structure of each document is extracted. Second, the lexical chains in the documents are identified so that coherent summaries can be built. Third, the document structures and lexical chains are used to learn a summarization model. Finally, the summaries are formed and displayed to the user. Experiments on two datasets showed that the method significantly outperforms traditional search engines.
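The four stages can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the method of the paper: the structure extractor only reads h1-h6 and p tags, a "chain" is just a non-stopword repeated in the document (the paper's lexical chains presumably rely on semantic relations, which are not modeled here), and the scoring step uses fixed, hand-set weights in place of a learned model. All names (`SectionParser`, `build_chains`, `score`, `summarize`) are hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on", "with"}


class SectionParser(HTMLParser):
    """Stage 1 (toy): collect [heading, body] sections from h1-h6 and p tags."""

    def __init__(self):
        super().__init__()
        self.sections = []  # list of [heading_text, body_text]
        self._tag = None    # "h" inside a heading, "p" inside a paragraph

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.sections.append(["", ""])
            self._tag = "h"
        elif tag == "p":
            if not self.sections:  # paragraph before any heading
                self.sections.append(["", ""])
            self._tag = "p"

    def handle_endtag(self, tag):
        if re.fullmatch(r"h[1-6]", tag) or tag == "p":
            self._tag = None

    def handle_data(self, data):
        if self._tag == "h":
            self.sections[-1][0] += data
        elif self._tag == "p":
            self.sections[-1][1] += data + " "


def words(text):
    """Lowercased alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_chains(sections):
    """Stage 2 (toy): treat any word occurring at least twice as a chain."""
    counts = Counter(w for _, body in sections for w in words(body))
    return {w: c for w, c in counts.items() if c >= 2}


def score(sentence, heading, chains, weights=(1.0, 2.0)):
    """Stage 3 (stand-in): fixed weights instead of a learned model."""
    toks = words(sentence)
    chain_score = sum(chains.get(w, 0) for w in toks)
    heading_overlap = len(set(toks) & set(words(heading)))
    return weights[0] * chain_score + weights[1] * heading_overlap


def summarize(html, k=2):
    """Stage 4: keep the k best-scoring sentences, in document order."""
    parser = SectionParser()
    parser.feed(html)
    parser.close()
    chains = build_chains(parser.sections)
    candidates = []
    for heading, body in parser.sections:
        for sent in re.split(r"(?<=[.!?])\s+", body.strip()):
            if sent:
                candidates.append((heading, sent))
    best = sorted(
        range(len(candidates)),
        key=lambda i: score(candidates[i][1], candidates[i][0], chains),
        reverse=True,
    )[:k]
    return [candidates[i] for i in sorted(best)]
```

Calling `summarize(page_html, k=2)` returns a list of `(heading, sentence)` pairs, so each selected sentence can be shown under its section heading. In a learned variant, the two hand-set weights would be replaced by parameters fitted on labeled summaries.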
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Güngör, T. (2013). A Machine Learning Approach for Displaying Query Results in Search Engines. In: Wilson, R., Hancock, E., Bors, A., Smith, W. (eds) Computer Analysis of Images and Patterns. CAIP 2013. Lecture Notes in Computer Science, vol 8047. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40261-6_21
DOI: https://doi.org/10.1007/978-3-642-40261-6_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40260-9
Online ISBN: 978-3-642-40261-6
eBook Packages: Computer Science (R0)