research-article

Advertising Keywords Recommendation for Short-Text Web Pages Using Wikipedia

Published: 01 February 2012

Abstract

Advertising keywords recommendation is an indispensable component of online advertising: keywords selected from target Web pages are used for contextual advertising or sponsored search. Several ranking-based algorithms have been proposed for recommending advertising keywords. However, most of them still perform poorly, especially on short-text target Web pages, that is, pages containing too little textual information for ranking. In some cases, short-text Web pages may not even contain enough keywords to select from. A natural alternative is to recommend relevant keywords that are not present in the target Web pages. In this article, we propose a novel algorithm for advertising keywords recommendation for short-text Web pages that leverages the contents of Wikipedia, a user-contributed online encyclopedia. Wikipedia contains numerous entities, and related entities on a topic are linked to each other. Given a target Web page, we propose to use a content-biased PageRank on the Wikipedia graph to rank the related entities. Furthermore, in order to recommend high-quality advertising keywords, we also add an advertisement-biased factor to our model. With these two biases, the recommended advertising keywords are both relevant to the target Web page and valuable for advertising. In our experiments, we compare our approach with several state-of-the-art approaches for keyword recommendation. The experimental results demonstrate that our proposed approach substantially improves the precision of the top 20 recommended keywords on short-text Web pages over existing approaches.
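The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea: a PageRank over an entity link graph whose teleport (restart) distribution is biased toward entities matching the target page and toward entities with advertising value. The function names, the toy graph, the scores, and the `mix` parameter that blends the two biases as a convex combination are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # entity -> entities it links to


def normalize(weights: Dict[str, float], nodes: List[str]) -> Dict[str, float]:
    """Turn non-negative weights into a probability distribution over nodes."""
    total = sum(weights.get(n, 0.0) for n in nodes)
    if total <= 0.0:
        # No signal at all: fall back to a uniform distribution.
        return {n: 1.0 / len(nodes) for n in nodes}
    return {n: weights.get(n, 0.0) / total for n in nodes}


def biased_pagerank(
    graph: Graph,
    content_score: Dict[str, float],
    ad_score: Dict[str, float],
    damping: float = 0.85,
    mix: float = 0.5,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power iteration for PageRank with a biased teleport distribution."""
    nodes = sorted(set(graph) | {t for targets in graph.values() for t in targets})
    content = normalize(content_score, nodes)
    ads = normalize(ad_score, nodes)
    # Assumption: the two biases are blended as a convex combination.
    teleport = {n: mix * content[n] + (1.0 - mix) * ads[n] for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        # Restart mass goes to the biased teleport distribution.
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            out = graph.get(n, [])
            if out:
                share = damping * rank[n] / len(out)
                for t in out:
                    nxt[t] += share
            else:
                # Dangling entity: spread its mass by the teleport distribution.
                for t in nodes:
                    nxt[t] += damping * rank[n] * teleport[t]
        rank = nxt
    return rank


# Hypothetical toy data: a tiny entity graph, one entity matched on the
# target page, and made-up advertising values.
graph = {
    "Digital camera": ["Camera lens", "Photography"],
    "Camera lens": ["Digital camera", "Photography"],
    "Photography": ["Digital camera"],
    "History of art": ["Photography"],
}
content_score = {"Photography": 1.0}
ad_score = {"Digital camera": 3.0, "Camera lens": 2.0}

ranks = biased_pagerank(graph, content_score, ad_score)
top_keywords = sorted(ranks, key=ranks.get, reverse=True)
```

In this sketch an entity that is neither linked to nor favored by either bias ends up with no rank, while entities close to the page's content or with advertising value accumulate it. Setting `mix` to 1.0 reduces the model to a purely content-biased PageRank.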

    Published In

    ACM Transactions on Intelligent Systems and Technology, Volume 3, Issue 2
    February 2012
    455 pages
    ISSN:2157-6904
    EISSN:2157-6912
    DOI:10.1145/2089094
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 February 2012
    Accepted: 01 October 2011
    Revised: 01 September 2011
    Received: 01 April 2011
    Published in TIST Volume 3, Issue 2

    Author Tags

    1. Contextual advertising
    2. Wikipedia
    3. advertising keywords recommendation
    4. topic-sensitive PageRank

    Qualifiers

    • Research-article
    • Research
    • Refereed
