Abstract
A growing body of work has highlighted the important role that Wikipedia's volunteer-created content plays in helping search engines achieve their core goal of addressing the information needs of hundreds of millions of people. In this paper, we report the results of an investigation into the incidence of Wikipedia links in search engine results pages (SERPs). Our results extend prior work by considering three U.S. search engines, simulating both mobile and desktop devices, and using a spatial analysis approach designed to study modern SERPs that are no longer just "ten blue links". We find that Wikipedia links are extremely common in important search contexts, appearing in 67-84% of desktop SERPs for common and trending queries, but less often for medical queries. Furthermore, we observe that Wikipedia links often appear in "Knowledge Panel" SERP elements and in positions visible to users without scrolling, although Wikipedia appears less often and in less prominent positions on mobile devices. Our findings reinforce the complementary notions that (1) Wikipedia content and research have a major impact outside of the Wikipedia domain and (2) powerful technologies like search engines are highly reliant on free content created by volunteers.
Index Terms
- A Deeper Investigation of the Importance of Wikipedia Links to Search Engine Results