DOI: 10.1145/2566486.2568022
research-article

From devices to people: attribution of search activity in multi-user settings

Published: 07 April 2014

Abstract

Online services rely on unique identifiers of machines to tailor offerings to their users. An implicit assumption is made that each machine identifier maps to an individual. However, shared machines are common, leading to interwoven search histories and noisy signals for applications such as personalized search and advertising. We present methods for attributing search activity to individual searchers. Using ground truth data for a sample of almost four million U.S. Web searchers (containing both machine identifiers and person identifiers), we show that over half of the machine identifiers comprise the queries of multiple people. We characterize variations in features of topic, time, and other aspects such as the complexity of the information sought per the number of searchers on a machine, and show significant differences in all measures. Based on these insights, we develop models to accurately estimate when multiple people contribute to the logs ascribed to a single machine identifier. We also develop models to cluster search behavior on a machine, allowing us to attribute historical data accurately and automatically assign new search activity to the correct searcher. The findings have implications for the design of applications such as personalized search and advertising that rely heavily on machine identifiers to custom-tailor their services.
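
The abstract does not say which clustering model is used, so the following is only an illustrative sketch of the general idea: group the search sessions logged under one machine identifier into clusters, then assign a new session to the nearest cluster. It assumes plain k-means, a known number of searchers (k = 2), and two made-up features per session (normalized hour of day and the share of queries on one topic). The function names and the data are hypothetical and are not taken from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-first seeding."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


# Hypothetical sessions from ONE machine identifier:
# (hour of day / 24, share of queries about sports).
# Treating hour of day as linear rather than circular is a simplification.
sessions = [
    (0.85, 0.90), (0.90, 0.80), (0.80, 0.95),  # evening, mostly sports
    (0.30, 0.10), (0.35, 0.05), (0.25, 0.20),  # morning, little sports
]

centroids = kmeans(sessions, k=2)

# Attribute historical sessions to a cluster (a stand-in for a searcher).
labels = [nearest(s, centroids) for s in sessions]

# Assign new activity to the closest existing cluster.
new_label = nearest((0.88, 0.85), centroids)
```

With this toy data the first three sessions fall into one cluster and the last three into the other, and the new evening session joins the first group. A real system would also need to estimate how many searchers share the machine, which this sketch takes as given.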

        Published In

        WWW '14: Proceedings of the 23rd international conference on World wide web
        April 2014
        926 pages
        ISBN:9781450327442
        DOI:10.1145/2566486

        Sponsors

        • IW3C2: International World Wide Web Conference Committee

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. multi-user settings
        2. search activity attribution

        Acceptance Rates

        WWW '14 Paper Acceptance Rate 84 of 645 submissions, 13%;
        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
