ABSTRACT
The web is trapped in the "perpetual now": when users traverse from page to page, they see the state of a web resource (i.e., the page) as it exists at the time of the click, not necessarily as it existed when the link was made. Thus, a temporal discrepancy can arise between the resource as it was when the page author created a link to it and the resource as it is when a reader follows the link. This is especially important in the context of social media: the ease of sharing links in a tweet or Facebook post allows many people to author web content, but space constraints, combined with poor awareness on the part of authors, often prevent the post from carrying enough context to determine its intent. If links are clicked as soon as they are shared, the temporal distance between sharing and clicking is so small that there is little to no difference in content. However, not all clicks occur immediately, and a delay of days or even hours can result in reading something other than what the author intended. We introduce the concept of a user's temporal intention upon publishing a link in social media. We investigate the features that could be extracted from the post, the linked resource, and the patterns of social dissemination to model this user intention. Finally, we analyze the historical integrity of resources shared in social media across time. In other words, how beneficial is knowledge of the author's intent in maintaining the consistency of the story being told through social posts and in enriching the coverage and depth of archived content for vulnerable resources?
Index Terms
- Reading the correct history?: modeling temporal intention in resource sharing