Abstract
The growing quantity of user-generated book reviews has opened up unprecedented opportunities for empirical research on books, reading, and readership. While there is an abundance of literature addressing the legal and ethical use of user-generated and social media data in general, such discussions have been mostly absent for user-generated book reviews. From a library and information sciences perspective, user-generated book reviews can pose novel challenges because each book reviewer may simultaneously be (1) a presumably anonymous and safe online user and (2) an identifiable reader who can suffer real harm, e.g., doxing and personal attacks. This user/reader duality can create conflicting recommendations regarding which legal or ethical guidelines to follow. According to our review, potential legal issues include copyright infringement, violations of terms of service/end-user license agreements, and violations of privacy rights, while ethical concerns center on users’ expectations, informed consent, and institutional review. This paper reviews (1) potential legal and ethical pitfalls in leveraging user-generated book reviews and (2) professional and scholarly references that might serve as useful guidelines to avoid or manage these pitfalls.
Notes
- 1.
In the context of this paper, user-generated book reviews include not only actual book reviews but also numerical ratings, crowdsourced tags, user-curated book lists, virtual collections of books, graphic content, etc.
- 2.
For example, book reviews may contain user names that overlap with real names, email addresses, identifying parts of addresses, or workplaces.
- 3.
Wattpad is a storytelling and social reading platform based in Canada [160].
- 4.
An end-user license agreement (EULA) is a contract between the licensor and the licensee that establishes the licensee’s right to use a proprietary product. Terms of service (TOS) refers to a contract between a provider and a user that defines the rules a user should follow in order to use a service. In our research context, we treat the two terms as interchangeable, as both specify the permissions and prohibitions for using the book review platforms’ services, products, and/or data.
- 5.
- 6.
The Internet Archive is a large digital library that preserves and provides digitized content to the public [154].
- 7.
Due to the length constraints of this paper, we discuss only some of the articles that we reviewed. The full list of references is available at https://github.com/Yuerong2/iConference2023appendix/blob/main/iconference2023referencesAppendix.pdf. Our literature review is limited to empirical research on user-generated book reviews based on computational and/or qualitative methods. We did not consider theoretical work on user-generated book reviews that involves no empirical data.
- 8.
Amazon (Amazon.com: Books) is currently the largest online bookseller worldwide. Goodreads is one of the dominant social reading and book review platforms based in the United States, with 90 million registered members as of 2019. LibraryThing is one of the most impactful social cataloging platforms based in the United States, with 2.6 million users as of 2021 [54, 73, 95, 156, 157, 158, 159].
- 9.
- 10.
Such considerations might not apply to studies on user-generated data. We elaborate on this issue in Sect. 3.4.
- 11.
In the United States, an Institutional Review Board (IRB) is an administrative unit formally designated to review and monitor research activities using human research subjects. IRBs approve or disapprove research proposals prior to their initiation to ensure the rights and welfare of human research subjects [144].
- 12.
- 13.
US copyright law requires consideration of four factors in determining whether fair use applies: the purpose and character of the use; the nature of the copyrighted work; the amount and substantiality of the portion used; and the effect of the use upon the potential market for the copyrighted work. For research based on user-generated book reviews, the first two factors may be less of a concern, but researchers should pay more attention to the third and fourth factors.
- 14.
“Transformative use” of the data alters original content to give it “new expression, meaning or message” [133]. “Non-consumptive use” refers to computer-assisted research, which has been found not to conflict with copyright holders’ interests. For instance, in transformative and non-consumptive research, digital humanities scholars can conduct computational text analysis of millions of books (copyrighted books included) without actually reading or re-disseminating (i.e., without human “consumption” of) any expressive content of those books [113].
- 15.
It should be noted that “these state laws, however, are overridden or trumped by federal laws that allow federal agencies to seek library records” [21, 90]. The laws vary by state; however, they reflect a consensus that library users’ data are confidential and should be disclosed only under certain circumstances (e.g., with the user’s informed consent or under a court order).
- 16.
Robots.txt files are developed and used primarily to inform search engines and web scrapers whether data on a webpage may be harvested. They are widely adopted by websites to regulate scraping, although their prohibitions “fall into a legal grey area” [123].
- 17.
Accessed in August 2022.
- 18.
In this case, hiQ scraped publicly available user data from LinkedIn’s website to supply its own business, in spite of LinkedIn’s no-data-scraping policies, letters specifically addressed to hiQ, and technical measures enacted against hiQ. LinkedIn claimed that hiQ’s scraping violated the CFAA, the Digital Millennium Copyright Act, and state trespass law, while hiQ denied these claims and asserted its right to scrape publicly accessible data [53].
- 19.
However, in practice, it is difficult for researchers to verify whether the reviewers are indeed aware of the public accessibility of their data. Researchers should not make assumptions about users’ awareness.
- 20.
Kosinski and colleagues argue that no consent is needed and that user-generated online data can be conceptualized as archival data if (1) users consciously made their data public; (2) the data collected are anonymized; (3) researchers do not interact with participants; and (4) no identifiable user information is published [87].
- 21.
Different IRBs might make different decisions on requests for exemption based on specific research proposals. For instance, we learned from our own research experience that analysis of publicly available and de-identified book review data without any interaction with the reviewers is most likely to be considered “Not Human Subjects Research” (NHSR) by the IRB at our home institution [142]. In this case, researchers who believe their work does not require IRB review or oversight should submit a request to their institution’s IRB for a designation as Not Human Subjects Research. They might also consider asking for an Exempt Status determination, in which case they are performing Human Subjects research but are exempt from regular oversight.
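As a rough illustration of what “de-identified book review data” (notes 2, 20, and 21) could look like in practice, the following minimal Python sketch replaces user names with keyed hashes and redacts email addresses from review text. The record fields (`user`, `rating`, `text`), the key, and all function names are hypothetical and not taken from any platform or from this paper. The approach is pseudonymization only: it does not remove other identifying details such as workplaces or addresses mentioned in free text, so it should not be read as sufficient to meet any particular legal or IRB standard.

```python
import hashlib
import hmac
import re

# Hypothetical secret key (an assumption for this sketch); in practice it
# would be generated randomly and stored separately from the data.
SECRET_KEY = b"replace-with-a-private-random-key"

# Simple pattern for common email address shapes; it will miss unusual ones.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize_user(user_name: str) -> str:
    """Map a user name to a stable keyed hash so that reviews by the same
    user can still be linked without storing the name itself."""
    digest = hmac.new(SECRET_KEY, user_name.encode("utf-8"), hashlib.sha256)
    return "user_" + digest.hexdigest()[:12]


def redact_emails(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub("[EMAIL]", text)


def deidentify_review(review: dict) -> dict:
    """Return a copy of a review record with the user name pseudonymized
    and email addresses in the review text redacted."""
    return {
        "user": pseudonymize_user(review["user"]),
        "rating": review["rating"],
        "text": redact_emails(review["text"]),
    }


if __name__ == "__main__":
    sample = {
        "user": "Jane Doe",
        "rating": 4,
        "text": "Loved it. Write me at jane.doe@example.org!",
    }
    print(deidentify_review(sample))
```

A keyed hash is used instead of a plain hash so that someone without the key cannot confirm a guessed user name by hashing it; whether that level of protection is adequate depends on the platform, the data, and the reviewing institution.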
References
ACM Code 2018 Task Force: ACM code of ethics and professional conduct (2018). https://www.acm.org/code-of-ethics
ACM Technology Policy Council, ACM Europe Technology Policy Committee and ACM US Technology Policy Council: Statement on principles for responsible algorithmic systems (2022). https://www.acm.org/binaries/content/assets/public-policy/final-joint-ai-statement-update.pdf
Acquisti, A., Brandimarte, L., Loewenstein, G.: Privacy and human behavior in the age of information. Science 347(6221), 509–514 (2015)
Albrechtslund, A.M.B.: Negotiating ownership and agency in social media: community reactions to amazon’s acquisition of Goodreads. First Monday (2017)
American Civil Liberties Union: Federal court rules ‘big data’ discrimination studies do not violate federal anti-hacking law (2020). https://www.aclu.org/press-releases/federal-court-rules-big-data-discrimination-studies-do-not-violate-federal-anti
American Library Association: The USA patriot act (2009). https://www.ala.org/ala/washoff/WOissues/civilliberties/theusapatriotact/usapatriotact.htm
American Library Association: Intellectual freedom: issues and resources (2017). https://www.ala.org/advocacy/intfreedom
American Library Association: ALA statement on book censorship (2021). https://www.ala.org/advocacy/statement-regarding-censorship
American Library Association: State privacy laws regarding library records (2021). https://www.ala.org/advocacy/privacy/statelaws
American Library Association Council: Policy concerning confidentiality of personally identifiable information about library users (1991). https://www.ala.org/advocacy/intfreedom/statementspols/otherpolicies/policyconcerning
Markham, A., Buchanan, E.: Ethical decision-making and internet research: recommendations from the AoIR ethics working committee (version 2.0) (2012). https://aoir.org/reports/ethics2.pdf
Antoniak, M., Walsh, M., Mimno, D.: Tags, borders, and catalogs: social re-working of genre on librarything. Proc. ACM Hum.-Comput. Interact. 5(CSCW1), 1–29 (2021)
Asher, A., et al.: Ethics in research use of library patron data: glossary and explainer (2018). https://doi.org/10.17605/OSF.IO/XFKZ6
Association for Computing Machinery: Scraping by: reconsidering law & technology for online data collection - 19 May 2022 (2022). https://www.acm.org/public-policy/ustpc/hottopics/online-data-collection
Band, J.: LCA comments on authors guild v. hathitrust decision (2012). https://www.arl.org/news/lca-comments-on-authors-guild-v-hathitrust-decision/
Bartley, P.: Book tagging on LibraryThing: how, why, and what are in the tags? Proc. Am. Soc. Inf. Sci. Technol. 46(1), 1–22 (2009)
BBC News: Author Richard Brittain attacked reviewer with bottle (2015). https://www.bbc.com/news/uk-scotland-edinburgh-east-fife-34775814
Böhme, R., Köpsell, S.: Trained to accept? A field experiment on consent dialogs. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2403–2406 (2010)
Boot, P., Koolen, M.: Captivating, splendid or instructive?: assessing the impact of reading in online book reviews. Sci. Study Lit. 10(1), 35–63 (2020)
Bourrier, K., Thelwall, M.: The social lives of books: reading Victorian literature on goodreads. J. Cult. Anal. 1(1), 12049 (2020)
Bowers, S.L.: Privacy and library records. J. Acad. Librariansh. 32(4), 377–383 (2006)
Bruckman, A.: Studying the amateur artist: a perspective on disguising data collected in human subjects research on the internet. Ethics Inf. Technol. 4(3), 217–231 (2002)
California Legislative Information: Title 1.81.5. California consumer privacy act of 2018 (2018). https://leginfo.legislature.ca.gov/faces/codes_displayText.xhtml?division=3.&part=4.&lawCode=CIV&title=1.81.5
Carman, N.: LibraryThing tags and Library of Congress Subject Headings: A comparison of science fiction and fantasy works. School of Information Management at Victoria University of Wellington (2009)
Chang, K., et al.: Book reviews and the consolidation of genre. In: DH2020 (ADHO) Proceedings (2020). http://dx.doi.org/10.17613/02q2-1v27
Chen, P.Y., Dhanasobhon, S., Smith, M.D.: All reviews are not created equal: the disaggregate impact of reviews and reviewers at amazon.com (2008)
Chevalier, J.A., Mayzlin, D.: The effect of word of mouth on sales: online book reviews. J. Mark. Res. 43(3), 345–354 (2006)
Court of Appeal, Second District, Division 3, California.: Long v. Provide Commerce Inc (2016). https://caselaw.findlaw.com/ca-court-of-appeal/1729412.html
Crawford, K., Finn, M.: The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters. GeoJournal 80(4), 491–502 (2015)
Computer Crime and Intellectual Property Section Criminal Division: Prosecuting computer crimes manual (2010). https://www.justice.gov/criminal/file/442156/download
Dai, L.: From the history of the book to the history of reading: theories and methods for historical studies of reading. Xinxing (2017)
De Greve, L., Martens, G.: # bookstagram and beyond: the presence and depiction of the Bachmann literary prize on social media (2007–2017). Digit. Humanit. Benelux J. 3, 81–102 (2021)
Diesner, J., Chin, C.: Seeing the forest for the trees: considering applicable types of regulation for the responsible collection and analysis of human centered data. In: Human-Centered Data Science (HCDS) Workshop at 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (2016)
Diesner, J., Chin, C.L.: Usable ethics: practical considerations for responsibly conducting research with social trace data. In: Proceedings of Beyond IRBs: Ethical Review Processes for Big Data Research (2015)
Diesner, J., Chin, C.L.: Gratis, libre, or something else? Regulations and misassumptions related to working with publicly available text data. In: Actes du Workshop on Ethics In Corpus Collection, Annotation & Application (ETHI-CA2), LREC, Portoroz, Slovénie (2016)
Dimitrov, S., Zamal, F., Piper, A., Ruths, D.: Goodreads versus amazon: the effect of decoupling book reviewing and book selling. In: Ninth International AAAI Conference on Web and Social Media (2015)
Drew, C.: Data science ethics in government. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 374(2083), 20160119 (2016)
Driscoll, B., Rehberg Sedo, D.: Faraway, so close: seeing the intimacy in goodreads reviews. Qual. Inq. 25(3), 248–259 (2019)
Driscoll, B., Rehberg Sedo, D.: The transnational reception of bestselling books between Canada and Australia. Global Media Commun. 16(2), 243–258 (2020)
Ehrmann, T., Schmale, H.: The hitchhiker’s guide to the long tail: the influence of online-reviews and product recommendations on book sales-evidence from German online retailing. In: ICIS 2008 Proceedings, p. 157 (2008)
Ellis, D.: What charles and anti-charles reveal about goodreads homophobia (2020). https://bookriot.com/goodreads-homophobia/
English, J., Ungar, L., Dhakecha, R.H., Scott, E.: Mining goodreads (literary reception studies at scale) (2018). https://pricelab.sas.upenn.edu/projects/goodreads-project
Estabrook, L.S.: Sacred trust or competitive opportunity: using patron records. Libr. J. 121(2), 48–49 (1996)
European Union (EU): Complete guide to GDPR (general data protection regulation) compliance (2016). https://gdpr.eu/
Fiesler, C.: Ethical considerations for research involving (speculative) public data. Proc. ACM Hum.-Comput. Interact. 3(GROUP), 1–13 (2019)
Fiesler, C., Beard, N., Keegan, B.C.: No robots, spiders, or scrapers: legal and ethical regulation of data collection methods in social media terms of service. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 14, pp. 187–196 (2020)
Fiesler, C., Lampe, C., Bruckman, A.S.: Reality and perception of copyright terms of service for online content creation. In: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, pp. 1450–1461 (2016)
Fiesler, C., Proferes, N.: “Participant” perceptions of twitter research ethics. Soc. Media+ Soc. 4(1), 2056305118763366 (2018)
Fiesler, C.: Law & ethics of scraping: what HiQ v Linkedin could mean for researchers violating TOS (2017). https://cfiesler.medium.com/law-ethics-of-scraping-what-hiq-v-linkedin-could-mean-for-researchers-violating-tos-787bd3322540
Fornaciari, T., Poesio, M.: Identifying fake amazon reviews as learning from crowds. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pp. 279–287. Association for Computational Linguistics (2014)
Franzke, A.S., Bechmann, A., Zimmer, M., Ess, C.: Internet research ethics guidelines (IRE 3.0 6.1) (2019). https://aoir.org/reports/ethics3.pdf
Gilbert, E., Karahalios, K.: Understanding deja reviewers. In: Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, pp. 225–228 (2010)
Goldfein, S., Keyte, J.: Big data, web ‘scraping’ and competition law: the debate continues. New York Law J. 258(49), 1–3 (2017)
Goodreads: About goodreads (2022). https://www.goodreads.com/about/us
Goodreads: Goodreads robots.txt file (2022). https://www.goodreads.com/robots.txt
Goodreads: Terms of use (2022). https://www.goodreads.com/about/terms
Gray, J., Foong, C.: Publishers vs the internet archive: why the world’s biggest online library is in court over digital book lending (2022). https://theconversation.com/publishers-vs-the-internet-archive-why-the-worlds-biggest-online-library-is-in-court-over-digital-book-lending-187166
Greene, D., Hoffmann, A.L., Stark, L.: Better, nicer, clearer, fairer: a critical assessment of the movement for ethical artificial intelligence and machine learning. In: Proceedings of the Annual Hawaii International Conference on System Sciences, pp. 2122–2131 (2019)
Guan, X., Li, Y., Gong, H., Sun, H., Zhou, C.: An improved SVM for book review sentiment polarity analysis. In: 2018 International Conference on Transportation Logistics, Information Communication, Smart City (TLICSC 2018). Atlantis Press (2018)
Hajibayova, L.: Investigation of goodreads’ reviews: kakutanied, deceived or simply honest? J. Doc. 75(3), 612–626 (2019)
HathiTrust Digital Library: Our digital library (2022). https://www.hathitrust.org/digital_library
He, R., McAuley, J.: Ups and downs: modeling the visual evolution of fashion trends with one-class collaborative filtering. In: Proceedings of the 25th International Conference on World Wide Web, pp. 507–517 (2016)
Holur, P., Shahsavari, S., Ebrahimzadeh, E., Tangherlini, T.R., Roychowdhury, V.: Modelling social readers: novel tools for addressing reception from online book reviews. Roy. Soc. Open Sci. 8(12), 210797 (2021)
Hong, H., Xu, D., Xu, D., Wang, G.A., Fan, W.: An empirical study on the impact of online word-of-mouth sources on retail sales. Inf. Discov. Deliv. 45(1), 30–35 (2017)
Howison, J., Wiggins, A., Crowston, K.: Validity issues in the use of social network analysis with digital trace data. J. Assoc. Inf. Syst. 12(12), 2 (2011)
Howsam, L.: Old Books and New Histories: An Orientation to Studies in Book and Print Culture. University of Toronto Press, Toronto (2006)
Hu, N., Bose, I., Gao, Y., Liu, L.: Manipulation in digital word-of-mouth: a reality check for book reviews. Decis. Support Syst. 50(3), 627–635 (2011)
Hu, N., Bose, I., Koh, N.S., Liu, L.: Manipulation of online reviews: an analysis of ratings, readability, and sentiments. Decis. Support Syst. 52(3), 674–684 (2012)
Hu, N., Koh, N.S., Reddy, S.K.: Ratings lead you to the product, reviews help you clinch it? The mediating role of online review sentiments on product sales. Decis. Support Syst. 57, 42–53 (2014)
Hu, N., Liu, L., Sambamurthy, V.: Fraud detection in online consumer reviews. Decis. Support Syst. 50(3), 614–626 (2011)
Hu, N., Liu, L., Zhang, J.J.: Do online reviews affect product sales? The role of reviewer characteristics and temporal effects. Inf. Technol. Manag. 9(3), 201–214 (2008)
Hu, Y.: Synthesizing digital libraries and digital humanities perspectives for illuminating under-investigated complexities associated with user-generated book reviews. In: Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries, pp. 1–2 (2022)
Hu, Y., LeBlanc, Z., Diesner, J., Underwood, T., Layne-Worthey, G., Downie, J.S.: Complexities associated with user-generated book reviews in digital libraries: temporal, cultural, and political case studies. In: Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries, pp. 1–12 (2022)
Hudson, J.M., Bruckman, A.: “Go away”: participant objections to being studied and the ethics of chatroom research. Inf. Soc. 20(2), 127–139 (2004)
Hui, N.: Content-specific ranking prediction for online reviews-case of douban book reviews. Manag. Rev. 33(2), 176 (2021)
Hutton, L., Henderson, T.: Making social media research reproducible. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 9, pp. 2–7 (2015)
International Federation of Library Associations and Institutions: IFLA code of ethics for librarians and other information workers (full version) (2012). https://www.ifla.org/publications/ifla-code-of-ethics-for-librarians-and-other-information-workers-full-version/
International Federation of Library Associations and Institutions: IFLA statement on privacy in the library environment (2015). https://www.ifla.org/publications/ifla-statement-on-privacy-in-the-library-environment/
Jett, J., Cole, T., Maden, C., Downie, J.: The hathitrust research center workset ontology: a descriptive framework for non-consumptive research collections. J. Open Humanit. Data 2 (2016)
Jiang, M., Diesner, J.: Issue-focused documentaries versus other films: rating and type prediction based on user-authored reviews. In: Proceedings of the 27th ACM Conference on Hypertext and Social Media, pp. 225–230 (2016)
Jiang, M., Diesner, J.: Says who…? Identification of expert versus layman critics’ reviews of documentary films. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 2122–2132 (2016)
Kaminski, M.: A recent renaissance in privacy law. Commun. ACM 63(9), 24–27 (2020)
Kayla: Book chat: Authors being negative towards reviewers (2017). https://gracelingaccountantblog.wordpress.com/2017/12/06/book-chat-authors-being-negative-towards-reviewers/
Klinefelter, A.: Reader privacy in digital library collaborations: signs of commitment, opportunities for improvement. ISJLP 13, 199 (2016)
Koolen, M., Neugarten, J., Boot, P.: ‘This book makes me happy and sad and i love it’. a rule-based model for extracting reading impact from English book reviews. J. Comput. Literary Stud. 1(1) (2022)
Koolen, M., Boot, P., van Zundert, J.J.: Online book reviews and the computational modelling of reading impact. In: Proceedings of Workshop on Computational Humanities Research (CHR), vol. 1613, p. 0073 (2020)
Kosinski, M., Matz, S.C., Gosling, S.D., Popov, V., Stillwell, D.: Facebook as a research tool for the social sciences: opportunities, challenges, ethical considerations, and practical guidelines. Am. Psychol. 70(6), 543 (2015)
Kuijpers, M.M.: Bodily involvement in readers’ online book reviews: applying text world theory to examine absorption in unprompted reader response. J. Lit. Semant. 51(2), 111–129 (2022)
Kutzner, K., Petzold, K., Knackstedt, R.: Characterising social reading platforms-a taxonomy-based approach to structure the field. In: Proceedings of the 14th International Conference on Wirtschaftsinformatik (2019)
Lambert, A.D., Parker, M., Bashir, M.: Library patron privacy in jeopardy an analysis of the privacy policies of digital content vendors. Proc. Assoc. Inf. Sci. Technol. 52(1), 1–9 (2015)
Lamdan, S.S.: Why library cards offer more privacy rights than proof of citizenship: librarian ethics and freedom of information act requestor policies. Gov. Inf. Q. 30(2), 131–140 (2013)
Lanjinger: One-star review bombing started from the truce (the diary of martín santomé) (originally in Chinese) (2021). https://k.sina.com.cn/article_5617041192_14ecd3f280200135ul.html
Lavin, M.J., et al.: Cultural analytics and the book review: models, methods, and corpora. In: DH2020(ADHO) Proceedings (2020). https://dh2020.adho.org/wp-content/uploads/2020/07/516_CulturalAnalyticsandtheBookReviewModelsMethodsandCorpora.html
LibraryThing: Privacy policy, community rules, and terms of service (2020). https://www.librarything.com/privacy
LibraryThing: About librarything (2022). https://www.librarything.com/about
Lin, E., Fang, S., Wang, J.: Mining online book reviews for sentimental clustering. In: 2013 27th International Conference on Advanced Information Networking and Applications Workshops, pp. 179–184. IEEE (2013)
Lu, C., Park, J.R., Hu, X.: User tags versus expert-assigned subject terms: a comparison of librarything tags and library of congress subject headings. J. Inf. Sci. 36(6), 763–779 (2010)
Lunnay, B., Borlagdan, J., McNaughton, D., Ward, P.: Ethical use of social media to facilitate qualitative research. Qual. Health Res. 25(1), 99–109 (2015)
Maity, S.K., Panigrahi, A., Mukherjee, A.: Book reading behavior on goodreads can predict the amazon best sellers. In: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, pp. 451–454 (2017)
Mannheimer, S., Pienta, A., Kirilova, D., Elman, C., Wutich, A.: Qualitative data sharing: data repositories and academic libraries as key partners in addressing challenges. Am. Behav. Sci. 63(5), 643–664 (2019)
Mannheimer, S., Young, S.W., Rossmann, D.: On the ethics of social network research in libraries. J. Inf. Commun. Ethics Soc. (2016)
Martens, M., Balling, G., Higgason, K.A.: # booktokmademereadit: young adult reading communities across an international, sociotechnical landscape. Inf. Learn. Sci. (ahead-of-print) (2022)
McAuley, J., Targett, C., Shi, Q., Van Den Hengel, A.: Image-based recommendations on styles and substitutes. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 43–52 (2015)
McCluskey, M.: Goodreads’ problem with extortion scams and review bombing (2021). https://time.com/6078993/goodreads-review-bombing/
McDonald, A.M., Cranor, L.F.: The cost of reading privacy policies. ISJLP 4, 543 (2008)
Mengting, W.: UCSD book graph: Goodreads datasets (2019). https://sites.google.com/eng.ucsd.edu/ucsdbookgraph/home
Metcalf, J., Crawford, K.: Where are human subjects in big data research? The emerging ethics divide. Big Data Soc. 3(1), 2053951716650211 (2016)
Milligan, I.: The problem of history in the age of abundance (2016). http://hdl.handle.net/10012/11817
Mishra, S., Saini, A., Makki, R., Mehta, S., Haghighi, A., Mollahosseini, A.: Tweetnerd-end to end entity linking benchmark for tweets. arXiv preprint arXiv:2210.08129 (2022)
Nakamura, L.: “Words with friends”: socially networked reading on goodreads. PMLA/Publ. Mod. Lang. Assoc. Am. 128(1), 238–243 (2013)
Nan, X., Li, M., Shi, J.: Using altmetrics for assessing impact of highly-cited books in Chinese book citation index. Scientometrics 122(3), 1651–1669 (2020)
Oltmann, S.M.: Intellectual freedom and freedom of speech: three theoretical perspectives. Libr. Q. 86(2), 153–171 (2016)
Organisciak, P., Downie, J.S.: Research access to in-copyright texts in the humanities. In: Information and Knowledge Organisation in Digital Humanities, pp. 157–177. Routledge (2021)
Pianzola, F., Rebora, S., Lauer, G.: Wattpad as a resource for literary studies. quantitative and qualitative examples of the importance of digital social reading and readers’ comments in the margins. PLoS ONE 15(1), e0226708 (2020)
Pianzola, F., et al.: Books’ impact in digital social reading: towards a conceptual and methodological framework. In: Digital Humanities 2022 Conference Abstracts, pp. 94–98 (2022). https://dh2022.dhii.asia/dh2022bookofabsts.pdf
Pinch, T.: Book reviewing for amazon.com: how socio-technical systems struggle to make less from more. In: Managing Overflow in Affluent Societies, pp. 80–99. Routledge (2012)
Reads with Rachel: Author attacks book reviewer |Richard Brittain | authors behaving badly (2022). https://www.youtube.com/watch?v=4Z5iIP8c5qs
Rebora, S., et al.: Digital humanities and digital social reading. Digit. Scholarsh. Humanit. 36(Supplement_2), ii230–ii250 (2021)
Rebora, S., Messerli, T., Herrmann, J.B.: Towards a computational study of German book reviews. A comparison between emotion dictionaries and transfer learning in sentiment analysis. 8. Jahrestagung «Digital Humanities im deutschsprachigen Raum»(DhD), Potsdam, D. (2022)
Rebora, S., Pianzola, F.: A new research programme for reading research: analysing comments in the margins on wattpad. DigitCult-Sci. J. Digit. Cult. 3(2), 19–36 (2018)
Rezapour, R., Diesner, J.: Classification and detection of micro-level impact of issue-focused documentary films based on reviews. In: Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, pp. 1419–1431 (2017)
Sabri, N., Weber, I.: A global book reading dataset. Data 6(8), 83 (2021)
Samberg, R.G., Hennesy, C.: Law and literacy in non-consumptive text mining: guiding researchers through the landscape of computational text analysis (2019)
Sen, S., Lerman, D.: Why are you telling me this? an examination into negative consumer reviews on the web. J. Interact. Mark. 21(4), 76–94 (2007)
Shahsavari, S., et al.: An automated pipeline for character and relationship extraction from readers literary book reviews on goodreads.com. In: 12th ACM Conference on Web Science, pp. 277–286 (2020)
Sharma, R.: Black and LGBTQ+ authors say they’re being harassed on goodreads and trolled with one-star book reviews (2021). https://inews.co.uk/culture/books/goodreadsbookreviewsblacklgbtq-authorsharrassedtrolled949179
Sharma, A., Hu, Y., Wu, P., Shang, W., Singhal, S., Underwood, T.: The rise and fall of genre differentiation in English-language fiction. In: DH2020 (ADHO) Proceedings, vol. 1613, p. 0073 (2020)
Sheila (Book Journey): When authors attack… (2011). https://bookjourney.net/2011/12/04/when-authors-attack/
Shen, X., Zhang, K.Z., Zhao, S.J.: Understanding information adoption in online review communities: the role of herd factors. In: 2014 47th Hawaii International Conference on System Sciences, pp. 604–613. IEEE (2014)
Shenglan, T., Haiqing, H., JIANG, L., Xu, Z., SELMAN, R.L.: Chinese and English reviews of a story about teenagers’ struggles: a multi-method analysis of cultural differences in narrative interpretation. Beijing Int. Rev. Educ. 2(3), 365–387 (2020)
Sourati Hassan Zadeh, Z., Sabri, N., Chamani, H., Bahrak, B.: Quantitative analysis of fanfictions’ popularity. Soc. Netw. Anal. Mining 12(1), 1–11 (2022)
Srivastava, A.K., Mishra, R.: Analyzing social media research: a data quality and research reproducibility perspective. IIM Kozhikode Soc. Manag. Rev. 12(1), 39–49 (2021)
Supreme Court: Campbell v. acuff-rose music (92-1292), 510 U.S. 569 (1994). https://www.law.cornell.edu/supct/html/92-1292.ZS.html
Szkolar, D.: The USA patriot act: should your library have an official policy? (2013). https://ischool.syr.edu/the-usa-patriot-act-should-your-library-have-an-official-policy/
The European Parliament and the Council of the European Union: Directive (EU) 2019/790 of the European parliament and of the council of 17 April 2019 on copyright and related rights in the digital single market and amending directives 96/9/EC and 2001/29/EC (2019). https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32019L0790&from=EN
Thelwall, M.: Book genre and author gender: romance > paranormal-romance to autobiography > memoir. J. Assoc. Inf. Sci. Technol. 68(5), 1212–1223 (2017)
Thelwall, M.: Reader and author gender and genre in goodreads. J. Librariansh. Inf. Sci. 51(2), 403–430 (2019)
Thelwall, M., Kousha, K.: Goodreads: a social network site for book readers. J. Am. Soc. Inf. Sci. 68(4), 972–983 (2017)
Thomas, M., Caudle, D.M., Schmitz, C.: Trashy tags: problematic tags in librarything. New Library World (2010)
Slee, T.J.: Who is the average goodreads user? You’ll be surprised! (2017). https://www.goodreads.com/author_blog_posts/14538341-who-is-the-average-goodreads-user-you-ll-be-surprised
Tsur, O., Rappoport, A.: Revrank: a fully unsupervised algorithm for selecting the most helpful book reviews. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 3 (2009)
University of Illinois Office for the Protection of Research Subjects: Decision trees (2022). https://oprs.research.illinois.edu/review-processes-checklists/decision-trees
US Copyright Office: Copyright law of the united states (title 17) (2021). https://www.copyright.gov/title17/
U.S. Food and Drug Administration: Institutional review boards frequently asked questions (1998). https://www.fda.gov/regulatory-information/search-fda-guidance-documents/institutional-review-boards-frequently-asked-questions
Vaccaro, K., Karahalios, K., Sandvig, C., Hamilton, K., Langbort, C.: Agree or cancel? Research and terms of service compliance. In: ACM CSCW Ethics Workshop: Ethics for Studying Sociotechnical Systems in a Big Data World (2015)
Verma, P.: The fight between authors and librarians tearing book lovers apart (2022). https://www.washingtonpost.com/technology/2022/07/25/internet-archive-digital-lending-lawsuit/
Vitak, J., Proferes, N., Shilton, K., Ashktorab, Z.: Ethics regulation in social computing research: examining the role of institutional review boards. J. Empir. Res. Hum. Res. Ethics 12(5), 372–382 (2017)
Vitak, J., Shilton, K., Ashktorab, Z.: Beyond the belmont principles: ethical challenges, practices, and beliefs in the online data research community. In: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, pp. 941–953 (2016)
Voorbij, H.: The value of librarything tags for academic libraries. Online Inf. Rev. 36(2), 196–217 (2012)
Walsh, M., Antoniak, M.: The goodreads ‘classics’: a computational study of readers, amazon, and crowdsourced amateur criticism. J. Cult. Anal. 4, 243–287 (2021)
Wan, M., McAuley, J.J.: Item recommendation on monotonic behavior chains. In: Pera, S., Ekstrand, M.D., Amatriain, X., O’Donovan, J. (eds.) Proceedings of the 12th ACM Conference on Recommender Systems, RecSys 2018, Vancouver, BC, Canada, 2–7 October 2018, pp. 86–94. ACM (2018). https://doi.org/10.1145/3240323.3240369
Wan, M., Misra, R., Nakashole, N., McAuley, J.J.: Fine-grained spoiler detection from large-scale review corpora. In: Korhonen, A., Traum, D.R., Màrquez, L. (eds.) Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, 28 July–2 August 2019, Volume 1: Long Papers, pp. 2605–2610. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/p19-1248
Wang, K., Liu, X., Han, Y.: Exploring goodreads reviews for book impact assessment. J. Informet. 13(3), 874–886 (2019)
Wikipedia contributors: Internet archive. Wikipedia (2022). https://en.wikipedia.org/wiki/Internet_Archive
Wikipedia contributors: Personal Information Protection Law of the People’s Republic of China (2021). https://en.wikipedia.org/wiki/Personal_Information_Protection_Law_of_the_People%27s_Republic_of_China
Wikipedia contributors: Amazon books (2022). https://en.wikipedia.org/wiki/Amazon_Books
Wikipedia contributors: Amazon (company) (2022). https://en.wikipedia.org/wiki/Amazon_(company)
Wikipedia contributors: Goodreads (2022). https://en.wikipedia.org/wiki/Goodreads
Wikipedia contributors: Librarything (2022). https://en.wikipedia.org/wiki/LibraryThing
Wikipedia contributors: Wattpad (2022). https://en.wikipedia.org/wiki/Wattpad
World Intellectual Property Organization (WIPO): WIPO Copyright Treaty (1996). https://wipolex.wipo.int/en/text/295166
Worrall, A.: “like a real friendship”: translation, coherence, and convergence of information values in librarything and goodreads. In: iConference 2015 Proceedings (2015)
Worrall, A.: “connections above and beyond”: information, translation, and community boundaries in librarything and goodreads. J. Assoc. Inf. Sci. Technol. 70(7), 742–753 (2019)
Zhang, C., Tong, T., Bu, Y.: Examining differences among book reviews from various online platforms. Online Inf. Rev. 43(7), 1169–1187 (2019)
Zhou, Q., Zhang, C.: Relationship between scores and tags for Chinese books-in the case of douban book. J. Data Inf. Sci. 6(4), 40 (2013)
Zhou, Q., Zhang, C., Zhao, S.X., Chen, B.: Measuring book impact based on the multi-granularity online review mining. Scientometrics 107(3), 1435–1455 (2016). https://doi.org/10.1007/s11192-016-1930-5
Zimmer, M.: Addressing conceptual gaps in big data research ethics: an application of contextual integrity. Soc. Media+ Soc. 4(2), 2056305118768300 (2018)
Zimmer, M.: “But the data is already public”: on the ethics of research in Facebook. In: The Ethics of Information Technologies, pp. 229–241. Routledge (2020)
Zuccala, A.A., Verleysen, F.T., Cornacchia, R., Engels, T.C.: Altmetrics for the humanities: comparing goodreads reader ratings with citations to history books. Aslib J. Inf. Manag. (2015)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hu, Y., Layne-Worthey, G., Martaus, A., Downie, J.S., Diesner, J. (2023). Research with User-Generated Book Review Data: Legal and Ethical Pitfalls and Contextualized Mitigations. In: Sserwanga, I., et al. Information for a Better World: Normality, Virtuality, Physicality, Inclusivity. iConference 2023. Lecture Notes in Computer Science, vol 13971. Springer, Cham. https://doi.org/10.1007/978-3-031-28035-1_13
DOI: https://doi.org/10.1007/978-3-031-28035-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-28034-4
Online ISBN: 978-3-031-28035-1
eBook Packages: Computer Science, Computer Science (R0)