Abstract
A book’s success or popularity depends on various parameters, both extrinsic and intrinsic. In this paper, we study how book reading characteristics might influence the popularity of a book. Towards this objective, we perform a cross-platform study of Goodreads entities and attempt to establish the connection between these entities and popular books (“Amazon best sellers”). We analyze collective reading behavior on the Goodreads platform and quantify various characteristic features of the Goodreads entities to identify differences between Amazon best sellers (ABS) and other non-best-selling books. We then develop a prediction model that uses these features to predict whether a book will become a best seller 1 month (15 days) after its publication. On a balanced set, we achieve an average accuracy of 88.72% (85.66%) when the competing class contains books randomly selected from the Goodreads dataset. Our method, based primarily on features derived from user posts and genre-related characteristic properties, achieves an improvement of 16.4% over baseline methods based on traditional popularity factors (ratings, reviews). We also evaluate our model against two more competitive sets of books: (a) books that are both highly rated and have received a large number of reviews but are not best sellers (HRHR), and (b) Goodreads Choice Awards nominated books that are not best sellers (GCAN). We achieve an average accuracy of 87.1%, as well as high ROC, for ABS vs GCAN. For ABS vs HRHR, our model yields an average accuracy of 86.22%.
This research was performed while all the researchers were at IIT Kharagpur, India.
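The evaluation terms used in the abstract (a balanced set, average accuracy, and ROC) can be illustrated with a minimal sketch. It is not the authors' pipeline: the helper names `balance`, `accuracy`, and `roc_auc`, the 0.5 decision threshold, and the toy scores are all assumptions introduced here for illustration only.

```python
import random

def balance(positives, negatives, seed=0):
    """Undersample the larger class so both classes have equal size."""
    rng = random.Random(seed)
    size = min(len(positives), len(negatives))
    return rng.sample(positives, size), rng.sample(negatives, size)

def accuracy(labels, predictions):
    """Fraction of items whose predicted label matches the true label."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)

def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive outscores a random negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Made-up classifier scores, purely illustrative (not data from the study).
best_sellers = [0.9, 0.8, 0.35]        # hypothetical scores for ABS books
others = [0.7, 0.4, 0.3, 0.2, 0.1]     # hypothetical scores for other books

pos, neg = balance(best_sellers, others)
labels = [1] * len(pos) + [0] * len(neg)
scores = pos + neg
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

print("class sizes:", len(pos), len(neg))
print("accuracy:", accuracy(labels, predictions))
print("ROC AUC:", roc_auc(labels, scores))
```

Balancing by undersampling means a trivial classifier scores 50% accuracy, which makes the reported accuracies easier to interpret. The ROC measure does not depend on the chosen threshold.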
Notes
- 1.
In Goodreads, a bookshelf is a list to which users can add books, or from which they can remove them, to facilitate reading, similar to a real-life bookshelf where one keeps books.
- 2.
- 3.
- 4.
This research is an extension of our earlier work [7], published at ASONAM 2017. It reports a much more detailed analysis, emphasizing various aspects of social book reading, and performs a detailed comparison of the best sellers with other kinds of competitors.
- 5.
- 6.
- 7.
References
E. Baumer, M. Sueyoshi, B. Tomlinson, Exploring the role of the reader in the activity of blogging, in CHI (2008), pp. 1111–1120
E.P. Baumer, M. Sueyoshi, B. Tomlinson, Bloggers and readers blogging together: collaborative co-creation of political blogs. Comput. Supported Coop. Work 20(1–2), 1–36 (2011)
S. Follmer, R.T. Ballagas, H. Raffle, M. Spasojevic, H. Ishii, People in books: using a flashcam to become part of an interactive book for connected reading, in CSCW (2012), pp. 685–694
B.A. Nardi, D.J. Schiano, M. Gumbrecht, Blogging as social activity, or, would you let 900 million people read your diary? in CSCW (2004), pp. 222–231
H. Raffle, R. Ballagas, G. Revelle, H. Horii, S. Follmer, J. Go, E. Reardon, K. Mori, J. Kaye, M. Spasojevic, Family story play: reading with young children (and Elmo) over a distance, in CHI (2010), pp. 1583–1592
J.W. Hall, Hit Lit: Cracking the Code of the Twentieth Century’s Biggest Bestsellers (Random House, New York, 2012)
S.K. Maity, A. Panigrahi, A. Mukherjee, Book reading behavior on Goodreads can predict the Amazon best sellers, in Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017. ASONAM ’17 (2017), pp. 451–454
A. Ellegård, A Statistical Method for Determining Authorship: The Junius Letters, vol. 13 (Acta Universitatis Gothoburgensis, Göteborg, 1962), pp. 1769–1772
J. Harvey, The content characteristics of best-selling novels. Public Opin. Q. 17(1), 91–114 (1953)
J.J. McGann, The Poetics of Sensibility: A Revolution in Literary Style (Oxford University Press, Oxford, 1998)
C.J. Yun, Performance evaluation of intelligent prediction models on the popularity of motion pictures, in 2011 4th International Conference on Interaction Sciences (ICIS) (IEEE, New York, 2011), pp. 118–123
V.G. Ashok, S. Feng, Y. Choi, Success with style: using writing style to predict the success of novels, in Proceedings of EMNLP (2013), pp. 1753–1764
R. Gunning, The Technique of Clear Writing (McGraw-Hill, New York, 1952)
G.H. McLaughlin, SMOG grading - a new readability formula. J. Read. 12(8), 639–646 (1969)
J.P. Kincaid, R.P. Fishburne Jr, R.L. Rogers, B.S. Chissom, Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. Technical report, DTIC Document (1975)
A. Stenner, I. Horabin, D.R. Smith, M. Smith, The Lexile Framework (MetaMetrics, Durham, 1988)
E. Fry, A readability formula for short passages. J. Read. 33(8), 594–597 (1990)
J.S. Chall, E. Dale, Readability Revisited: The New Dale-Chall Readability Formula (Brookline Books, Brookline, 1995)
A. Louis, Automatic metrics for genre-specific text quality, in Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop, Association for Computational Linguistics (2012), pp. 54–59
R.J. Kate, X. Luo, S. Patwardhan, M. Franz, R. Florian, R.J. Mooney, S. Roukos, C. Welty, Learning to predict readability using diverse linguistic features, in Proceedings of the 23rd International Conference on Computational Linguistics, Association for Computational Linguistics (2010), pp. 546–554
S.E. Schwarm, M. Ostendorf, Reading level assessment using support vector machines and statistical language models, in Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, Association for Computational Linguistics (2005), pp. 523–530
M. Heilman, M. Eskenazi, Language learning: challenges for intelligent tutoring systems, in Proceedings of the Workshop of Intelligent Tutoring Systems for Ill-Defined Tutoring Systems. Eighth International Conference on Intelligent Tutoring Systems (2006), pp. 20–28
K. Collins-Thompson, J.P. Callan, A language modeling approach to predicting reading difficulty, in HLT-NAACL (2004), pp. 193–200
E. Pitler, A. Nenkova, Revisiting readability: a unified framework for predicting text quality, in Proceedings of the Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics (2008), pp. 186–195
S. Raghavan, A. Kovashka, R. Mooney, Authorship attribution using probabilistic context-free grammars, in Proceedings of the ACL 2010 Conference Short Papers, Association for Computational Linguistics (2010), pp. 38–42
S. Feng, R. Banerjee, Y. Choi, Characterizing stylistic elements in syntactic structure, in Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Association for Computational Linguistics (2012), pp. 1522–1533
F. Peng, D. Schuurmans, S. Wang, V. Keselj, Language independent authorship attribution using character level language models, in Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics-Volume 1, Association for Computational Linguistics (2003), pp. 267–274
H.J. Escalante, T. Solorio, M. Montes-y-Gómez, Local histograms of character n-grams for authorship attribution, in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1, Association for Computational Linguistics (2011), pp. 288–298
E. Stamatatos, N. Fakotakis, G. Kokkinakis, Automatic authorship attribution, in Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics, Association for Computational Linguistics (1999), pp. 158–164
H. Baayen, H. Van Halteren, F. Tweedie, Outside the cave of shadows: using syntactic annotation to enhance authorship attribution. Lit. Linguist. Comput. 11(3), 121–132 (1996)
V.J. Rideout, E.A. Vandewater, E.A. Wartella, Zero to six: electronic media in the lives of infants, toddlers and preschoolers (2003)
H. Chen, X. Li, Z. Huang, Link prediction approach to collaborative filtering, in Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries, 2005, JCDL’05 (IEEE, New York, 2005), pp. 141–142
J. Kamps, The impact of author ranking in a library catalogue, in Proceedings of the 4th ACM Workshop on Online Books, Complementary Social Media and Crowdsourcing (ACM, New York, 2011), pp. 35–40
P.C. Vaz, D. Martins de Matos, B. Martins, P. Calado, Improving a hybrid literary book recommendation system through author ranking, in Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries (ACM, New York, 2012), pp. 387–388
Z. Zhu, J.Y. Wang, Book recommendation service by improved association rule mining algorithm, in 2007 International Conference on Machine Learning and Cybernetics, vol. 7 (IEEE, New York, 2007), pp. 3864–3869
P.C. Vaz, D. Martins de Matos, B. Martins, Stylometric relevance-feedback towards a hybrid book recommendation algorithm, in Proceedings of the fifth ACM Workshop on Research Advances in Large Digital Book Repositories and Complementary Media (ACM, New York, 2012), pp. 13–16
X. Yang, H. Zeng, Y. Huang, Artmap-based data mining approach and its application to library book recommendation, in 2009 International Symposium on Intelligent Ubiquitous Computing and Education (IEEE, New York, 2009), pp. 26–29
S. Givon, V. Lavrenko, Predicting social-tags for cold start book recommendations, in Proceedings of the Third ACM Conference on Recommender Systems (ACM, New York, 2009), pp. 333–336
M. Zhou, Book recommendation based on web social network, in International Conference on Artificial Intelligence and Education (ICAIE) (IEEE, New York, 2010), pp. 136–139
M.S. Pera, Y.K. Ng, What to read next?: making personalized book recommendations for k-12 users, in Proceedings of the 7th ACM Conference on Recommender Systems (ACM, New York, 2013), pp. 113–120
M.S. Pera, Y.K. Ng, Automating readers’ advisory to make book recommendations for k-12 readers, in Proceedings of the 8th ACM Conference on Recommender Systems (ACM, New York, 2014), pp. 9–16
M.S. Pera, Y.K. Ng, Analyzing book-related features to recommend books for emergent readers, in Proceedings of the 26th ACM Conference on Hypertext & Social Media (ACM, New York, 2015), pp. 221–230
S. Dimitrov, F. Zamal, A. Piper, D. Ruths, Goodreads vs Amazon: the effect of decoupling book reviewing and book selling, in Proceedings of ICWSM ’15 (2015)
A. Worrall, “Back onto the tracks”: convergent community boundaries in LibraryThing and Goodreads, in 9th Annual Social Informatics Research Symposium (2013)
M. Thelwall, K. Kousha, Goodreads: a social network site for book readers. J. Assoc. Inf. Sci. Technol. 68(4), 972–983 (2017)
M. Thelwall, Book genre and author gender: Romance > paranormal-romance to autobiography > memoir. J. Assoc. Inf. Sci. Technol. 68(5), 1212–1223 (2017)
S. Rose, D. Engel, N. Cramer, W. Cowley, Automatic keyword extraction from individual documents, in Text Mining (2010), pp. 1–20
L. Deng, J. Wiebe, MPQA 3.0: an entity/event-level sentiment corpus, in Conference of the North American Chapter of the Association of Computational Linguistics: Human Language Technologies (2015)
J.W. Pennebaker, M.E. Francis, R.J. Booth, Linguistic Inquiry and Word Count (Lawrence Erlbaum Associates, Mahwah, 2001)
M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I.H. Witten, The WEKA data mining software: an update. SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Maity, S.K., Panigrahi, A., Mukherjee, A. (2019). Analyzing Social Book Reading Behavior on Goodreads and How It Predicts Amazon Best Sellers. In: Kaya, M., Alhajj, R. (eds) Influence and Behavior Analysis in Social Networks and Social Media. ASONAM 2018. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-02592-2_11
DOI: https://doi.org/10.1007/978-3-030-02592-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02591-5
Online ISBN: 978-3-030-02592-2
eBook Packages: Social Sciences, Social Sciences (R0)