ABSTRACT
When students write collaboratively with cloud-based writing tools such as Google Docs, the tools record unprecedented data about how the writing progresses. These data can be used to gain insight into how learners' collaborative activities, ideas and concepts develop during the writing process. Ultimately, they can also be used to provide support that improves both the quality of the written documents and the writing skills of the learners involved. In this paper, we propose three visualisation approaches, and their underlying techniques, for analysing the writing processes behind a document written by a group of authors: (1) the revision map, which summarises the text edits made at the paragraph level over the time of writing; (2) the topic evolution chart, which uses probabilistic topic models, specifically Latent Dirichlet Allocation (LDA) and its extension DiffLDA, to extract topics and follow their evolution during the writing process; and (3) the topic-based collaboration network, which allows a deeper analysis of topics in relation to author contribution and collaboration, using our novel algorithm DiffATM in conjunction with a DiffLDA-related technique. We evaluate these models to examine whether the automatically discovered topics accurately describe the evolution of the writing process. We illustrate how the visualisations are used with real documents written by groups of graduate students.
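To make the idea behind the revision map concrete, the following is a minimal sketch of paragraph-level edit labelling across successive revisions. It is not the authors' implementation: it assumes each revision is already available as a list of paragraph strings, it uses Python's standard `difflib` as a stand-in for whatever diffing the paper applies, and the function name `revision_map` and the labels `unchanged`, `modified` and `added` are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def revision_map(revisions):
    """Label every paragraph of each revision relative to the previous one.

    `revisions` is a list of revisions in chronological order; each revision
    is a list of paragraph strings (an assumption of this sketch).
    Returns one row of labels per consecutive pair of revisions.
    """
    rows = []
    for old, new in zip(revisions, revisions[1:]):
        labels = []
        # Compare the two revisions paragraph by paragraph.
        matcher = SequenceMatcher(a=old, b=new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                labels += ["unchanged"] * (j2 - j1)
            elif tag == "replace":
                # Pair old and new paragraphs one to one; any surplus new
                # paragraphs are treated as additions.
                paired = min(i2 - i1, j2 - j1)
                labels += ["modified"] * paired
                labels += ["added"] * ((j2 - j1) - paired)
            elif tag == "insert":
                labels += ["added"] * (j2 - j1)
            # "delete" leaves no paragraph in the newer revision to label.
        rows.append(labels)
    return rows
```

Each returned row corresponds to one step in time and each entry to one paragraph, so the rows could be drawn as a grid with one colour per label. Deleted paragraphs are not shown in this sketch, since they have no position in the newer revision.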
Index Terms
- Analysis of collaborative writing processes using revision maps and probabilistic topic models