Unsupervised classification and visualization of unstructured text for the support of interdisciplinary collaboration

Published: 15 February 2014


We present a computer-supported tool for cooperative work in interdisciplinary fields, which we tested in the area of astrobiology. Our document classification and visualization system is fully automated and data-driven, based on unsupervised learning algorithms and network visualization tools. To support this process, we created a new feature selection algorithm that indicates which words should be used for mutual information-based clustering. Our system can extract information about collaborations from unstructured databases with no metadata and reveals structure that can aid the planning of collaborative research. We analyzed publications produced by researchers from the NASA Astrobiology Institute. We presented this analysis as a cultural probe and recorded researchers' reactions, which indicated that our method can help scientists from different disciplines work together. We have made an interactive version of our visualization and analysis available as a website for long-term use.

    Author Tags

    1. document analysis
    2. feature selection
    3. interdisciplinary science
    4. unsupervised learning


    CSCW'14: Computer Supported Cooperative Work
    February 15 - 19, 2014
    Baltimore, Maryland, USA

