AnchorViz: Facilitating Semantic Data Exploration and Concept Discovery for Interactive Machine Learning

Published: 09 August 2019


When building a classifier in interactive machine learning (iML), human knowledge about the target class can be a powerful reference to make the classifier robust to unseen items. The main challenge lies in finding unlabeled items that can either help discover or refine concepts for which the current classifier has no corresponding features (i.e., it has feature blindness). Yet it is unrealistic to ask humans to come up with an exhaustive list of items, especially for rare concepts that are hard to recall. This article presents AnchorViz, an interactive visualization that facilitates the discovery of prediction errors and previously unseen concepts through human-driven semantic data exploration. By creating example-based or dictionary-based anchors representing concepts, users create a topology that (a) spreads data items based on their similarity to the concepts and (b) surfaces prediction and label inconsistencies between semantically related data points. Once such inconsistencies and errors are discovered, users can encode the new information as labels or features and interact with the retrained classifier to validate their actions in an iterative loop. We evaluated AnchorViz through two user studies. Our results show that AnchorViz helps users discover more prediction errors than stratified random and uncertainty sampling methods. Furthermore, during the beginning stages of a training task, an iML tool with AnchorViz can help users build classifiers comparable to the ones built with the same tool with uncertainty sampling and keyword search, but with fewer labels and more generalizable features. We discuss exploration strategies observed during the two studies and how AnchorViz supports discovering, labeling, and refining concepts through a sensemaking loop.
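
The anchor idea described above can be illustrated with a small sketch. It is not the article's implementation: the bag-of-words features, the cosine similarity, the placement of anchors on a unit circle, and every function name (bag_of_words, dictionary_anchor, example_anchor, layout, inconsistencies) are assumptions made for illustration only. Each item is placed at the similarity-weighted average of the anchor positions, and a labeled item is paired with any nearby item whose prediction disagrees with that label.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a stand-in for whatever features a real tool uses."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dictionary_anchor(keywords):
    """Dictionary-based anchor: a concept described by a list of keywords."""
    return Counter(k.lower() for k in keywords)


def example_anchor(examples):
    """Example-based anchor: pooled token counts of the chosen example items."""
    pooled = Counter()
    for text in examples:
        pooled.update(bag_of_words(text))
    return pooled


def layout(items, anchors):
    """Place each item at the similarity-weighted mean of the anchor positions.

    Anchors sit evenly spaced on the unit circle (an assumed layout).
    Items similar to no anchor fall at the center.
    """
    n = len(anchors)
    anchor_xy = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
                 for i in range(n)]
    positions = []
    for text in items:
        vec = bag_of_words(text)
        sims = [cosine(vec, anchor) for anchor in anchors]
        total = sum(sims)
        if total == 0:
            positions.append((0.0, 0.0))
            continue
        x = sum(s * ax for s, (ax, _) in zip(sims, anchor_xy)) / total
        y = sum(s * ay for s, (_, ay) in zip(sims, anchor_xy)) / total
        positions.append((x, y))
    return positions


def inconsistencies(positions, labels, predictions, radius=0.2, threshold=0.5):
    """Return (labeled_index, neighbor_index) pairs that disagree.

    labels maps item index -> True/False for the items a user has labeled.
    predictions holds one classifier score in [0, 1] per item.
    A pair is reported when the neighbor lies within `radius` of the labeled
    item and its thresholded prediction differs from that item's label.
    """
    pairs = []
    for i, label in sorted(labels.items()):
        for j, score in enumerate(predictions):
            if j == i:
                continue
            if math.dist(positions[i], positions[j]) <= radius and (score >= threshold) != label:
                pairs.append((i, j))
    return pairs
```

In such a sketch, items that land close together share similarity to the same concepts, so a disagreement between a nearby label and prediction is a candidate prediction error worth inspecting. The radius and threshold defaults are arbitrary choices, not values from the article.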


    Published In

    ACM Transactions on Interactive Intelligent Systems  Volume 10, Issue 1
    Special Issue on IUI 2018
    March 2020
    347 pages
    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 09 August 2019
    Accepted: 01 November 2018
    Revised: 01 September 2018
    Received: 01 May 2018


    Author Tags

    1. Interactive machine learning
    2. concept discovery
    3. error discovery
    4. machine teaching
    5. semantic data exploration
    6. unlabeled data
    7. visualization


