Research article · DOI: 10.1145/3025453.3025868

TagRefinery: A Visual Tool for Tag Wrangling

Published: 02 May 2017

Abstract

We present TagRefinery, an interactive visual application that aids the cleaning and processing of open tag spaces, such as those in Last.fm or YouTube. Our pre-design analysis showed a need to support a spectrum of user expertise from novice to advanced, which resulted in two distinct interface modes. Summative evaluations of TagRefinery showed that it could effectively guide novice users through the workflow by giving them brief but helpful explanations of why each step was required, and by providing visual and statistical aids to help them make important decisions. Our more expert users, in turn, greatly appreciated the control and granularity over the workflow that the advanced interface mode offered. Both the underlying tag cleaning workflow and the interface were designed iteratively in a participatory design process, in collaboration with research on a music recommendation interface based on Last.fm tags.

Supplementary Material

ZIP File (pn3128-file4.zip)
suppl.mov (pn3128-file3.mp4)
Supplemental video
MP4 File (p2928-kralj.mp4)

Cited By

  • (2021) Manual and Automatic Methods for User Needs Detection in Requirements Engineering: Key Concepts and Challenges. 2021 International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), 1-7. DOI: 10.1109/ICECCME52200.2021.9591046. Online publication date: 7-Oct-2021

Published In

CHI '17: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems
May 2017
7138 pages
ISBN:9781450346559
DOI:10.1145/3025453

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data cleaning
  2. data wrangling
  3. folksonomy
  4. graphical interface
  5. social tags
  6. user centred design
  7. visual data analysis
