TSAR: a Time Series Assisted Relabeling Tool for Reducing Label Noise

Authors: Gentry Atkinson, Vangelis Metsis

PETRA '21: Proceedings of the 14th PErvasive Technologies Related to Assistive Environments Conference

Pages 203 - 209

https://doi.org/10.1145/3453892.3453900

Published: 29 June 2021

Abstract

Accurately detecting mislabeled instances in a dataset is a difficult problem with several imperfect solutions. Hand-reviewing labels is reliable but expensive. Time series datasets present additional challenges because reviewers cannot interpret them as easily as other kinds of data. This paper introduces TSAR, a system that facilitates human review of the small portion of a dataset that it identifies as most likely to be mislabeled. TSAR’s use is demonstrated on real-world time series data.

Published In

PETRA '21: Proceedings of the 14th PErvasive Technologies Related to Assistive Environments Conference

June 2021, 593 pages

ISBN: 9781450387927

DOI: 10.1145/3453892

Conference Chair: Fillia Makedon

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

PETRA '21: The 14th PErvasive Technologies Related to Assistive Environments Conference

June 29 - July 2, 2021

Corfu, Greece
