Article

Semi-supervised outlier detection

Authors:
Jing Gao, Michigan State University, East Lansing, MI
Haibin Cheng, Michigan State University, East Lansing, MI
Pang-Ning Tan, Michigan State University, East Lansing, MI

SAC '06: Proceedings of the 2006 ACM symposium on Applied computing
April 2006, Pages 635–636
https://doi.org/10.1145/1141277.1141421

Published: 23 April 2006

ABSTRACT

Outlier detection has been extensively researched in the context of unsupervised learning. However, the results of unsupervised methods are not always satisfactory, and they can be significantly improved by supervision from a small number of labeled points. In this paper, we use a limited amount of label information to detect outliers more accurately. The key to our approach is an objective function that penalizes poor clustering results and deviation from known labels, and that restricts the number of outliers. The outliers are found as a solution to the discrete optimization problem defined by this objective function. In this way, the method can detect meaningful outliers that cannot be identified by existing unsupervised methods.
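
The abstract names three ingredients of the objective function (a clustering term, a label-deviation term, and a limit on the number of outliers) but does not give its formula. The sketch below is only one plausible reading, not the authors' formulation: it assumes a k-means-style squared-distance clustering cost, a fixed penalty per flagged outlier, and a fixed penalty per violated label, and it minimizes the sum with a simple alternating heuristic. The names `detect_outliers`, `objective`, `gamma_outlier`, `gamma_label`, and the requirement that the caller supplies `init_centroids` are all hypothetical choices made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(points, labels, assign, centroids, gamma_outlier, gamma_label):
    """Assumed objective: clustering cost + outlier-count cost + label-deviation cost.

    assign[i] is a cluster index, or -1 if point i is flagged as an outlier.
    labels[i] is 1 (known outlier), 0 (known normal), or None (unlabeled).
    """
    total = 0.0
    for p, y, a in zip(points, labels, assign):
        if a == -1:
            total += gamma_outlier             # restricts the number of outliers
        else:
            total += sq_dist(p, centroids[a])  # penalizes poor clustering
        if y is not None and (a == -1) != (y == 1):
            total += gamma_label               # penalizes deviation from a known label
    return total


def detect_outliers(points, labels, init_centroids, gamma_outlier, gamma_label, n_iter=20):
    """Alternating heuristic: reassign points, then recompute centroids."""
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    assign = [-1] * len(points)
    for _ in range(n_iter):
        new_assign = []
        for p, y in zip(points, labels):
            # Nearest centroid for this point.
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            # Cost of keeping the point in cluster j versus flagging it as an outlier.
            cost_in = sq_dist(p, centroids[j]) + (gamma_label if y == 1 else 0.0)
            cost_out = gamma_outlier + (gamma_label if y == 0 else 0.0)
            new_assign.append(j if cost_in <= cost_out else -1)
        if new_assign == assign:
            break  # assignments are stable, so the centroids are too
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if the cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
              (5.0, 5.0), (0.9, 0.9), (0.2, 0.2)]
    # Point 6 is labeled normal, point 7 is labeled as an outlier.
    labels = [None, None, None, None, None, None, 0, 1]
    assign, _ = detect_outliers(points, labels, [(0.0, 0.0)], 1.0, 2.0)
    print([i for i, a in enumerate(assign) if a == -1])
```

In this toy data set the two labels change the outcome: with all labels set to `None`, the same call flags points 5 and 6, whereas with the labels it flags points 5 and 7. That is the kind of effect the abstract attributes to supervision, although the real method may weight and optimize the terms differently.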

Index Terms

Semi-supervised outlier detection
1. Information systems
  1. Information systems applications

Published in
SAC '06: Proceedings of the 2006 ACM symposium on Applied computing
April 2006
1967 pages
ISBN:1595931082
DOI:10.1145/1141277
Conference Chair:
Hisham M. Haddad
Kennesaw State University, Kennesaw, Georgia
Copyright © 2006 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
outlier detection
semi-supervised learning
Acceptance Rates
Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%