Outlier Effects on Databases

  • Conference paper
Advances in Information Systems (ADVIS 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3261)

Abstract

Real data and databases always contain some kind of heterogeneity or contamination, commonly referred to as “outliers”. Outliers are the few observations or records that appear inconsistent with the remainder of the sample and that have a disproportionate effect on predicted values. Isolated outliers may also have a positive impact on the results of data analysis and data mining. In this study, we are concerned with outliers in time series, which have two special cases: the innovational outlier (IO) and the additive outlier (AO). The occurrence of an AO indicates that action is required, possibly to adjust the measuring instrument or at least to log an error message in the database. If an IO occurs, however, no adjustment of the measurement operation is required. The study concludes with the results of a simulation and a variance analysis on the generated data sets.
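
To make the AO/IO distinction concrete, the following is a minimal sketch and not the simulation used in the paper. It assumes a first-order autoregressive process, x_t = phi * x_(t-1) + a_t, which the abstract does not specify; the function name `simulate_ar1` and the parameters `phi`, `omega`, and `outlier_at` are hypothetical. Under this assumption, an AO of size omega shifts only the single recorded observation, whereas an IO of the same size enters through the innovation a_t and decays through later observations as omega * phi^k.

```python
import random


def simulate_ar1(n, phi, seed=0, outlier_at=None, omega=0.0, kind=None):
    """Simulate an AR(1) series, optionally contaminated by one outlier.

    kind="AO": omega is added to the recorded observation at outlier_at only.
    kind="IO": omega is added to the innovation at outlier_at, so its effect
               is carried forward by the autoregressive dynamics.
    All names and the AR(1) model itself are illustrative assumptions.
    """
    rng = random.Random(seed)
    x_prev = 0.0
    series = []
    for t in range(n):
        a_t = rng.gauss(0.0, 1.0)          # innovation (random shock)
        if kind == "IO" and t == outlier_at:
            a_t += omega                   # shock enters the process itself
        x_t = phi * x_prev + a_t           # underlying process value
        y_t = x_t                          # value recorded in the database
        if kind == "AO" and t == outlier_at:
            y_t += omega                   # only the recorded value is off
        series.append(y_t)
        x_prev = x_t
    return series


# Same seed, so all three series share identical innovations.
clean = simulate_ar1(50, 0.7, seed=1)
ao = simulate_ar1(50, 0.7, seed=1, outlier_at=20, omega=5.0, kind="AO")
io = simulate_ar1(50, 0.7, seed=1, outlier_at=20, omega=5.0, kind="IO")

# Difference from the clean series isolates the effect of each outlier type.
ao_effect = [a - c for a, c in zip(ao, clean)]
io_effect = [i - c for i, c in zip(io, clean)]

for k in range(4):
    t = 20 + k
    print(f"t={t}  AO effect={ao_effect[t]:.3f}  IO effect={io_effect[t]:.3f}")
```

In this sketch the AO disturbs one record and leaves the process untouched, which matches the abstract's reading that the measurement or the stored value is at fault. The IO is a genuine shock to the process, so the later observations that carry its decaying effect are correct measurements.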


© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kaya, A. (2004). Outlier Effects on Databases. In: Yakhno, T. (eds) Advances in Information Systems. ADVIS 2004. Lecture Notes in Computer Science, vol 3261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30198-1_10

  • DOI: https://doi.org/10.1007/978-3-540-30198-1_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23478-4

  • Online ISBN: 978-3-540-30198-1

  • eBook Packages: Computer Science (R0)
