Data Conflicts

Synonyms

Data problems; Data quality problems; Data anomalies; Data inconsistencies; Data errors

Definition

Data conflicts are deviations between data items that are intended to capture the same state of a real-world entity. Data containing conflicts are often called “dirty” data and can mislead any analysis performed on them. When data conflicts occur, data cleaning is needed to improve data quality and to avoid incorrect analysis results. Understanding the different kinds of data conflicts and their characteristics makes it possible to develop corresponding data cleaning techniques.
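To make the definition concrete, the following minimal sketch compares two records that are assumed to describe the same real-world entity and reports the attributes on which they deviate. The function name `find_conflicts`, the record layout, and the sample values are hypothetical and chosen only for illustration; real data cleaning would also have to handle missing attributes, differing formats, and approximate matches.

```python
from typing import Any


def find_conflicts(
    record_a: dict[str, Any], record_b: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return the shared attributes on which two records disagree.

    Both records are assumed to describe the same real-world entity.
    Attributes present in only one record are ignored in this sketch.
    """
    shared = record_a.keys() & record_b.keys()
    return {
        attr: (record_a[attr], record_b[attr])
        for attr in sorted(shared)
        if record_a[attr] != record_b[attr]
    }


# Two hypothetical records intended to capture the same person.
hospital = {"name": "Jane Doe", "birth_date": "1970-03-12", "city": "Boston"}
registry = {"name": "Jane Doe", "birth_date": "1970-12-03", "city": "Boston"}

conflicts = find_conflicts(hospital, registry)
print(conflicts)  # {'birth_date': ('1970-03-12', '1970-12-03')}
```

An exact equality test is the simplest possible notion of deviation; it flags the conflict but does not say which value is correct, which is the task of the cleaning step.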

Historical Background

Statisticians were probably the first to face data conflicts on a large scale. Early applications that required intensive resolution of data conflicts were statistical surveys in governmental administration, public health, and scientific experiments. As early as 1946, Halbert L. Dunn observed the problem of duplicates in data records of a person’s life captured at different places...

Recommended Reading

  1. Barateiro J. and Galhardas H. A survey of data quality tools. Datenbank-Spektrum, 14:15–21, 2005.

  2. Batini C. and Scannapieco M. Data Quality – Concepts, Methodologies and Techniques. Springer, Berlin, 2006.

  3. Dunn H.L. Record linkage. Am. J. Public Health, 36(12):1412–1416, 1946.

  4. Elmagarmid A.K., Ipeirotis P.G., and Verykios V.S. Duplicate record detection – a survey. IEEE Trans. Knowl. Data Eng., 19(1):1–16, 2007.

  5. Fellegi I.P. and Sunter A.B. A theory for record linkage. J. Am. Stat. Assoc., 64(328):1183–1210, 1969.

  6. Kim W., Choi B.-J., Kim S.-K., and Lee D. A taxonomy of dirty data. Data Mining Knowl. Discov., 7(1):81–99, 2003.

  7. Rahm E. and Do H.-H. Data cleaning – problems and current approaches. IEEE Techn. Bull. Data Eng., 23(4):3–13, 2000.

Copyright information

© 2009 Springer Science+Business Media, LLC

Cite this entry

Do, H.-H. (2009). Data Conflicts. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_97
