Inconsistency Knowledge Discovery for Longitudinal Data Management: A Model-Based Approach

  • Conference paper
Human-Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data (HCI-KDD 2013)

Abstract

In recent years, the growing diffusion of IT-based services has given rise to the use of huge masses of data. However, using data for analysis and decision making requires several preparatory tasks, e.g. data cleansing, filtering, aggregation and synthesis. Tools and methodologies that empower people are required to manage the high complexity of large datasets appropriately.

This paper proposes the multidimensional RDQA, an enhanced version of an existing model-based data verification technique that can be used to identify, extract, and classify data inconsistencies in longitudinal data. Specifically, it discovers fine-grained information about the data inconsistencies and uses a multidimensional visualisation technique to show them. The enhanced RDQA supports users in assessing and improving algorithms and solutions for data analysis, especially when large datasets are considered.

The proposed technique has been applied to a real-world dataset derived from the Italian labour market domain, which we have made publicly available to the community.
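
As a rough illustration of what a model-based consistency check on longitudinal data can look like, the sketch below replays each subject's time-ordered events against a small finite-state model and counts the violations per class. It is not the RDQA itself: the two states, the two event names, the transition table and the function names `find_inconsistencies` and `classify` are all hypothetical, chosen only to make the idea concrete.

```python
from collections import Counter

# Hypothetical consistency model: (current state, event) -> next state.
# Any pair missing from this table is treated as an inconsistency.
TRANSITIONS = {
    ("unemployed", "start"): "employed",
    ("employed", "cessation"): "unemployed",
}


def find_inconsistencies(events, initial_state="unemployed"):
    """Replay one subject's time-ordered events against the model.

    Returns a list of (index, state, event) tuples, one per violation.
    """
    state = initial_state
    found = []
    for index, event in enumerate(events):
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            # Record the violation and keep the current state unchanged.
            found.append((index, state, event))
        else:
            state = next_state
    return found


def classify(histories):
    """Count violations per (state, event) class across all subjects."""
    counts = Counter()
    for events in histories.values():
        for _, state, event in find_inconsistencies(events):
            counts[(state, event)] += 1
    return counts


# Made-up histories, for illustration only.
histories = {
    "w1": ["start", "cessation", "start"],
    "w2": ["start", "start", "cessation"],
    "w3": ["cessation", "start"],
}
print(classify(histories))
```

Grouping violations by the (state, event) pair in which they occur gives a simple per-class count; a real model would need many more states and domain rules than this toy table.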

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Boselli, R., Cesarini, M., Mercorio, F., Mezzanzanica, M. (2013). Inconsistency Knowledge Discovery for Longitudinal Data Management: A Model-Based Approach. In: Holzinger, A., Pasi, G. (eds) Human-Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data. HCI-KDD 2013. Lecture Notes in Computer Science, vol 7947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39146-0_17

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-39146-0_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39145-3

  • Online ISBN: 978-3-642-39146-0

  • eBook Packages: Computer Science, Computer Science (R0)
