The classification problem with semantically heterogeneous data

In: Statistical and Scientific Database Management (SSDBM 1988)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 339)

Abstract

Suppose a database is fed by two alternative data sources whose classification criteria are similar but not identical. If the semantic connection between the two classification systems can be stated precisely, new and more detailed summary data can be derived. Whether a given aggregate is derivable is therefore a fundamental question for a query-processing system. We state a necessary and sufficient condition that yields a simple procedure for deciding whether a summary query is answerable and, if so, for evaluating it. Surprisingly, derivability is independent of the database instance and depends only on the topological properties of the graph modelling the semantic connection between the classification systems adopted.
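
The abstract does not spell out the condition itself, so the following is only an illustrative sketch of how a purely graph-based derivability test can work, not the chapter's procedure. It assumes (our assumption) that the connection between the two systems is a bipartite graph with an edge wherever a category of source A can overlap a category of source B, and that only the per-category totals of each source are known. If that graph has no cycles, every overlap count can be recovered by repeatedly resolving a category that has a single remaining overlap; on a cycle, the totals alone do not in general pin the counts down. The function name derive_cells and the category labels are hypothetical.

```python
from collections import defaultdict

def derive_cells(edges, totals):
    """Resolve overlap counts by 'leaf peeling' on the overlap graph.

    edges  -- iterable of (a, b) pairs: category a of source A may overlap
              category b of source B (labels must be distinct across sources)
    totals -- dict mapping every category label to its known total
    Returns a dict {(a, b): count} for every overlap this peeling resolves.
    """
    edge_set = set(edges)
    adj = defaultdict(set)
    for a, b in edge_set:
        adj[a].add(b)
        adj[b].add(a)

    residual = dict(totals)   # part of each total not yet assigned to a cell
    derived = {}
    leaves = [n for n in adj if len(adj[n]) == 1]

    while leaves:
        n = leaves.pop()
        if len(adj[n]) != 1:  # already resolved from the other side
            continue
        (m,) = adj[n]
        value = residual[n]   # a leaf's only cell carries all that remains
        key = (n, m) if (n, m) in edge_set else (m, n)
        derived[key] = value
        residual[n] = 0
        residual[m] -= value
        adj[n].discard(m)
        adj[m].discard(n)
        if len(adj[m]) == 1:
            leaves.append(m)
    return derived

# Acyclic overlap graph (a path B1 - A1 - B2 - A2 - B3): every cell is derived.
tree_edges = [("A1", "B1"), ("A1", "B2"), ("A2", "B2"), ("A2", "B3")]
tree_totals = {"A1": 10, "A2": 7, "B1": 4, "B2": 8, "B3": 5}
tree_cells = derive_cells(tree_edges, tree_totals)

# Cyclic overlap graph: there is no leaf to start from, so nothing is derived.
cycle_edges = [("A1", "B1"), ("A1", "B2"), ("A2", "B1"), ("A2", "B2")]
cycle_totals = {"A1": 10, "A2": 7, "B1": 9, "B2": 8}
cycle_cells = derive_cells(cycle_edges, cycle_totals)
```

Whether anything is derived in this sketch depends only on the shape of the edge list, never on the totals, which mirrors the instance-independence the abstract describes. The peeling is deliberately simple: it resolves every cell of an acyclic graph but makes no attempt to handle cells that lie outside cycles in a graph that also contains cycles.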

Editor information

Maurizio Rafanelli, John C. Klensin, Per Svensson

Copyright information

© 1989 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Malvestuto, F.M., Zuffada, C. (1989). The classification problem with semantically heterogeneous data. In: Rafanelli, M., Klensin, J.C., Svensson, P. (eds) Statistical and Scientific Database Management. SSDBM 1988. Lecture Notes in Computer Science, vol 339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0027512

  • DOI: https://doi.org/10.1007/BFb0027512

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-50575-4

  • Online ISBN: 978-3-540-46045-9

  • eBook Packages: Springer Book Archive
