
Conceptual Analysis of Big Data Using Ontologies and EER

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9432)

Abstract

Large amounts of “big data” are generated every day, much of it in a “raw” format that is difficult to analyze and mine. This data potentially contains hidden, meaningful concepts, but much of it is superfluous and of no interest to domain experts. Thus, handling big raw data solely by applying distributed computing technologies (e.g., MapReduce, BSP [Bulk Synchronous Parallel], and Spark) and/or distributed storage systems such as NoSQL stores is generally not sufficient. To enable efficient analysis and mining, the full knowledge hidden in the raw data must first be extracted: the data needs to be processed to remove the superfluous parts and to generate meaningful domain-specific concepts. In this paper, we propose a framework that incorporates conceptual modeling and EER (enhanced entity-relationship) principles to effectively extract conceptual knowledge from raw data, so that mining and analysis can be applied to the extracted conceptual data.
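
The idea of removing superfluous raw records and generating domain-specific concepts can be illustrated with a minimal sketch. It is not the framework proposed in the paper: the rainfall domain, the `RawReading` layout, the `Storm` concept, and the `max_gap_hours` threshold are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical raw record: (site_id, hour_index, rainfall_mm).
RawReading = Tuple[str, int, float]


@dataclass
class Storm:
    """A hypothetical domain concept derived from raw readings at one site."""
    site_id: str
    start_hour: int
    end_hour: int
    total_mm: float


def extract_storms(readings: Iterable[RawReading],
                   max_gap_hours: int = 1) -> List[Storm]:
    """Drop dry readings and group the remaining wet hours into storms.

    Wet hours at the same site are merged into one storm when they are
    separated by at most `max_gap_hours` dry hours (an assumed threshold).
    """
    storms: List[Storm] = []
    latest: Dict[str, Storm] = {}  # site_id -> most recent storm at that site
    for site, hour, mm in sorted(readings, key=lambda r: (r[0], r[1])):
        if mm <= 0.0:
            continue  # superfluous for this concept: no rain was recorded
        storm = latest.get(site)
        if storm is not None and hour - storm.end_hour - 1 <= max_gap_hours:
            # Close enough to the previous wet hour: extend the same storm.
            storm.end_hour = hour
            storm.total_mm += mm
        else:
            # Gap too long (or first wet hour at this site): start a new storm.
            storm = Storm(site, hour, hour, mm)
            storms.append(storm)
            latest[site] = storm
    return storms
```

Calling `extract_storms` on a list of hourly readings returns one `Storm` object per wet period and site, so that later analysis can operate on a few conceptual entities rather than on every raw reading.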

Corresponding author

Correspondence to Kulsawasd Jitkajornwanich.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Jitkajornwanich, K., Elmasri, R. (2015). Conceptual Analysis of Big Data Using Ontologies and EER. In: Pardalos, P., Pavone, M., Farinella, G., Cutello, V. (eds) Machine Learning, Optimization, and Big Data. MOD 2015. Lecture Notes in Computer Science, vol 9432. Springer, Cham. https://doi.org/10.1007/978-3-319-27926-8_27

  • DOI: https://doi.org/10.1007/978-3-319-27926-8_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27925-1

  • Online ISBN: 978-3-319-27926-8

  • eBook Packages: Computer Science, Computer Science (R0)
