Data Summarization Model for User Action Log Files

  • Conference paper
Computational Science and Its Applications – ICCSA 2012 (ICCSA 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7335)

Abstract

In recent years there has been impressive growth and diffusion of applications shared and used by huge numbers of users around the world, such as social networks, web portals, and e-learning platforms. Such systems generally produce a large amount of data, normally stored in raw format in log file systems and databases. To prevent unmanageable growth of the space needed to store the data, and the resulting breakdown of data usability, the data can be condensed and summarized to improve reporting performance and reduce system load. Summarization reduces the space required to store software data but, as a side effect, decreases its informative capability because some information is lost. This work studies the problem of summarizing data obtained from the log systems of applications with many users. In particular, it proposes a model that represents the raw data as temporal events collected in time sequences, provides methods that reduce the data size by collapsing the descriptions of several events into a single descriptor or a smaller set of descriptors, and poses the optimal summarization problem.
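The idea of collapsing several event descriptions into one descriptor can be illustrated with a minimal sketch. This is not the model proposed in the paper, whose details are not given in the abstract; it assumes a simple fixed-width time window as the grouping rule, and the names `Event`, `Summary`, `summarize`, and the sample log are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One raw log entry (hypothetical shape): when, who, and what."""
    timestamp: int  # seconds since an arbitrary start, an assumed unit
    user: str
    action: str


@dataclass(frozen=True)
class Summary:
    """A single descriptor standing in for all events in one time window."""
    window_start: int
    window_end: int        # exclusive upper bound
    action_counts: tuple   # sorted (action, count) pairs
    distinct_users: int


def summarize(events, window):
    """Collapse events into one Summary per non-empty window of `window` seconds.

    Assumption: fixed-width windows are used purely for illustration.
    """
    buckets = {}
    for e in events:
        start = (e.timestamp // window) * window
        buckets.setdefault(start, []).append(e)

    summaries = []
    for start in sorted(buckets):
        group = buckets[start]
        counts = Counter(e.action for e in group)
        summaries.append(Summary(
            window_start=start,
            window_end=start + window,
            action_counts=tuple(sorted(counts.items())),
            distinct_users=len({e.user for e in group}),
        ))
    return summaries


# Invented sample data: six raw events.
log = [
    Event(0, "alice", "login"),
    Event(12, "alice", "view"),
    Event(47, "bob", "login"),
    Event(61, "bob", "view"),
    Event(75, "alice", "view"),
    Event(190, "alice", "logout"),
]

summaries = summarize(log, window=60)
for s in summaries:
    print(s)
```

Here six raw events become three descriptors. The exact timestamps and the link between each user and each action are no longer recoverable, which is the kind of information loss the abstract refers to; a wider window would save more space at the cost of more loss.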

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gentili, E., Milani, A., Poggioni, V. (2012). Data Summarization Model for User Action Log Files. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2012. ICCSA 2012. Lecture Notes in Computer Science, vol 7335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31137-6_41

  • DOI: https://doi.org/10.1007/978-3-642-31137-6_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31136-9

  • Online ISBN: 978-3-642-31137-6

  • eBook Packages: Computer Science, Computer Science (R0)
