DOI: 10.1145/1958746.1958797

Improving the efficiency of information collection and analysis in widely-used IT applications

Published: 30 September 2011

Abstract

Modern IT environments collect and analyze increasingly large volumes of data for a growing number of purposes (e.g., automated management, security, and regulatory compliance). At the same time, such environments are under pressure to minimize their environmental footprints. A general approach to this problem is to use IT resources more efficiently.
This paper describes our work to systematically evaluate inefficiencies in the information collection and analysis of several widely used IT applications, to implement a more efficient solution, and to quantify the improvements. In particular, we convert the logging of HTTP transactions by the Apache Web server and of network events by the Bro intrusion detection system from text files to DataSeries. We thoroughly evaluate and compare the costs of recording, storing, and analyzing the information in the different formats. We converted the text logs to DataSeries online, with no discernible overhead on the logging applications. We achieved up to a 7x decrease in log file sizes relative to the default text logs, and speedups of 3x to 8.4x in analyzing the log files.
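As a rough illustration of why a typed, structured format can be smaller than a text log, the sketch below packs Apache Common Log Format lines into fixed-width binary records. It is not the DataSeries format or API: the record layout, the `encode` helper, the method table, and the sample lines are all hypothetical, and the parser assumes numeric IPv4 client addresses.

```python
import re
import struct
from datetime import datetime

# Apache Common Log Format: host ident user [time] "request" status bytes
CLF = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

# Hypothetical record layout (an assumption, not the DataSeries layout):
# IPv4 address (4 bytes), Unix time (4), status (2), response size (4),
# method id (1), path id (4), all in network byte order without padding.
RECORD = struct.Struct("!4sIHIBI")
METHODS = {"GET": 0, "POST": 1, "HEAD": 2}  # 255 is used for anything else


def encode(lines):
    """Pack text log lines into binary records plus a path lookup table."""
    paths = {}  # path string -> small integer id (dictionary encoding)
    out = bytearray()
    for line in lines:
        m = CLF.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        ts = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        ip = bytes(int(octet) for octet in m["host"].split("."))
        size = 0 if m["size"] == "-" else int(m["size"])
        path_id = paths.setdefault(m["path"], len(paths))
        out += RECORD.pack(ip, int(ts.timestamp()), int(m["status"]),
                           size, METHODS.get(m["method"], 255), path_id)
    return bytes(out), paths


sample = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:55:37 -0700] "GET /index.html HTTP/1.0" 200 2326',
]
binary, paths = encode(sample)
text_bytes = sum(len(s) + 1 for s in sample)  # +1 for each newline
print("text bytes:", text_bytes, "binary bytes:", len(binary))
```

The byte counts printed here exclude the path table, which a real format would also have to store, and the sketch drops fields (ident, user, protocol version) that a faithful conversion would keep. It only shows the general idea that typed fields and repeated-string lookup tables avoid much of the redundancy of text.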


Published In

ICPE '11: Proceedings of the 2nd ACM/SPEC International Conference on Performance engineering
March 2011
470 pages
ISBN: 9781450305198
DOI: 10.1145/1958746

Publisher

Association for Computing Machinery, New York, NY, United States


Conference

ICPE '11

Overall Acceptance Rate: 252 of 851 submissions, 30%
