Abstract
Yahoo! is on track to realize its goal of real-time enterprise-level reporting. Access to real-time reports allows executives and decision makers to program content and advertising in a way that benefits both the business and the end user. This paper describes our legacy architecture and a new, low-latency pipeline. In particular, we show that by combining novel JavaScript instrumentation techniques with an automated, standardized reporting system built on top of a near real-time inter-colo event collection mechanism, Yahoo! is nearing its real-time reporting goals.
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tully, T. (2009). A Near Real-Time Reporting System for Enterprises Using JavaScript Instrumentation with Inter-colo Event Replication. In: Castellanos, M., Dayal, U., Sellis, T. (eds) Business Intelligence for the Real-Time Enterprise. BIRTE 2008. Lecture Notes in Business Information Processing, vol 27. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03422-0_4
DOI: https://doi.org/10.1007/978-3-642-03422-0_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03421-3
Online ISBN: 978-3-642-03422-0
eBook Packages: Computer Science (R0)