The Data Webhouse Toolkit: Building the Web‐enabled Data Warehouse

Industrial Management & Data Systems

ISSN: 0263-5577

Article publication date: 1 November 2000

Citation

Kimball, R. and Merz, R. (2000), "The Data Webhouse Toolkit: Building the Web‐enabled Data Warehouse", Industrial Management & Data Systems, Vol. 100 No. 8, pp. 406-408. https://doi.org/10.1108/imds.2000.100.8.406.1

Publisher: Emerald Group Publishing Limited


The most popular commercial Web sites handle more than 100 million hits daily, making the Web the single most important factor in today’s commerce revolution. These sites can track the behavioural information of customers that marketing analysts would die for. Sadly, many companies fail to take advantage of this valuable information because they cannot cope with such a huge amount of data. Enter the data warehouse – the perfect solution to the Web data deluge problem – which, if designed and deployed correctly, can become the linchpin of the modern customer‐focused, Web‐based company.

The Data Webhouse Toolkit provides detailed design techniques for creating powerful Web‐enabled data warehouses – what Kimball has termed Webhouses. The Web revolution has propelled the data warehouse onto the main stage, because, in many situations, the data warehouse controls and analyses the Web experience. IT professionals are being asked to seamlessly publish all sorts of information through Web browser interfaces to a growing audience that includes not only internal management, but customers, partners, and a large pool of internal employees. And, because Web technology makes it possible to record nearly every gesture made by individuals when they are touring remote Web sites, IT departments are responsible for bringing this raw behavioural information to a serious database for analysis.

The Data Webhouse Toolkit is divided into two parts that reflect the two personalities of the data Webhouse. The first part describes bringing the Web to the warehouse. Because the Web itself is an immense source of undisciplined information, this clickstream data must be tamed and understood so it can be sorted in the data Webhouse and used effectively. Part 1 goes on to cover the process of tracking Web site user actions by discussing the categories of natural Web behaviour, including searching, browsing, work, education, communication, shopping, entertainment, and downloading. Since Web data sources often reveal the pathways by which the user entered a given Web page, this part dissects the various types of ISPs, portals, bookmarks, referrals, and clickthroughs. It talks in detail about tracking the route a user takes across the Web and tackles the major question of whether a user can be identified.
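As a rough illustration of what reading an entry pathway from clickstream data can involve, the sketch below parses a single Web server log line and labels how the visitor arrived. It is not taken from the book: the Combined Log Format input, the names `parse_hit` and `classify_entry`, the three labels, and the sample hosts are all assumptions made for this example.

```python
import re
from urllib.parse import urlparse

# Assumed input: Combined Log Format, i.e.
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def classify_entry(referrer, site_host):
    """Label a hit by its referrer (hypothetical three-way scheme)."""
    if referrer in ("", "-"):
        # No referrer recorded: typed URL, bookmark, or a stripped header.
        return "direct"
    ref_host = urlparse(referrer).hostname or ""
    if ref_host == site_host:
        # The visitor followed a link from within the same site.
        return "internal"
    # Any other host: a portal, search engine, or other referring site.
    return "referral"


def parse_hit(line, site_host):
    """Return a dict describing one hit, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    hit["entry_type"] = classify_entry(hit["referrer"], site_host)
    return hit


# Made-up sample line using reserved example addresses and hosts.
SAMPLE = (
    '192.0.2.1 - - [01/Nov/2000:10:00:00 +0000] '
    '"GET /products.html HTTP/1.0" 200 2326 '
    '"http://portal.example/search?q=widgets" "Mozilla/4.0"'
)

if __name__ == "__main__":
    print(parse_hit(SAMPLE, "www.example.com"))
```

A missing referrer cannot distinguish a bookmark from a typed address, which hints at why identifying users and their routes is treated as a major question.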

Other topics included in part 1 are:

  • Understanding the clickstream as a data source.

  • Designing the Web site to support warehousing.

  • Building clickstream data marts.

  • Assembling clickstream value chains.

  • Implementing the clickstream post‐processor.

The second half of The Data Webhouse Toolkit is about bringing the data warehouse to the Web, and it treats the design of effective Web sites as one of the most exciting areas in all of computing. It presents state‐of‐the‐art Web site design and shows both how design feeds the Webhouse and how the Webhouse gives the designer the evidence to improve the site. It also covers:

  • Driving data mining from the Webhouse.

  • Creating an international Webhouse.

  • Securing the Webhouse.

  • Scaling the Webhouse.

  • Managing the Webhouse project.

The Web and the warehouse are being drawn together like two powerful magnets. The Web needs the warehouse for its customer‐focused functions, and the warehouse is being transformed by the demands of the Web. Together they drive an understanding of customer behaviour and form an adaptive and resilient source of information, the foundation for Web‐enabled decision‐making. The Data Webhouse Toolkit clearly shows that the impact of the Web is so profound that it is much more than an “application” – it is a new environment.
