Data stream mining to address big data problems | IEEE Conference Publication | IEEE Xplore

Data stream mining to address big data problems


Abstract:

Today, the IT world is trying to cope with “big data” problems (data volume, velocity, variety, veracity) on the path to obtaining useful information. In this paper, we p...Show More

Abstract:

Today, the IT world is trying to cope with “big data” problems (data volume, velocity, variety, veracity) on the path to obtaining useful information. In this paper, we present implementation details and performance results of realizing “online” Association Rule Mining (ARM) over big data streams for the first time in the literature. Specifically, we added Apriori and FP-Growth algorithms for stream mining inside an event processing engine, called Esper. Using the system, these two algorithms were compared over LastFM social music site data and by using tumbling windows. The better-performing FP-Growth was selected and used in creation of a real-time rule-based recommendation engine. Our most important findings show that online association rule mining can generate (1) more rules, (2) much faster and more efficiently, and (3) much sooner than offline rule mining. In addition, we have found many interesting and realistic musical preference rules such as “George Harrison⇒Beatles”. We hope that our findings can shed light on the design and implementation of other big data analytics systems in the future.
Date of Conference: 24-26 April 2013
Date Added to IEEE Xplore: 13 June 2013
ISBN Information:
Conference Location: Haspolat, Turkey

References

References is not available for this document.