Data stream mining has gained importance in recent years because it is indispensable for many real-world applications, such as predicting and tracking the evolution of weather phenomena, security and anomaly detection in networks, evaluating satellite data, and mining health monitoring streams. Stream mining algorithms must take into account the unique properties of stream data: infinite data, temporal ordering, concept drifts and shifts, and the demand for scalability.
Learning on streams has followed two threads thus far: mining (classification, clustering, frequent itemset discovery) and probabilistic modeling. In both threads, scholars devise solutions to the challenges listed above. Stream clustering algorithms are oriented more towards scalability and the tracing of model changes, while dynamic probabilistic modeling, e.g. dynamic topic modeling, encompasses methods that adapt seamlessly to drifts. At the same time, research on unsupervised stream learning seems to be scattered across the many application areas. Examples of areas that seem to evolve independently are sensor mining, mining on clickstreams and other logs in stream form, topic modeling on document streams, and temporal mining on data that are actually streams.
This workshop brings together scholars working in different areas of learning on streams, including sensor data and other forms of accumulating data. Most of the papers on the following pages address unsupervised learning with clustering methods. Issues addressed include the detection of outliers and anomalies, evolutionary and incremental clustering, learning in subspaces of the complete feature space, learning that exploits context, and deriving models from text streams and visualizing them.
Proceeding Downloads
Fully decentralized computation of aggregates over data streams
In several emerging applications, data is collected in massive streams at several distributed points of observation. A basic and challenging task is to allow every node to monitor a neighbourhood of interest by issuing continuous aggregate queries on ...
Detecting outliers on arbitrary data streams using anytime approaches
Data streams are gaining importance in many sensing and monitoring environments. Frequent mining tasks on data streams include classification, modeling and outlier detection. Since data arrival rates often vary, anytime algorithms have been ...
CALDS: context-aware learning from data streams
Drift detection methods in data streams can detect changes in incoming data so that learned models can be used to represent the underlying population. In many real-world scenarios context information is available and could be exploited to improve ...
Evolutionary clustering using frequent itemsets
Evolutionary clustering is an emerging research area addressing the problem of clustering dynamic data. An evolutionary clustering should take care of two conflicting criteria: preserving the current cluster quality and not deviating too much from the ...
Towards subspace clustering on dynamic data: an incremental version of PreDeCon
Today's data are high dimensional and dynamic, so clustering such data is rather complicated. To deal with the high dimensionality problem, the research area of subspace clustering has recently emerged, which aims at finding clusters in subspaces ...
Visual analysis of news streams with article threads
The analysis of large quantities of news is an emerging area in the field of data analysis and visualization. International agencies collect thousands of news items every day from a large number of sources, and making sense of them is becoming increasingly ...
Conformal prediction for distribution-independent anomaly detection in streaming vessel data
This paper presents a novel application of the theory of conformal prediction for distribution-independent on-line learning and anomaly detection. We exploit the fact that conformal predictors give valid prediction sets at specified confidence levels ...
Research issues in mining multiple data streams
There exist emerging applications of data streams that have mining requirements. Although single data stream mining has been extensively studied, little research has been done for mining multiple data streams (MDS), which are more complex than single ...
- Proceedings of the First International Workshop on Novel Data Stream Pattern Mining Techniques
Recommendations
Data Stream Mining: Challenges and Techniques
ICTAI '10: Proceedings of the 2010 22nd IEEE International Conference on Tools with Artificial Intelligence - Volume 02
Data streams are continuous flows of data. Examples of data streams include network traffic, sensor data, call center records and so on. Their sheer volume and speed pose a great challenge for the data mining community to mine them. Data streams ...
Sliding window based weighted erasable stream pattern mining for stream data applications
As one of the variations in frequent pattern mining, erasable pattern mining discovers patterns with benefits lower than or equal to a user-specified threshold from a product database. Although traditional erasable pattern mining algorithms can perform ...