ABSTRACT
Heterogeneous human-generated data streams are measurands that offer opportunities to identify patterns, detect novelties, and explore the evolution of complex social systems. Communication technologies, with their very high penetration into society, are particularly rich sources of such information. For many observable communication channels, however, there is little or no access to the content of human-to-human communications, whereas data streams recording the intensity of such events are more commonly available. This paper presents a framework of methods for exploratory analysis and visualization of such data streams. In particular, we demonstrate how atypical activity levels can be identified by fitting a non-homogeneous Markov-modulated Poisson process and spatialising the component that corresponds to unusual bursts or lulls of activity via heat maps. The approach is illustrated with a case study analysing geo-referenced data streams of instant messaging activity on the Internet.
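The idea of separating a time-varying "normal" rate from a hidden burst component can be sketched in a few lines. The following is a simplified illustration, not the model fitted in the paper: it assumes a fixed periodic baseline estimated as the per-slot mean, a two-state hidden chain (normal/burst) with hand-picked transition probabilities, and a multiplicative rate increase during bursts. The function names and the parameters `boost`, `p_enter`, and `p_leave` are hypothetical, and lulls are not modelled. A full treatment would infer the baseline and the burst component jointly.

```python
import math

def poisson_logpmf(k, lam):
    """Log-probability of observing count k under a Poisson rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def periodic_baseline(counts, period):
    """Time-varying 'normal' rate: the mean count in each slot of the period."""
    sums, nums = [0.0] * period, [0] * period
    for t, c in enumerate(counts):
        sums[t % period] += c
        nums[t % period] += 1
    # Floor the rate so log(lam) stays defined for empty or all-zero slots.
    return [max(s / n, 1e-6) if n else 1e-6 for s, n in zip(sums, nums)]

def normalise(pair):
    total = pair[0] + pair[1]
    return (pair[0] / total, pair[1] / total)

def burst_posterior(counts, period, boost=3.0, p_enter=0.05, p_leave=0.3):
    """Posterior probability of the burst state at each step (forward-backward).

    Assumptions (illustrative only): state 0 emits Poisson(base rate),
    state 1 emits Poisson(boost * base rate); counts must be non-empty.
    """
    base = periodic_baseline(counts, period)
    trans = ((1.0 - p_enter, p_enter), (p_leave, 1.0 - p_leave))

    # Emission likelihoods, rescaled per step so the larger one equals 1.
    emit = []
    for t, c in enumerate(counts):
        lam = base[t % period]
        ll = (poisson_logpmf(c, lam), poisson_logpmf(c, lam * boost))
        m = max(ll)
        emit.append((math.exp(ll[0] - m), math.exp(ll[1] - m)))

    n = len(counts)
    # Forward pass; the chain is assumed to start from the normal state.
    alpha = [None] * n
    alpha[0] = normalise((trans[0][0] * emit[0][0], trans[0][1] * emit[0][1]))
    for t in range(1, n):
        a = alpha[t - 1]
        alpha[t] = normalise(tuple(
            emit[t][j] * (a[0] * trans[0][j] + a[1] * trans[1][j])
            for j in (0, 1)))

    # Backward pass, normalised at every step for numerical stability.
    beta = [None] * n
    beta[n - 1] = (0.5, 0.5)
    for t in range(n - 2, -1, -1):
        b, e = beta[t + 1], emit[t + 1]
        beta[t] = normalise(tuple(
            trans[i][0] * e[0] * b[0] + trans[i][1] * e[1] * b[1]
            for i in (0, 1)))

    return [a[1] * b[1] / (a[0] * b[0] + a[1] * b[1])
            for a, b in zip(alpha, beta)]
```

Calling `burst_posterior(counts, period=24)` on hourly counts for one location returns one probability per hour. Repeating this per location and plotting the probabilities on a map is one way to obtain a heat map of unusual activity. Because the baseline here is computed from the same counts that contain the bursts, strong bursts inflate it slightly, which is one reason a jointly fitted model is preferable.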
Index Terms
- Exploratory novelty identification in human activity data streams