On-board Vehicle Data Stream Monitoring Using MineFleet and Fast Resource Constrained Monitoring of Correlation Matrices

New Generation Computing

Abstract

This paper considers the problem of monitoring vehicle data streams in a resource-constrained environment. It focuses on a monitoring task that requires frequent computation of correlation matrices on lightweight on-board computing devices. The problem is motivated in the context of the MineFleet Real-Time system, and the paper offers a randomized algorithm for fast monitoring of correlation (FMC), inner product, and Euclidean distance matrices, among others. Unlike existing approaches that compute every entry of these matrices from a data set, the proposed technique uses a divide-and-conquer approach. The paper presents a probabilistic test for quickly detecting whether a subset of coefficients contains a significant one, that is, one whose magnitude exceeds a user-specified threshold. This test is used to quickly identify the portions of the space that contain significant coefficients. The proposed algorithm is particularly suitable for monitoring correlation and related matrices computed from continuous data streams.
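The abstract does not spell out the probabilistic test, so the following Python sketch is only one plausible reading of the divide-and-conquer idea, not the authors' FMC algorithm. It assumes that the columns are normalized so inner products equal Pearson correlations, and that a block of coefficients is probed with random ±1 combinations of its columns: the expected squared inner product of two such combinations equals the sum of squared correlations in the block, which must exceed the squared threshold if any single entry is significant. The names `normalize`, `block_energy`, `find_significant`, and the parameter `trials` are hypothetical. To avoid the trivial unit diagonal, the sketch looks at cross-correlations between two groups of streams.

```python
import numpy as np


def normalize(A):
    """Center each column and scale it to unit length, so that
    column inner products equal Pearson correlations."""
    A = A - A.mean(axis=0)
    return A / np.linalg.norm(A, axis=0)


def block_energy(X, Y, rows, cols, trials, rng):
    """Estimate the sum of squared correlations in a block.

    Assumption of this sketch: with independent random signs s and t,
    z = (X[:, rows] s) . (Y[:, cols] t) satisfies
    E[z^2] = sum of c_ij^2 over the block.
    """
    s = rng.choice([-1.0, 1.0], size=(len(rows), trials))
    t = rng.choice([-1.0, 1.0], size=(len(cols), trials))
    z = np.sum((X[:, rows] @ s) * (Y[:, cols] @ t), axis=0)
    return float(np.mean(z ** 2))


def find_significant(X, Y, threshold, trials=64, seed=0):
    """Return (i, j, c_ij) for cross-correlations with |c_ij| > threshold.

    Blocks whose estimated energy falls below threshold**2 are skipped;
    this pruning is probabilistic and can miss entries. Single-entry
    blocks are always checked exactly.
    """
    X, Y = normalize(X), normalize(Y)
    rng = np.random.default_rng(seed)
    found = []
    stack = [(np.arange(X.shape[1]), np.arange(Y.shape[1]))]
    while stack:
        rows, cols = stack.pop()
        if len(rows) == 1 and len(cols) == 1:
            c = float(X[:, rows[0]] @ Y[:, cols[0]])
            if abs(c) > threshold:
                found.append((int(rows[0]), int(cols[0]), c))
            continue
        if block_energy(X, Y, rows, cols, trials, rng) < threshold ** 2:
            continue  # probably no significant coefficient in this block
        # Divide: halve each side that still has more than one column.
        row_parts = np.array_split(rows, 2) if len(rows) > 1 else [rows]
        col_parts = np.array_split(cols, 2) if len(cols) > 1 else [cols]
        for r in row_parts:
            for c in col_parts:
                stack.append((r, c))
    return sorted(found)
```

Each block test costs time proportional to the number of columns on the two sides of the block rather than to their product, so any saving over computing every entry appears only when blocks are much larger than `trials` and most of them are pruned early.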

Author information

Correspondence to Hillol Kargupta.

About this article

Cite this article

Kargupta, H., Puttagunta, V., Klein, M. et al. On-board Vehicle Data Stream Monitoring Using MineFleet and Fast Resource Constrained Monitoring of Correlation Matrices. New Gener. Comput. 25, 5–32 (2006). https://doi.org/10.1007/s00354-006-0002-4

  • DOI: https://doi.org/10.1007/s00354-006-0002-4
