Abstract
Given a large set of measurement data, we seek a simple function that captures the essence of the data. We suggest representing the data by an abstract function, in particular by a polynomial, and we interpolate the data points to define a polynomial that represents the data succinctly. The interpolation is challenging because, in practice, the data can be noisy and even Byzantine, where a Byzantine data point is an adversarial value that is not constrained to be close to the correct measurement. We present two solutions. The first extends the Welch-Berlekamp technique (Error correction for algebraic block codes, 1986) to eliminate outliers in multidimensional data, and copes with discrete noise and Byzantine data. The second is based on the method of Arora and Khot (J Comput Syst Sci 67(2):325–340, 2003), which handles noisy data, and which we generalize to multidimensional noisy and Byzantine data.
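As a rough illustration of the first idea, the sketch below implements the classical one-dimensional, noise-free Welch-Berlekamp decoder over the rationals. It is not the multidimensional extension described in the abstract. The function names (`solve_linear`, `poly_divmod`, `berlekamp_welch`) and the parameters `degree` and `max_errors` are hypothetical and chosen only for this example. The sketch assumes at least `degree + 2 * max_errors + 1` points with distinct x-values, of which at most `max_errors` are Byzantine.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Return one exact solution of A x = b (free variables are set to 0)."""
    rows, cols = len(A), len(A[0])
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        pr = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pr is None:
            continue  # no pivot in this column: it is a free variable
        M[r], M[pr] = M[pr], M[r]
        pivot = M[r][c]
        M[r] = [v / pivot for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [vi - f * vr for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    if any(M[i][cols] != 0 for i in range(r, rows)):
        raise ValueError("inconsistent system: too many corrupted points?")
    x = [Fraction(0)] * cols
    for i, c in enumerate(pivots):
        x[c] = M[i][cols]
    return x


def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists, lowest degree first."""
    num = num[:]
    quot = [Fraction(0)] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        quot[k] = num[k + len(den) - 1] / den[-1]
        for j, dj in enumerate(den):
            num[k + j] -= quot[k] * dj
    return quot, num  # num now holds the remainder


def berlekamp_welch(points, degree, max_errors):
    """Recover the coefficients of p (lowest degree first) from points (x, y),
    where y == p(x) except for at most max_errors corrupted points."""
    if len(points) < degree + 2 * max_errors + 1:
        raise ValueError("need at least degree + 2 * max_errors + 1 points")
    A, b = [], []
    for x, y in points:
        x, y = Fraction(x), Fraction(y)
        # Unknowns: Q (degree <= degree + max_errors) and the non-leading
        # coefficients of the monic error locator E (degree max_errors),
        # constrained by Q(x) = y * E(x) at every point.
        row = [x ** j for j in range(degree + max_errors + 1)]
        row += [-y * x ** k for k in range(max_errors)]
        A.append(row)
        b.append(y * x ** max_errors)
    sol = solve_linear(A, b)
    Q = sol[: degree + max_errors + 1]
    E = sol[degree + max_errors + 1:] + [Fraction(1)]
    quot, rem = poly_divmod(Q, E)
    if any(rem):
        raise ValueError("Q is not divisible by E: too many corrupted points?")
    return quot
```

For instance, sampling p(x) = 1 + 2x + 3x^2 at x = 0, ..., 6, overwriting two of the seven values with arbitrary numbers, and calling `berlekamp_welch(points, 2, 2)` recovers the coefficients 1, 2, 3. Exact rational arithmetic keeps the example simple. Real-valued noisy data would need a different treatment, which is what the second solution in the abstract addresses.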
References
Ar, S., Lipton, R.J., Rubinfeld, R., Sudan, M.: Reconstructing algebraic functions from mixed data. In FOCS. IEEE Computer Society, pp. 503–512 (1992)
Arora, S., Khot, S.: Fitting algebraic curves to noisy data. J. Comput. Syst. Sci. 67(2), 325–340 (2003)
Bertino, E., Bernstein, P., Agrawal, D., Davidson, S., Dayal, U., Franklin, M., Gehrke, J., Haas, L., Halevy, A., Han, J., et al.: Challenges and opportunities with big data. (2011)
Daltrophe, H., Dolev, S., Lotker, Z.: Big data interpolation: an efficient sampling alternative for sensor data aggregation. In ALGOSENSORS 2012, pp. 66–77 (2013)
Davis, P.J.: Interpolation and approximation. Dover Publications, New York (1975)
Ditzian, Z.: Multivariate Bernstein and Markov inequalities. J. Approx. Theory 70(3), 273–283 (1992)
Fasolo, E., Rossi, M., Widmer, J., Zorzi, M.: In-network aggregation techniques for wireless sensor networks: a survey. IEEE Wirel. Commun. 14(2), 70–87 (2007)
Jesus, P., Baquero, C., Almeida, P.S.: A survey of distributed data aggregation algorithms. arXiv preprint arXiv:1110.0725 (2011)
Kahn, J.M., Katz, R.H., Pister, K.S.J.: Next century challenges: mobile networking for “Smart Dust”. In Proceedings of the 5th annual ACM/IEEE international conference on Mobile computing and networking. ACM, pp. 271–278 (1999)
Madden, S.: From databases to big data. IEEE Internet Comput. 16, 3 (2012)
Nürnberger, G.: Approximation by spline functions, vol. 1. Springer, Berlin (1989)
Pinkus, A.: Weierstrass and approximation theory. J. Approx. Theory 107(1), 1–66 (2000). doi:10.1006/jath.2000.3508
Rajagopalan, R., Varshney, P.K.: Data aggregation techniques in sensor networks: a survey (2006)
Rivlin, T.J.: An introduction to the approximation of functions. Dover Publications, New York (2003)
Saniee, R.: A simple expression for multivariate Lagrange interpolation. (2008)
Sudan, M.: Decoding of Reed Solomon codes beyond the error-correction bound. J. Complex. 13(1), 180–193 (1997)
Ullman, J.D., Aho, A.V., Hopcroft, J.E.: The design and analysis of computer algorithms. Addison-Wesley, Reading (1974)
Welch, L.R., Berlekamp, E.R.: Error correction for algebraic block codes. US Patent 4,633,470, 30 Dec 1986
Acknowledgements
The research was partially supported by the Rita Altura Trust Chair in Computer Sciences; grant of the Ministry of Science, Technology and Space, Israel, and the National Science Council (NSC) of Taiwan; the Ministry of Foreign Affairs, Italy; the Ministry of Science, Technology and Space, Infrastructure Research in the Field of Advanced Computing and Cyber Security; and the Israel National Cyber Bureau.
Cite this article
Daltrophe, H., Dolev, S. & Lotker, Z. Big data interpolation using functional representation. Acta Informatica 55, 213–225 (2018). https://doi.org/10.1007/s00236-016-0288-8
DOI: https://doi.org/10.1007/s00236-016-0288-8