
Highly efficient nonlinear regression for big data with lexicographical splitting

  • Original Paper
  • Signal, Image and Video Processing

Abstract

This paper considers the problem of online piecewise linear regression for big data applications. We introduce an algorithm that sequentially achieves the performance of the best piecewise linear (affine) model with the optimal partition of the regressor space, in an individual-sequence manner. To this end, our algorithm constructs a class of \(2^D\) sequential piecewise linear models over a set of partitions of the regressor space and efficiently combines them in a mixture-of-experts setting. We show that the algorithm is highly efficient, with a computational complexity of only \(O(mD^2)\), where m is the dimension of the regressor vectors. This efficiency is achieved by compactly representing all \(2^D\) models with a “lexicographical splitting graph.” We analyze the performance of our algorithm without any statistical assumptions, i.e., our results are guaranteed to hold. Furthermore, we demonstrate the effectiveness of our algorithm on well-known data sets from the machine learning literature, at a fraction of the computational complexity of the state of the art.
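To make the mixture-of-experts idea in the abstract concrete, the following sketch combines \(2^D\) online piecewise-affine predictors on a scalar regressor with exponential weighting. It is a minimal illustration under stated assumptions: the names (D, splits, eta, lr), the LMS per-region updates, and the brute-force enumeration of all \(2^D\) experts are illustrative choices, and the sketch does not reproduce the paper's lexicographical splitting graph or its \(O(mD^2)\) complexity.

```python
import numpy as np

# Minimal sketch (illustrative assumptions, not the paper's algorithm):
# enumerate all 2^D piecewise-affine experts on a scalar regressor and
# combine them with exponential weighting of their squared-error losses.

rng = np.random.default_rng(0)

D = 3                                          # number of candidate split points
splits = np.linspace(-1.0, 1.0, D + 2)[1:-1]   # D interior thresholds in (-1, 1)

# Each expert activates a subset of the thresholds (one of 2^D choices) and
# keeps an affine model (slope, intercept) per region, trained online by LMS.
experts = []
for mask in range(2 ** D):
    active = splits[[bool((mask >> i) & 1) for i in range(D)]]
    edges = np.concatenate(([-np.inf], np.sort(active), [np.inf]))
    experts.append({"edges": edges,
                    "coef": np.zeros((len(edges) - 1, 2))})

weights = np.full(2 ** D, 1.0 / 2 ** D)        # mixture weights over the experts
eta, lr = 0.5, 0.1                             # mixture rate and LMS step size


def expert_predict(exp, x):
    """Affine prediction of one expert in the region containing x."""
    r = int(np.searchsorted(exp["edges"], x)) - 1
    a, b = exp["coef"][r]
    return a * x + b, r


# Online regression of a noisy nonlinear target
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0)
    y = np.sin(np.pi * x) + 0.1 * rng.standard_normal()

    preds = np.empty(2 ** D)
    regions = np.empty(2 ** D, dtype=int)
    for k, exp in enumerate(experts):
        preds[k], regions[k] = expert_predict(exp, x)

    y_hat = weights @ preds                    # combined (mixture) prediction

    # Exponentially weighted update of the mixture weights
    weights = weights * np.exp(-eta * (preds - y) ** 2)
    weights /= weights.sum()

    # LMS update of the active region of every expert
    for k, exp in enumerate(experts):
        exp["coef"][regions[k]] += lr * (y - preds[k]) * np.array([x, 1.0])

print("weight of the finest-partition expert:", weights[-1])
```

The expert using all D thresholds fits the finest partition, so its weight typically grows as it tracks the nonlinear target. The point of the sketch is only the weighting scheme: explicit enumeration scales exponentially in D, which is exactly the cost the paper's graph representation is designed to avoid.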



Acknowledgments

This work is supported in part by the Turkish Academy of Sciences Outstanding Researcher Programme, TUBITAK Contract No. 113E517, and Turk Telekom Inc.

Author information


Corresponding author

Correspondence to Mohammadreza Mohaghegh Neyshabouri.


About this article


Cite this article

Mohaghegh Neyshabouri, M., Demir, O., Delibalta, I. et al. Highly efficient nonlinear regression for big data with lexicographical splitting. SIViP 11, 391–398 (2017). https://doi.org/10.1007/s11760-016-0972-8

