Testing Proximity to Subspaces: Approximate \(\ell _\infty \) Minimization in Constant Time

Published in: Algorithmica

Abstract

We consider the subspace proximity problem: given a vector \({\varvec{x}} \in {\mathbb {R}}^n\) and a basis matrix \(V \in {\mathbb {R}}^{n \times m}\), determine whether \({\varvec{x}}\) is close to the subspace spanned by the columns of V. Although the problem can be solved by linear programming, doing so is time-consuming, especially when n is large. In this paper, we propose a quick tester that solves the problem correctly with high probability. Our tester runs in time independent of n and can be used as a sieve before computing the exact distance between \({\varvec{x}}\) and the subspace. The number of coordinates of \({\varvec{x}}\) queried by our tester is \(O(\frac{m}{\epsilon }\log \frac{m}{\epsilon })\), where \(\epsilon \) is an error parameter, and we show almost matching lower bounds. Through experiments on synthetic and real data sets, we demonstrate the scalability and applicability of our tester.
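The sampling idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the concrete query-count constant and the use of a generic LP solver for the small \(\ell _\infty \) subproblem are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import linprog

def linf_distance(x, V):
    """Exact l_inf distance from x to span(V), as a linear program:
    minimize t  subject to  -t <= (x - V w)_i <= t  for all i."""
    n, m = V.shape
    c = np.zeros(m + 1)
    c[-1] = 1.0                                  # objective: minimize t
    ones = np.ones((n, 1))
    A_ub = np.vstack([np.hstack([V, -ones]),     #  V w - t <=  x
                      np.hstack([-V, -ones])])   # -V w - t <= -x
    b_ub = np.concatenate([x, -x])
    bounds = [(None, None)] * m + [(0, None)]    # w free, t >= 0
    return linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun

def proximity_tester(x, V, eps, rng):
    """Sampling-based sieve: query O((m/eps) log(m/eps)) random
    coordinates of x and solve the small l_inf problem on the
    corresponding rows of V. The constant below is illustrative."""
    n, m = V.shape
    q = min(n, int(np.ceil((m / eps) * np.log(m / eps + 1))) + 1)
    S = rng.choice(n, size=q, replace=False)     # queried coordinates
    return linf_distance(x[S], V[S, :])
```

If the sampled residual exceeds a threshold, \({\varvec{x}}\) is declared far from the subspace; otherwise the exact distance can still be computed on all n coordinates. The tester's cost depends on the number of queries and on m, but not on n.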


Notes

  1. \(T_{\mathrm {LIN}}(m,m)\) is at most \(m^\omega \), where \(\omega <2.3728639\) [25].

  2. https://www.gnu.org/software/glpk/.

  3. One may notice that the runtime slightly increases as n grows when m is small (\(m=5,20\)). This is likely due to computational overhead, such as memory caching, which can dominate when the number of queries is small.

  4. A tester with the non-negative constraint is discussed in Sect. 4.3.

  5. One may think that, if the with-sunglasses images are available in the training phase (when obtaining V), our tester offers little advantage, because one could simply train a classifier to discriminate with- and without-sunglasses images. This is not the case: even then, our tester is meaningful in terms of time complexity, since the classifier requires O(n) time whereas our tester works in constant time.

  6. In actual use cases, we first fix \(\gamma \) or the number of queries depending on the available computational resources, and then determine \({\varDelta }\) to achieve the best classification performance. \({\varDelta }\) can be chosen by standard machine-learning methods, such as cross-validation.
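The threshold selection in note 6 could be sketched as follows. Here `choose_delta` is a hypothetical helper, not from the paper, and a simple hold-out grid search stands in for full cross-validation.

```python
import numpy as np

def choose_delta(scores, labels, grid):
    """Pick the threshold Delta maximizing accuracy on validation data.
    scores[i] is the tester's sampled l_inf residual for example i;
    labels[i] is 1 if example i is truly far from the subspace, else 0."""
    best_delta, best_acc = grid[0], -1.0
    for d in grid:
        # classify "far" whenever the residual exceeds the candidate Delta
        acc = np.mean((scores > d).astype(int) == labels)
        if acc > best_acc:
            best_delta, best_acc = d, acc
    return best_delta
```

Examples whose residual exceeds the chosen \({\varDelta }\) are then classified as far from the subspace.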

References

  1. Achlioptas, D.: Database-friendly random projections: Johnson–Lindenstrauss with binary coins. J. Comput. Syst. Sci. 66(4), 671–687 (2003)

  2. Alon, N., Dar, S., Parnas, M., Ron, D.: Testing of clustering. SIAM J. Discrete Math. 16(3), 393–417 (2003)

  3. Alon, N., Fischer, E., Newman, I., Shapira, A.: A combinatorial characterization of the testable graph properties: It’s all about regularity. SIAM J. Comput. 39(1), 143–167 (2009)

  4. Balcan, M.-F., Li, Y., Woodruff, D.P., Zhang, H.: Testing matrix rank, optimally. In: Proceedings of the 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 727–746 (2019)

  5. Barrodale, I., Phillips, C.: Algorithm 495: solution of an overdetermined system of linear equations in the Chebyshev norm [F4]. ACM Trans. Math. Softw. 1(3), 264–270 (1975)

  6. Bingham, E., Mannila, H.: Random projection in dimensionality reduction: applications to image and text data. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 245–250 (2001)

  7. Borgs, C., Chayes, J., Lovász, L., Sós, V.T., Szegedy, B., Vesztergombi, K.: Graph limits and parameter testing. In: Proceedings of the 38th Annual ACM Symposium on Theory of Computing (STOC), pp. 261–270 (2006)

  8. Carroll, J.D., Chang, J.-J.: Analysis of individual differences in multidimensional scaling via an \(N\)-way generalization of “Eckart–Young” decomposition. Psychometrika 35(3), 283–319 (1970)

  9. Chandrasekaran, K., Cheraghchi, M., Gandikota, V., Grigorescu, E.: Local testing for membership in lattices. In: Proceedings of the 36th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), pp. 46:1–46:14 (2016)

  10. Clarkson, K.L.: Las Vegas algorithms for linear and integer programming when the dimension is small. J. ACM 42(2), 488–499 (1995)

  11. Deerwester, S., Dumais, S., Furnas, G., Landauer, T., Harshman, R.: Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41(6), 391–407 (1990)

  12. Ding, H., Wang, J.: Recurrent neural networks for minimum infinity-norm kinematic control of redundant manipulators. IEEE Trans. Syst. Man Cybern. Part A 29(3), 269–276 (1999)

  13. Farebrother, R.W.: The historical development of the linear minimax absolute residual estimation procedure 1786–1960. Comput. Stat. Data Anal. 24(4), 455–466 (1997)

  14. Goldreich, O., Ron, D.: Property testing in bounded degree graphs. Algorithmica 32(2), 302–343 (2002)

  15. Guestrin, C., Koller, D., Parr, R.: Max-norm projections for factored MDPs. In: Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI), pp. 673–680 (2001)

  16. Har-Peled, S.: Geometric Approximation Algorithms. American Mathematical Society, Providence (2011)

  17. Harsanyi, J.C., Chang, C.-I.: Hyperspectral image classification and dimensionality reduction: an orthogonal subspace projection approach. IEEE Trans. Geosci. Remote Sens. 32, 779–785 (1994)

  18. Harshman, R.: Foundations of the PARAFAC procedure: models and conditions for an “explanatory” multi-modal factor analysis. UCLA Working Papers in Phonetics 16 (1970)

  19. Haussler, D., Welzl, E.: Epsilon-nets and simplex range queries. In: Proceedings of the 2nd Annual Symposium on Computational Geometry (SoCG), pp. 61–71 (1986)

  20. Hayashi, K., Yoshida, Y.: Minimizing quadratic functions in constant time. In: Proceedings of the 30th Annual Conference on Neural Information Processing Systems (NIPS), pp. 2217–2225 (2016)

  21. Hotelling, H.: Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24, 417–441 (1933)

  22. Hubert, M., Rousseeuw, P., Vanden Branden, K.: ROBPCA: a new approach to robust principal component analysis. Technometrics 47, 64–79 (2005)

  23. Kahl, F., Hartley, R.: Multiple-view geometry under the \(L_\infty \)-norm. IEEE Trans. Pattern Anal. Mach. Intell. 30(9), 1603–1617 (2008)

  24. Krauthgamer, R., Sasson, O.: Property testing of data dimensionality. In: Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 18–27 (2003)

  25. Le Gall, F.: Powers of tensors and fast matrix multiplication. In: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation (ISSAC), pp. 296–303 (2014)

  26. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998)

  27. Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401(6755), 788–791 (1999)

  28. Li, Y., Wang, Z., Woodruff, D.P.: Improved testing of low rank matrices. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 691–700 (2014)

  29. Martínez, A., Benavente, R.: The AR face database. Technical report 24, Computer Vision Center (1998)

  30. Ostrowski, A.: Sur l’Approximation du déterminant de Fredholm par les déterminants des systèmes d’équations linéaires. Arkiv för matematik, astronomi och fysik. Almqvist & Wiksells, Stockholm (1938)

  31. Pearson, K.: On lines and planes of closest fit to systems of points in space. Philos. Mag. 2, 559–572 (1901)

  32. Rice, J.R., White, J.S.: Norms for smoothing and estimation. SIAM Rev. 6(3), 243–256 (1964)

  33. Rubinfeld, R., Sudan, M.: Robust characterizations of polynomials with applications to program testing. SIAM J. Comput. 25(2), 252–271 (1996)

  34. Stiefel, E.: Note on Jordan elimination, linear programming and Tschebyscheff approximation. Numer. Math. 2, 1–17 (1960)

  35. Stromberg, A.J.: Computing the exact least median of squares estimate and stability diagnostics in multiple linear regression. SIAM J. Sci. Comput. 14(6), 1289–1299 (1993)

  36. Torgerson, W.S.: Multidimensional scaling: I. Theory and method. Psychometrika 17, 401–419 (1952)

  37. Woodroofe, M.: On the maximum deviation of the sample density. Ann. Math. Stat. 38(2), 475–481 (1967)

  38. Yoshida, Y.: A characterization of locally testable affine-invariant properties via decomposition theorems. In: Proceedings of the 46th Annual ACM Symposium on Theory of Computing (STOC), pp. 154–163 (2014)

  39. Yoshida, Y.: Gowers norm, function limits, and parameter estimation. In: Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1391–1406 (2016)

  40. Yu, B.: Density estimation in the \(l^\infty \) norm for dependent data with applications to the Gibbs sampler. Ann. Stat. 21(2), 711–735 (1993)

Acknowledgements

We thank the anonymous referees for their helpful comments and for providing an alternative algorithm for the subspace proximity problem, explained in Sect. 4.4. Y.Y. is supported by JSPS KAKENHI Grant Number JP17H04676.

Author information

Correspondence to Yuichi Yoshida.

Cite this article

Hayashi, K., Yoshida, Y. Testing Proximity to Subspaces: Approximate \(\ell _\infty \) Minimization in Constant Time. Algorithmica 82, 1277–1297 (2020). https://doi.org/10.1007/s00453-019-00642-0
