Exact Algorithms for Two Quadratic Euclidean Problems of Searching for the Largest Subset and Longest Subsequence

  • Conference paper
Learning and Intelligent Optimization (LION 12 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11353)

Abstract

We consider the following two strongly NP-hard problems. In the first problem, given a finite set of points in Euclidean space, we need to find a subset of the largest size such that the sum of squared distances between the elements of this subset and its unknown centroid (geometrical center) does not exceed a given percentage of the sum of squared distances between the elements of the input set and its centroid. In the second problem, the input is a sequence (not a set), and the indices of the elements of the chosen subsequence must satisfy some additional constraints, under the same restriction on the sum of squared distances as in the first problem. Both problems can be treated as data editing problems aimed at finding similar elements and removing extraneous (dissimilar) elements. We propose exact algorithms for the cases of both problems in which the input points have integer-valued coordinates. If the space dimension is bounded by some constant, our algorithms run in pseudopolynomial time. Some results of numerical experiments illustrating the performance of the algorithms are presented.
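To make the first problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the pseudopolynomial algorithm proposed in the paper; it enumerates subsets in exponential time and is meant only to illustrate the constraint on small inputs. The names `scatter`, `largest_subset`, and `alpha` are assumptions introduced here, with `alpha` standing for the given percentage expressed as a fraction. Points are assumed to have integer coordinates, so rational arithmetic keeps every comparison exact.

```python
from fractions import Fraction
from itertools import combinations


def scatter(points):
    """Sum of squared distances from the points to their centroid.

    Uses the identity sum ||y - c||^2 = sum ||y||^2 - ||sum y||^2 / n,
    which stays exact for integer coordinates.
    """
    n = len(points)
    dim = len(points[0])
    sum_sq = sum(x * x for p in points for x in p)
    coord_sums = [sum(p[k] for p in points) for k in range(dim)]
    return Fraction(sum_sq) - Fraction(sum(s * s for s in coord_sums), n)


def largest_subset(points, alpha):
    """Brute force (hypothetical helper): indices of a largest subset whose
    scatter does not exceed alpha times the scatter of the whole input."""
    bound = alpha * scatter(points)
    # Try sizes from largest to smallest; the first feasible subset is optimal.
    for size in range(len(points), 0, -1):
        for idx in combinations(range(len(points)), size):
            if scatter([points[i] for i in idx]) <= bound:
                return list(idx)
    return []


# Illustrative data: three nearby points and one distant point.
points = [(0, 0), (1, 0), (0, 1), (10, 10)]
print(largest_subset(points, Fraction(1, 10)))
```

On this illustrative input, the distant point is excluded and the three nearby points form the chosen subset. The second problem would add constraints on the chosen indices, which the abstract does not specify, so they are not modeled here.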

Acknowledgments

The study presented in Sects. 2, 4 was supported by the Russian Science Foundation, project 16-11-10041. The study presented in Sects. 3, 5 was supported by the Russian Foundation for Basic Research, projects 16-07-00168 and 18-31-00398, by the Russian Academy of Science (the Program of basic research), project 0314-2016-0015, and by the Russian Ministry of Science and Education under the 5-100 Excellence Programme.

Author information

Corresponding author

Correspondence to Vladimir Khandeev.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Kel’manov, A., Khamidullin, S., Khandeev, V., Pyatkin, A. (2019). Exact Algorithms for Two Quadratic Euclidean Problems of Searching for the Largest Subset and Longest Subsequence. In: Battiti, R., Brunato, M., Kotsireas, I., Pardalos, P. (eds) Learning and Intelligent Optimization. LION 12 2018. Lecture Notes in Computer Science, vol 11353. Springer, Cham. https://doi.org/10.1007/978-3-030-05348-2_28

  • DOI: https://doi.org/10.1007/978-3-030-05348-2_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05347-5

  • Online ISBN: 978-3-030-05348-2

  • eBook Packages: Computer Science, Computer Science (R0)
