Outlier Respecting Points Approximation

Algorithmica

Abstract

In this paper, we consider a generalized formulation of the problem of computing a functional curve that approximates a point set in the plane in the presence of outliers. The goal is to find a solution that not only optimizes the original objective but also accommodates the impact of the outliers. Based on a new model for accommodating outliers, we present efficient geometric algorithms for several versions of this problem (e.g., the approximating functions are step functions or piecewise linear functions, and the points are unweighted or weighted). All of our results are the first known for these problems. Our new model and techniques for handling outliers may be useful in other applications as well.
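As background for the step-function case mentioned above, the following is a minimal sketch of the classical problem without outliers: given an error tolerance, cover the points (sorted by x) with as few horizontal steps as possible so that every point lies within vertical distance `eps` of its step. This is a standard greedy scan, not the outlier model or the algorithms of the paper, which are not described on this page. The function name `fit_steps`, the parameter `eps`, and the assumption that all x-coordinates are distinct are illustrative choices.

```python
from typing import List, Tuple


def fit_steps(points: List[Tuple[float, float]], eps: float) -> List[Tuple[float, float, float]]:
    """Greedily fit a step function with the fewest steps.

    Assumes distinct x-coordinates (an assumption of this sketch).
    Returns one (x_left, x_right, height) triple per step; every point
    covered by a step is within vertical distance eps of its height.
    """
    # Each working entry is (x_left, x_right, y_min, y_max) for one step.
    steps: List[Tuple[float, float, float, float]] = []
    for x, y in sorted(points):
        if steps:
            x_left, _, lo, hi = steps[-1]
            new_lo, new_hi = min(lo, y), max(hi, y)
            # A single height can serve all points iff their y-range is at most 2*eps.
            if new_hi - new_lo <= 2 * eps:
                steps[-1] = (x_left, x, new_lo, new_hi)
                continue
        # The current step cannot absorb this point, so start a new step here.
        steps.append((x, x, y, y))
    # Place each step at the midpoint of its y-range.
    return [(xl, xr, (lo + hi) / 2) for xl, xr, lo, hi in steps]


if __name__ == "__main__":
    sample = [(0, 0), (1, 1), (2, 5), (3, 6), (4, 0)]
    for step in fit_steps(sample, 0.5):
        print(step)
```

Extending each step as far as possible is optimal for this outlier-free version, because shortening a step never lets a later step cover more points. A single extreme point can force an extra step in this formulation, which is the kind of sensitivity that motivates handling outliers explicitly.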

Author information

Corresponding author

Correspondence to Haitao Wang.

Additional information

Chen’s research was supported in part by NSF under Grants CCF-0916606 and CCF-1217906.

About this article

Cite this article

Chen, D.Z., Wang, H. Outlier Respecting Points Approximation. Algorithmica 69, 410–430 (2014). https://doi.org/10.1007/s00453-012-9738-z

  • DOI: https://doi.org/10.1007/s00453-012-9738-z
