Skip to main content

Some Experiments on Clustering a Set of Strings

  • Conference paper
  • First Online:
Graph Based Representations in Pattern Recognition (GbRPR 2003)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2726))

Abstract

We introduce in this paper the concept of set deviation as a tool to characterize the deviation of a set of strings around its set median. The set deviation is defined as the set median of the positive edit sequences between any string and the set median. We show how the set deviation can be efficiently used in well known statistical estimation and particularly with the minimum volume ellipsoid estimator. This concept is illustrated on several examples and particularly in clustering a set of shapes coded as strings using the Freeman code.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. C. Arcelli, L.P. Cordella and G. Sanniti di Baja, 4th IAPR Int. Workshop on Visual Form, Capri, Italy, may 2001, Springer, Lecture Notes in Computer Science, 2059.

    Google Scholar 

  2. J.E. Atkins, E.G. Boman and B. Hendrickson, A spectral algorithm for seriation and the consecutive ones problem, SIAM Journal on Computing, 28(1), 297–310, 1998.

    Article  MATH  MathSciNet  Google Scholar 

  3. H. Bunke, X. Jiang, K. Abegglen and A. Kandel, On the weighted mean of a pair of strings, Pattern Analysis and Applications, 2003 (to appear).

    Google Scholar 

  4. H. Bunke and S. Günter, Weighted mean of a pair of graphs, Computing, 67, 209–224, 2001.

    Article  MATH  MathSciNet  Google Scholar 

  5. H. Bunke, Recent Developments in Graph Matching, Proc. of ICPR’00, Barcelona, Spain, 2, 117–124, 2000.

    Google Scholar 

  6. H. Bunke, On a relation between graph edit distance and maximum common subgraph, Pattern Recognition Letters, 18, 689–694, 1997.

    Article  MathSciNet  Google Scholar 

  7. C. de la Higuera and F. Casacuberta, Topology of strings: Median string is NP-complete, Theoretical Computer Science, 230(1/2), 39–48, 2000.

    Article  MATH  MathSciNet  Google Scholar 

  8. X. Jiang, A. Münger and H. Bunke, On Median Graphs: Properties, Algorithms, and Applications, IEEE trans. on Pattern Analysis and Machine Intelligence, 23(10), 1144–1151, 2001.

    Article  Google Scholar 

  9. J.M. Jolion, On the deviation of a set of strings, to appear in Pattern Analysis and Application, 2003.

    Google Scholar 

  10. J.M. Jolion, Graph matching: what are we talking about?, J.M. Jolion, W. Kropatsch, and M. Vento eds, Proc. of Third IAPR-TC15 Workshop on Graphbased Representations in Pattern Recognition, may 23–25, 2001, Ischia, Cuen editor, Italy (isbn 88 7146 579-2), 170–175, 2001.

    Google Scholar 

  11. J.M. Jolion, P. Meer and S. Bataouche, Robust clustering with applications in computer vision, IEEE Trans. on Pattern Analysis and Machine Intelligence, 13(8), 791–802, 1991.

    Article  Google Scholar 

  12. A. Levenstein, Binary codes capable of correcting deletions, insertions and reversals, Sov. Phy. Dohl., 10, 707–710, 1966.

    MathSciNet  Google Scholar 

  13. T. Lewis, R. Owens and A. Baddeley, Averaging feature maps, Pattern Recognition, 32(9), 331–339, 1999.

    Article  Google Scholar 

  14. D. Lopresti and J. Zhou, Using Consensus Sequence Voting to Correct OCR Errors, Computer Vision and Image Understanding, 67(1), 39–47, 1997.

    Article  Google Scholar 

  15. A. Marzal and E. Vidal, Computation of Normalized Edit Distance and Applications, IEEE trans. on Pattern Analysis and Machine Intelligence, 15(9), 926–932, 1993.

    Article  Google Scholar 

  16. A. Massaro and M. Pelillo, A Linear Complementarity Approach to Graph Matching, Proc. of Third IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition, may 23–25, 2001, Ischia, Cuen editor, Italy (isbn 88 7146 579-2), 160–169, 2001.

    Google Scholar 

  17. M. Mattenet and J.M. Jolion, Analyse de Formes Déformées (in french), Mappemonde, 3, 5–9, 1992.

    Google Scholar 

  18. P.J. Rousseeuw and A.M. Leroy, Robust Regression & Outlier Detection, Wiley, 1987 (p. 45).

    Google Scholar 

  19. G. Sánchez, J. Lladós and K. Tombre, A mean string algorithm to compute the average among a set of 2D shapes, Pattern Recognition Letters, 23, 203–213, 2002.

    Article  MATH  Google Scholar 

  20. R.A. Wagner and M.J. Fisher, The string to string correction problem, Journal of the ACM, 21(1), 168–173, 1974.

    Article  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jolion, JM. (2003). Some Experiments on Clustering a Set of Strings. In: Hancock, E., Vento, M. (eds) Graph Based Representations in Pattern Recognition. GbRPR 2003. Lecture Notes in Computer Science, vol 2726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45028-9_19

Download citation

  • DOI: https://doi.org/10.1007/3-540-45028-9_19

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40452-1

  • Online ISBN: 978-3-540-45028-3

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics