Some Experiments on Clustering a Set of Strings

Jolion, Jean-Michel

doi:10.1007/3-540-45028-9_19

Jean-Michel Jolion⁶

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2726))

Included in the following conference series:

International Workshop on Graph-Based Representations in Pattern Recognition

562 Accesses
1 Citations

Abstract

We introduce in this paper the concept of set deviation as a tool to characterize the deviation of a set of strings around its set median. The set deviation is defined as the set median of the positive edit sequences between any string and the set median. We show how the set deviation can be efficiently used in well known statistical estimation and particularly with the minimum volume ellipsoid estimator. This concept is illustrated on several examples and particularly in clustering a set of shapes coded as strings using the Freeman code.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

C. Arcelli, L.P. Cordella and G. Sanniti di Baja, 4th IAPR Int. Workshop on Visual Form, Capri, Italy, may 2001, Springer, Lecture Notes in Computer Science, 2059.
Google Scholar
J.E. Atkins, E.G. Boman and B. Hendrickson, A spectral algorithm for seriation and the consecutive ones problem, SIAM Journal on Computing, 28(1), 297–310, 1998.
Article MATH MathSciNet Google Scholar
H. Bunke, X. Jiang, K. Abegglen and A. Kandel, On the weighted mean of a pair of strings, Pattern Analysis and Applications, 2003 (to appear).
Google Scholar
H. Bunke and S. Günter, Weighted mean of a pair of graphs, Computing, 67, 209–224, 2001.
Article MATH MathSciNet Google Scholar
H. Bunke, Recent Developments in Graph Matching, Proc. of ICPR’00, Barcelona, Spain, 2, 117–124, 2000.
Google Scholar
H. Bunke, On a relation between graph edit distance and maximum common subgraph, Pattern Recognition Letters, 18, 689–694, 1997.
Article MathSciNet Google Scholar
C. de la Higuera and F. Casacuberta, Topology of strings: Median string is NP-complete, Theoretical Computer Science, 230(1/2), 39–48, 2000.
Article MATH MathSciNet Google Scholar
X. Jiang, A. Münger and H. Bunke, On Median Graphs: Properties, Algorithms, and Applications, IEEE trans. on Pattern Analysis and Machine Intelligence, 23(10), 1144–1151, 2001.
Article Google Scholar
J.M. Jolion, On the deviation of a set of strings, to appear in Pattern Analysis and Application, 2003.
Google Scholar
J.M. Jolion, Graph matching: what are we talking about?, J.M. Jolion, W. Kropatsch, and M. Vento eds, Proc. of Third IAPR-TC15 Workshop on Graphbased Representations in Pattern Recognition, may 23–25, 2001, Ischia, Cuen editor, Italy (isbn 88 7146 579-2), 170–175, 2001.
Google Scholar
J.M. Jolion, P. Meer and S. Bataouche, Robust clustering with applications in computer vision, IEEE Trans. on Pattern Analysis and Machine Intelligence, 13(8), 791–802, 1991.
Article Google Scholar
A. Levenstein, Binary codes capable of correcting deletions, insertions and reversals, Sov. Phy. Dohl., 10, 707–710, 1966.
MathSciNet Google Scholar
T. Lewis, R. Owens and A. Baddeley, Averaging feature maps, Pattern Recognition, 32(9), 331–339, 1999.
Article Google Scholar
D. Lopresti and J. Zhou, Using Consensus Sequence Voting to Correct OCR Errors, Computer Vision and Image Understanding, 67(1), 39–47, 1997.
Article Google Scholar
A. Marzal and E. Vidal, Computation of Normalized Edit Distance and Applications, IEEE trans. on Pattern Analysis and Machine Intelligence, 15(9), 926–932, 1993.
Article Google Scholar
A. Massaro and M. Pelillo, A Linear Complementarity Approach to Graph Matching, Proc. of Third IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition, may 23–25, 2001, Ischia, Cuen editor, Italy (isbn 88 7146 579-2), 160–169, 2001.
Google Scholar
M. Mattenet and J.M. Jolion, Analyse de Formes Déformées (in french), Mappemonde, 3, 5–9, 1992.
Google Scholar
P.J. Rousseeuw and A.M. Leroy, Robust Regression & Outlier Detection, Wiley, 1987 (p. 45).
Google Scholar
G. Sánchez, J. Lladós and K. Tombre, A mean string algorithm to compute the average among a set of 2D shapes, Pattern Recognition Letters, 23, 203–213, 2002.
Article MATH Google Scholar
R.A. Wagner and M.J. Fisher, The string to string correction problem, Journal of the ACM, 21(1), 168–173, 1974.
Article MATH Google Scholar

Download references

Author information

Authors and Affiliations

Lyon Research Center for Images and Information Systems Bât. J. Verne, INSA Lyon, 69621, Villeurbanne Cedex, France
Jean-Michel Jolion

Authors

Jean-Michel Jolion
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Computer Science, University of York, York, YO1 5DD, UK
Edwin Hancock
D.I.I.I.E., Università degli Studi di Salerno, Via Ponte don Melillo, 1, 84084, Fisciano (SA), Italy
Mario Vento

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Jolion, JM. (2003). Some Experiments on Clustering a Set of Strings. In: Hancock, E., Vento, M. (eds) Graph Based Representations in Pattern Recognition. GbRPR 2003. Lecture Notes in Computer Science, vol 2726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45028-9_19

Download citation

DOI: https://doi.org/10.1007/3-540-45028-9_19
Published: 24 June 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40452-1
Online ISBN: 978-3-540-45028-3
eBook Packages: Springer Book Archive

Publish with us

Policies and ethics