An Efficient Algorithm for Rank Distance Consensus

Dinu, Liviu P.; Ionescu, Radu Tudor

doi:10.1007/978-3-319-03524-6_43

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8249)

Included in the following conference series:

Congress of the Italian Association for Artificial Intelligence

Abstract

In many research fields, a common task is to summarize the information shared by a collection of objects by finding a consensus among them. In many scenarios, the objects for which a consensus must be determined are rankings, and the process is called rank aggregation. Common applications include electoral processes, meta-search engines, document classification, and the selection of documents based on multiple criteria. This paper focuses on a particular application of such aggregation schemes: finding motifs, or common patterns, in a set of given DNA sequences. Two criteria commonly used to accept a string as a consensus are the median string, which minimizes the sum of distances to the input strings, and the closest string, which minimizes the maximum distance to them (the radius). These approaches have been studied intensively, but separately; only recently has the work of [1] tried to combine the two problems, solving the consensus string problem by minimizing both the distance sum and the radius.
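The two criteria named above can be made concrete with a small sketch. It is only an illustration: the Hamming distance is used as a stand-in metric (the paper itself works with rank distance), and the function names `hamming`, `distance_sum`, and `radius` are hypothetical, not taken from the paper.

```python
from typing import Callable, Sequence

# A distance is any function mapping two strings to a non-negative number.
Distance = Callable[[str, str], int]


def hamming(a: str, b: str) -> int:
    """Stand-in metric (an assumption): number of mismatching positions."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def distance_sum(candidate: str, strings: Sequence[str],
                 dist: Distance = hamming) -> int:
    """Median-string objective: total distance from candidate to all inputs."""
    return sum(dist(candidate, s) for s in strings)


def radius(candidate: str, strings: Sequence[str],
           dist: Distance = hamming) -> int:
    """Closest-string objective: largest distance from candidate to any input."""
    return max(dist(candidate, s) for s in strings)


seqs = ["ACGT", "ACGA", "TCGT"]
# "ACGT" differs from the three inputs in 0, 1 and 1 positions.
print(distance_sum("ACGT", seqs), radius("ACGT", seqs))  # prints "2 1"
```

A consensus in the sense of [1] would be a candidate that keeps both values small at once; how the two objectives are traded off is not specified in this abstract.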

The aim of this paper is to investigate the consensus string problem in the rank distance paradigm. Theoretical results show that it is not possible to identify a consensus string via rank distance for three or more strings. Therefore, an efficient genetic algorithm is proposed to find the optimal consensus string. As an application of the studied problem, this work also presents a clustering algorithm based on the consensus string, which builds a hierarchy of clusters based on distance connectivity. Experiments on DNA comparison demonstrate the efficiency of the proposed genetic algorithm for the consensus string, and phylogenetic experiments demonstrate the utility of the proposed clustering method. In conclusion, the consensus string is an interesting problem with many practical applications.
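The abstract does not define rank distance. The sketch below follows the definition as it is usually given in the cited rank distance literature (an assumption here, not something stated on this page): each character is indexed by its occurrence number, and the distance sums the absolute differences between the positions of matching indexed characters, with an unmatched character contributing its own position. The helper names `annotate` and `rank_distance` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple


def annotate(s: str) -> Dict[Tuple[str, int], int]:
    """Map each (character, occurrence number) pair to its 1-based position."""
    seen: Dict[str, int] = defaultdict(int)
    positions: Dict[Tuple[str, int], int] = {}
    for pos, ch in enumerate(s, start=1):
        seen[ch] += 1
        positions[(ch, seen[ch])] = pos
    return positions


def rank_distance(u: str, v: str) -> int:
    """Rank distance under the assumed definition described above."""
    pu, pv = annotate(u), annotate(v)
    total = 0
    for key in pu.keys() | pv.keys():
        # A missing indexed character is treated as having rank 0,
        # so it contributes its position in the other string.
        total += abs(pu.get(key, 0) - pv.get(key, 0))
    return total


# A and C swap places (1 + 1); G and T keep their positions.
print(rank_distance("ACGT", "CAGT"))  # prints 2
```

Because the function takes two plain strings and returns a number, it can be dropped into any distance-sum or radius objective in place of another string metric.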

References

[1] Amir, A., Landau, G.M., Na, J.C., Park, H., Park, K., Sim, J.S.: Consensus optimizing both distance sum and radius. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds.) SPIRE 2009. LNCS, vol. 5721, pp. 234–242. Springer, Heidelberg (2009)
[2] Chimani, M., Woste, M., Bocker, S.: A closer look at the closest string and closest substring problem. In: Proceedings of ALENEX, pp. 13–24 (2011)
[3] Diaconis, P., Graham, R.L.: Spearman footrule as a measure of disarray. Journal of Royal Statistical Society. Series B (Methodological) 39(2), 262–268 (1977)
[4] Dinu, L.P.: On the classification and aggregation of hierarchies with different constitutive elements. Fundamenta Informaticae 55(1), 39–50 (2003)
[5] Dinu, L.P., Ionescu, R.-T.: Clustering Based on Rank Distance with Applications on DNA. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds.) ICONIP 2012, Part V. LNCS, vol. 7667, pp. 722–729. Springer, Heidelberg (2012)
[6] Dinu, L.P., Ionescu, R.T.: Clustering Methods Based on Closest String via Rank Distance. In: Proceedings of SYNASC, pp. 207–214 (2012)
[7] Dinu, L.P., Ionescu, R.T.: An efficient rank based approach for closest string and closest substring. PLoS ONE 7(6), 37576 (2012)
[8] Dinu, L.P., Manea, F.: An efficient approach for the rank aggregation problem. Theoretical Computer Science 359(1-3), 455–461 (2006)
[9] Dinu, L.P., Popa, A.: On the closest string via rank distance. In: Kärkkäinen, J., Stoye, J. (eds.) CPM 2012. LNCS, vol. 7354, pp. 413–426. Springer, Heidelberg (2012)
[10] Dinu, L.P., Sgarro, A.: A Low-complexity Distance for DNA Strings. Fundamenta Informaticae 73(3), 361–372 (2006)
[11] Dinu, L.P., Sgarro, A.: Estimating Similarities in DNA Strings Using the Efficacious Rank Distance Approach, Systems and Computational Biology – Bioinformatics and Computational Modeling. InTech (2011)
[12] Frances, M., Litman, A.: On covering problems of codes. Theory of Computing Systems 30(2), 113–119 (1997)
[13] Koonin, E.V.: The emerging paradigm and open problems in comparative genomics. Bioinformatics 15, 265–266 (1999)
[14] Lee, T., Na, J.C., Park, H., Park, K., Sim, J.S.: Finding consensus and optimal alignment of circular strings. Theoretical Computer Science 468, 92–101 (2013)
[15] Li, M., Chen, X., Li, X., Ma, B., Vitanyi, P.M.B.: The similarity metric. IEEE Transactions on Information Theory 50(12), 3250–3264 (2004)
[16] Liew, A.W., Yan, H., Yang, M.: Pattern recognition techniques for the emerging field of bioinformatics: A review. Pattern Recognition 38(11), 2055–2073 (2005)
[17] Nicolas, F., Rivals, E.: Complexities of the centre and median string problems. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 315–327. Springer, Heidelberg (2003)
[18] Nicolas, F., Rivals, E.: Hardness results for the center and median string problems under the weighted and unweighted edit distances. Journal of Discrete Algorithms 3, 390–415 (2005)
[19] Popov, Y.V.: Multiple genome rearrangement by swaps and by element duplications. Theoretical Computer Science 385(1-3), 115–126 (2007)
[20] Reyes, A., Gissi, C., Pesole, G., Catzeflis, F.M., Saccone, C.: Where Do Rodents Fit? Evidence from the Complete Mitochondrial Genome of Sciurus vulgaris. Molecular Biology Evolution 17(6), 979–983 (2000)
[21] States, D.J., Agarwal, P.: Compact encoding strategies for DNA sequence similarity search. In: Proceedings of the 4th International Conference on Intelligent Systems for Molecular Biology, pp. 211–217 (1996)

Author information

Authors and Affiliations

Faculty of Mathematics and Computer Science, University of Bucharest, 14 Academiei Street, Bucharest, Romania
Liviu P. Dinu & Radu Tudor Ionescu

Editor information

Editors and Affiliations

Dipartimento di Informatica, Università degli Studi di Torino, via Pessinetto 12, 10149, Torino, Italy
Matteo Baldoni, Cristina Baroglio, Guido Boella & Roberto Micalizio

Cite this paper

Dinu, L.P., Ionescu, R.T. (2013). An Efficient Algorithm for Rank Distance Consensus. In: Baldoni, M., Baroglio, C., Boella, G., Micalizio, R. (eds) AI*IA 2013: Advances in Artificial Intelligence. AI*IA 2013. Lecture Notes in Computer Science(), vol 8249. Springer, Cham. https://doi.org/10.1007/978-3-319-03524-6_43

DOI: https://doi.org/10.1007/978-3-319-03524-6_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03523-9
Online ISBN: 978-3-319-03524-6
eBook Packages: Computer Science, Computer Science (R0)
