An Efficient Algorithm for Rank Distance Consensus

  • Conference paper
AI*IA 2013: Advances in Artificial Intelligence (AI*IA 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8249)

Abstract

In many research fields, a common task is to summarize the information shared by a collection of objects by finding a consensus among them. In many scenarios, the objects for which a consensus is sought are rankings, and the process is called rank aggregation. Common applications include electoral processes, meta-search engines, document classification, and selecting documents based on multiple criteria. This paper focuses on one particular application of such aggregation schemes: finding motifs, or common patterns, in a set of given DNA sequences. Two criteria that a string may be required to satisfy to be accepted as a consensus are the median string and the closest string. These approaches have been studied intensively, but separately; only recently has the work of [1] tried to combine both problems, solving the consensus string problem by minimizing both the distance sum and the radius.
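The two objectives named above can be made concrete with a small sketch. It is illustrative only: it assumes Hamming distance on equal-length strings (the paper itself works with rank distance), and the helper names `distance_sum`, `radius`, and `brute_force_consensus` are hypothetical. The exhaustive search is practical only for very short strings.

```python
from itertools import product


def hamming(u, v):
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(u, v))


def distance_sum(candidate, strings, dist=hamming):
    """Median-string objective: total distance to all input strings."""
    return sum(dist(candidate, s) for s in strings)


def radius(candidate, strings, dist=hamming):
    """Closest-string objective: largest distance to any input string."""
    return max(dist(candidate, s) for s in strings)


def brute_force_consensus(strings, alphabet="ACGT", dist=hamming):
    """Try every string of the input length over the alphabet.

    Returns (median, closest): the first candidate, in lexicographic
    order of the alphabet, that minimizes each objective.
    """
    n = len(strings[0])
    candidates = ["".join(p) for p in product(alphabet, repeat=n)]
    median = min(candidates, key=lambda c: distance_sum(c, strings, dist))
    closest = min(candidates, key=lambda c: radius(c, strings, dist))
    return median, closest


if __name__ == "__main__":
    data = ["AAAA", "AAAA", "AAAA", "TTTT"]
    median, closest = brute_force_consensus(data)
    print(median, distance_sum(median, data), radius(median, data))
    print(closest, distance_sum(closest, data), radius(closest, data))
```

On this toy input the two criteria disagree: the median string follows the majority and leaves one input far away, while the closest string sits between the two groups at the cost of a larger distance sum. That tension is what motivates optimizing both quantities together.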

The aim of this paper is to investigate the consensus string problem in the rank distance paradigm. Theoretical results show that it is not possible to identify a consensus string via rank distance for three or more strings. Thus, an efficient genetic algorithm is proposed to find the optimal consensus string. To show an application of the studied problem, this work also presents a clustering algorithm based on the consensus string that builds a hierarchy of clusters based on distance connectivity. Experiments on DNA comparison show the efficiency of the proposed genetic algorithm for the consensus string problem, and phylogenetic experiments show the utility of the proposed clustering method. In conclusion, the consensus string is an interesting problem with many practical applications.
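The abstract does not define rank distance, so the following is a minimal sketch under an assumed definition that is commonly given for strings: each character is annotated with its occurrence number, positions are counted from the left starting at 1, annotated characters present in both strings contribute the absolute difference of their positions, and those present in only one string contribute their position. The function names are hypothetical and the paper's exact definition may differ.

```python
from collections import defaultdict


def indexed_positions(s):
    """Map each (character, occurrence number) pair to its 1-based position.

    For example, "aab" becomes {("a", 1): 1, ("a", 2): 2, ("b", 1): 3}.
    """
    seen = defaultdict(int)
    positions = {}
    for i, ch in enumerate(s, start=1):
        seen[ch] += 1
        positions[(ch, seen[ch])] = i
    return positions


def rank_distance(u, v):
    """Rank distance between two strings under the assumed definition."""
    pu, pv = indexed_positions(u), indexed_positions(v)
    total = 0
    for key in pu.keys() | pv.keys():
        if key in pu and key in pv:
            # Shared annotated character: penalize the position shift.
            total += abs(pu[key] - pv[key])
        else:
            # Present in only one string: penalize by its position there.
            total += pu.get(key, 0) + pv.get(key, 0)
    return total


if __name__ == "__main__":
    print(rank_distance("ab", "ba"))
    print(rank_distance("aab", "ab"))
```

Unlike Hamming distance, this measure accepts strings of different lengths, and it runs in time linear in the total length of the two strings.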

References

  1. Amir, A., Landau, G.M., Na, J.C., Park, H., Park, K., Sim, J.S.: Consensus optimizing both distance sum and radius. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds.) SPIRE 2009. LNCS, vol. 5721, pp. 234–242. Springer, Heidelberg (2009)

  2. Chimani, M., Woste, M., Bocker, S.: A closer look at the closest string and closest substring problem. In: Proceedings of ALENEX, pp. 13–24 (2011)

  3. Diaconis, P., Graham, R.L.: Spearman footrule as a measure of disarray. Journal of the Royal Statistical Society. Series B (Methodological) 39(2), 262–268 (1977)

  4. Dinu, L.P.: On the classification and aggregation of hierarchies with different constitutive elements. Fundamenta Informaticae 55(1), 39–50 (2003)

  5. Dinu, L.P., Ionescu, R.-T.: Clustering Based on Rank Distance with Applications on DNA. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds.) ICONIP 2012, Part V. LNCS, vol. 7667, pp. 722–729. Springer, Heidelberg (2012)

  6. Dinu, L.P., Ionescu, R.T.: Clustering Methods Based on Closest String via Rank Distance. In: Proceedings of SYNASC, pp. 207–214 (2012)

  7. Dinu, L.P., Ionescu, R.T.: An efficient rank based approach for closest string and closest substring. PLoS ONE 7(6), 37576 (2012)

  8. Dinu, L.P., Manea, F.: An efficient approach for the rank aggregation problem. Theoretical Computer Science 359(1-3), 455–461 (2006)

  9. Dinu, L.P., Popa, A.: On the closest string via rank distance. In: Kärkkäinen, J., Stoye, J. (eds.) CPM 2012. LNCS, vol. 7354, pp. 413–426. Springer, Heidelberg (2012)

  10. Dinu, L.P., Sgarro, A.: A Low-complexity Distance for DNA Strings. Fundamenta Informaticae 73(3), 361–372 (2006)

  11. Dinu, L.P., Sgarro, A.: Estimating Similarities in DNA Strings Using the Efficacious Rank Distance Approach, Systems and Computational Biology – Bioinformatics and Computational Modeling. InTech (2011)

  12. Frances, M., Litman, A.: On covering problems of codes. Theory of Computing Systems 30(2), 113–119 (1997)

  13. Koonin, E.V.: The emerging paradigm and open problems in comparative genomics. Bioinformatics 15, 265–266 (1999)

  14. Lee, T., Na, J.C., Park, H., Park, K., Sim, J.S.: Finding consensus and optimal alignment of circular strings. Theoretical Computer Science 468, 92–101 (2013)

  15. Li, M., Chen, X., Li, X., Ma, B., Vitanyi, P.M.B.: The similarity metric. IEEE Transactions on Information Theory 50(12), 3250–3264 (2004)

  16. Liew, A.W., Yan, H., Yang, M.: Pattern recognition techniques for the emerging field of bioinformatics: A review. Pattern recognition 38(11), 2055–2073 (2005)

  17. Nicolas, F., Rivals, E.: Complexities of the centre and median string problems. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 315–327. Springer, Heidelberg (2003)

  18. Nicolas, F., Rivals, E.: Hardness results for the center and median string problems under the weighted and unweighted edit distances. Journal of Discrete Algorithms 3, 390–415 (2005)

  19. Popov, Y.V.: Multiple genome rearrangement by swaps and by element duplications. Theoretical Computer Science 385(1-3), 115–126 (2007)

  20. Reyes, A., Gissi, C., Pesole, G., Catzeflis, F.M., Saccone, C.: Where Do Rodents Fit? Evidence from the Complete Mitochondrial Genome of Sciurus vulgaris. Molecular Biology and Evolution 17(6), 979–983 (2000)

  21. States, D.J., Agarwal, P.: Compact encoding strategies for DNA sequence similarity search. In: Proceedings of the 4th International Conference on Intelligent Systems for Molecular Biology, pp. 211–217 (1996)

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Dinu, L.P., Ionescu, R.T. (2013). An Efficient Algorithm for Rank Distance Consensus. In: Baldoni, M., Baroglio, C., Boella, G., Micalizio, R. (eds) AI*IA 2013: Advances in Artificial Intelligence. AI*IA 2013. Lecture Notes in Computer Science, vol 8249. Springer, Cham. https://doi.org/10.1007/978-3-319-03524-6_43

  • DOI: https://doi.org/10.1007/978-3-319-03524-6_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03523-9

  • Online ISBN: 978-3-319-03524-6

  • eBook Packages: Computer Science (R0)
