Abstract
This paper provides an overview of the use of Prolog and its derivatives to sustain research and development in the fields of bioinformatics and computational biology. A number of applications in this domain have been enabled by the declarative nature of Prolog and the combinatorial nature of the underlying problems. The paper summarizes some relevant applications and outlines potential directions that the Prolog community can continue to pursue in this important domain. The presentation is organized in two parts: “small,” which explores studies of biological components and systems, and “large,” which discusses the use of Prolog to handle biomedical knowledge and data. The paper also presents a concrete encoding example and an effective Prolog implementation of a widely used approximate search technique, large neighborhood search.
Acknowledgements
We thank the anonymous reviewers, whose comments allowed us to greatly improve the focus and the presentation of the paper. This research is partially supported by INdAM-GNCS projects CUP E55F22000270001 and CUP E53C22001930001, by the Interdepartmental Project on AI (Strategic Plan UniUD-22-25), and by NSF grants 2151254 and 1914635.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Dal Palù, A., Dovier, A., Formisano, A., Pontelli, E. (2023). Prolog Meets Biology. In: Warren, D.S., Dahl, V., Eiter, T., Hermenegildo, M.V., Kowalski, R., Rossi, F. (eds) Prolog: The Next 50 Years. Lecture Notes in Computer Science, vol 13900. Springer, Cham. https://doi.org/10.1007/978-3-031-35254-6_26
DOI: https://doi.org/10.1007/978-3-031-35254-6_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35253-9
Online ISBN: 978-3-031-35254-6
eBook Packages: Computer Science, Computer Science (R0)