Abstract
In this paper we address the problem of grammatical inference in the programming language domain. The grammar of a programming language is an important asset because it is used in developing many software engineering tools. Sometimes the grammar of a language is not available and has to be inferred from source code, especially in the case of programming language dialects. We propose an approach for inferring the grammar of a programming language when an incomplete grammar and a set of correct programs are given as input. The approach infers a set of grammar rules whose addition makes the initial grammar complete. A grammar is complete if it parses all the input programs successfully. We also propose a rule evaluation order, i.e. an order in which the rules are evaluated for correctness. A set of rules is correct if its addition makes the grammar complete. Experiments show that the proposed rule evaluation order improves the process of grammar inference.
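The definitions of "complete" and "correct" above can be illustrated with a small sketch. It is not the paper's algorithm: the recognizer, the brute-force search, the toy grammar, and all names (parses, is_complete, with_rules, infer, base, candidates) are assumptions made for illustration. The candidate list is simply tried in the order given, which stands in for a rule evaluation order; the abstract does not specify the order the paper proposes.

```python
from itertools import combinations

def parses(grammar, start, tokens):
    """Return True if `start` derives `tokens`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Any symbol that is not a key of `grammar` is treated as a terminal.
    Assumes no empty right-hand sides.
    """
    n = len(tokens)
    table = set()  # facts (symbol, i, j): symbol derives tokens[i:j]

    def seq(rhs, i, j):
        # Can the symbol sequence `rhs` derive tokens[i:j]?
        if not rhs:
            return i == j
        head, rest = rhs[0], rhs[1:]
        if head not in grammar:  # terminal: must match one token
            return i < j and tokens[i] == head and seq(rest, i + 1, j)
        # Nonterminal: try every split, leaving one token per remaining symbol.
        return any((head, i, k) in table and seq(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    changed = True
    while changed:  # iterate to a fixed point so rule order does not matter
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i in range(n):
                    for j in range(i + 1, n + 1):
                        if (lhs, i, j) not in table and seq(rhs, i, j):
                            table.add((lhs, i, j))
                            changed = True
    return (start, 0, n) in table

def is_complete(grammar, start, programs):
    # A grammar is complete if it parses all the input programs.
    return all(parses(grammar, start, p) for p in programs)

def with_rules(grammar, rules):
    # Copy of `grammar` extended with (lhs, rhs) rules; the input is not mutated.
    extended = {lhs: list(alts) for lhs, alts in grammar.items()}
    for lhs, rhs in rules:
        extended.setdefault(lhs, []).append(rhs)
    return extended

def infer(grammar, start, programs, candidates, max_rules=2):
    # Try candidate rule sets, smallest first, in the given candidate order.
    # A rule set is correct if adding it makes the grammar complete.
    for size in range(1, max_rules + 1):
        for rules in combinations(candidates, size):
            if is_complete(with_rules(grammar, rules), start, programs):
                return list(rules)
    return None

# Hypothetical incomplete grammar: it knows assignments but not `print`.
base = {
    "stmt": [("id", "=", "expr", ";")],
    "expr": [("num",), ("id",)],
}
programs = [
    ["id", "=", "num", ";"],
    ["print", "id", ";"],  # dialect construct missing from `base`
]
candidates = [
    ("stmt", ("print", "expr")),
    ("stmt", ("print", "expr", ";")),
]
result = infer(base, "stmt", programs, candidates)
print(result)
```

In this toy setting the search enumerates subsets of a fixed candidate list, so its cost grows quickly with the number of candidates. That is why the order in which rules are evaluated can matter in practice.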
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dubey, A., Jalote, P., Aggarwal, S.K. (2006). Inferring Grammar Rules of Programming Language Dialects. In: Sakakibara, Y., Kobayashi, S., Sato, K., Nishino, T., Tomita, E. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2006. Lecture Notes in Computer Science, vol 4201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11872436_17
DOI: https://doi.org/10.1007/11872436_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45264-5
Online ISBN: 978-3-540-45265-2
eBook Packages: Computer Science, Computer Science (R0)