Published by De Gruyter, June 7, 2017

Distributed storage and recall of sentences

  • Marc Ebner

Abstract

The human brain learns language by processing written or spoken language. Recently, several deep neural networks have been used successfully for natural language generation. Although such networks can be trained, it remains unknown how these networks (or the brain) actually process language. A scalable method for distributed storage and recall of sentences within a neural network is presented. A corpus of 59 million words was used for training. A system using this method can efficiently identify sentences that can be considered reasonable replies to an input sentence. The system first selects a small number of seed words that occur with low frequency in the corpus. These seed words are then used to generate answer sentences. Candidate answers are scored using statistical data also obtained from the corpus. Sample answers generated by the system illustrate how the method works.
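The abstract describes three steps: build word statistics from a corpus, pick the lowest-frequency words of the input sentence as seeds, and score candidate answers against the corpus statistics. The following is a minimal sketch of that seed-selection and scoring idea, not the author's implementation; the function names, the whitespace tokenizer, and the inverse-frequency scoring rule are illustrative assumptions.

```python
from collections import Counter

def build_frequencies(corpus_tokens):
    """Count how often each word occurs in the corpus."""
    return Counter(corpus_tokens)

def select_seed_words(sentence, freq, n_seeds=2):
    """Pick the n_seeds lowest-frequency words of the input sentence as seeds."""
    words = sentence.lower().split()
    return sorted(words, key=lambda w: freq.get(w, 0))[:n_seeds]

def score_reply(reply, freq):
    """Score a candidate reply: rarer (more informative) words contribute more.

    Uses a simple inverse-frequency average as a stand-in for the
    statistical scoring described in the abstract.
    """
    words = reply.lower().split()
    if not words:
        return 0.0
    return sum(1.0 / (freq.get(w, 0) + 1) for w in words) / len(words)

# Toy corpus in place of the 59-million-word corpus used in the paper.
corpus = "the cat sat on the mat the dog chased the cat".split()
freq = build_frequencies(corpus)
seeds = select_seed_words("the dog sat", freq)  # low-frequency words win
```

In this toy example, "dog" and "sat" each occur once in the corpus while "the" occurs four times, so the rare words are chosen as seeds; a full system would then generate candidate answers around such seeds and rank them by their corpus statistics.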

  1. Author contributions: The author invented the method, implemented the software, and ran the experiments.

  2. Research funding: Ernst Moritz Arndt Universität Greifswald. Employment or leadership: Marc Ebner is Professor of Computer Science at the Ernst Moritz Arndt Universität Greifswald.

  3. Honorarium: None declared.

  4. Competing interests: The funding organization(s) played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.


Received: 2017-03-07
Accepted: 2017-05-12
Published Online: 2017-06-07
Published in Print: 2017-06-27

©2017 Walter de Gruyter GmbH, Berlin/Boston

DOI: 10.1515/bams-2017-0005