Abstract
The minimum entropy of English has been studied for a long time and with considerable progress, but only a few studies of other languages have been reported in the literature. In this paper, we present a new method for estimating the minimum entropy of characters in natural languages, based on two hypotheses of conservation of information quantity. We also verify the hypotheses empirically through experiments on two natural languages, Japanese and Chinese.
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Ren, F., Mitsuyoshi, S., Yen, K., Zong, C., Zhu, H. (2003). An Estimate Method of the Minimum Entropy of Natural Languages. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2003. Lecture Notes in Computer Science, vol 2588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36456-0_39
DOI: https://doi.org/10.1007/3-540-36456-0_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00532-2
Online ISBN: 978-3-540-36456-6
eBook Packages: Springer Book Archive