Study of sign segmentation in the text of Chinese sign language

  • Long Paper
  • Universal Access in the Information Society

Abstract

The natural language processing (NLP) of sign language aims to make human sign language “understandable” to computers. To achieve this goal, sign language text must first be segmented into sequences of signs that a computer can recognize; this segmentation is the basis for all further information processing of sign language. To address the problems of expressing Chinese sign language (CSL) in text, this paper analyzes the lexical features of CSL and discusses several sign segmentation algorithms for producing computer-readable files. Sign segmentation follows two main approaches: rule-based and statistics-based. Backward maximum matching (BMM) is an important rule-based method widely used in Chinese NLP. The recently proposed conditional random fields (CRFs) have also demonstrated excellent performance as a statistical method in international evaluations. In this study, BMM and CRFs are applied to the same dataset to explore the practical issues in the sign segmentation of CSL, and the results of the CRF method are presented and discussed. Because our corpus contains only hundreds of sentences, cross-validation with CRFs is also performed to guard against the unreliable estimates that such a small corpus may produce within limited processing time. Specifically, three-group twofold cross-validation is used to analyze the design of the annotation specification and the selection of a feature template. The results validate the effectiveness of the proposed segmentation strategy and confirm that CRFs outperform BMM. The proposed approach yields an F-score of 77.4% for sign segmentation on the CSL corpus. CRFs perform well in sign segmentation because they can capture arbitrary, overlapping features of the input within a Markov model. More satisfactory results, however, will depend on the further development of sign language corpora.
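To make the rule-based baseline and the evaluation metric concrete, here is a minimal Python sketch of backward maximum matching and a span-based F-score. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `max_len` parameter, the single-character fallback, and the toy lexicon are all hypothetical, and the example string is ordinary written Chinese rather than CSL text from the paper's corpus. The abstract does not say how the F-score was computed; exact-span matching is assumed here because it is the common convention in word segmentation.

```python
from typing import Iterable, List, Set, Tuple


def bmm_segment(text: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Backward maximum matching: scan right to left, taking the longest lexicon entry."""
    signs: List[str] = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # Assumption: a single character is always accepted as a fallback.
            if size == 1 or candidate in lexicon:
                signs.append(candidate)
                end -= size
                break
    signs.reverse()
    return signs


def spans(signs: Iterable[str]) -> Set[Tuple[int, int]]:
    """Convert a segmentation into a set of (start, end) character offsets."""
    result, start = set(), 0
    for sign in signs:
        result.add((start, start + len(sign)))
        start += len(sign)
    return result


def f_score(predicted: List[str], gold: List[str]) -> float:
    """Harmonic mean of precision and recall over exactly matching spans."""
    pred_spans, gold_spans = spans(predicted), spans(gold)
    correct = len(pred_spans & gold_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy lexicon: "research", "graduate student", "life", "origin".
toy_lexicon = {"研究", "研究生", "生命", "起源"}
gold = ["研究", "生命", "起源"]

predicted = bmm_segment("研究生命起源", toy_lexicon)
print(predicted, f_score(predicted, gold))
```

On this toy input the right-to-left scan matches "起源", then "生命", then "研究", so the greedy "研究生" reading that a left-to-right scan would pick never arises. A CRF segmenter would instead label each character with a position tag and learn from annotated data; that model is not sketched here.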

Notes

  1. May be downloaded from www.icl.pku.edu.cn.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (61433015; 91420202; 61602041), the National Social Science Major Fund of China (14ZDB154; 13&ZD187), the MOE (Ministry of Education in China) Project of Humanities and Social Sciences (14YJC740104), the Key Project of the National Language Committee (ZDI135-31), and the independent scientific research project of Tsinghua University (20161080056).

Author information

Corresponding author

Correspondence to Minghu Jiang.

About this article

Cite this article

Yao, D., Jiang, M., Huang, Y. et al. Study of sign segmentation in the text of Chinese sign language. Univ Access Inf Soc 16, 725–737 (2017). https://doi.org/10.1007/s10209-016-0506-8

  • DOI: https://doi.org/10.1007/s10209-016-0506-8
