Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations

  • Original Article
  • Artificial Life and Robotics

Abstract

Character-level convolutional neural networks (CLCNNs) are commonly used to classify textual data and serve as a versatile tool for that purpose. For natural language recognition, a sentence is decomposed into character units, each unit is converted into its corresponding character code (e.g., a Unicode value), and the codes are input into the CLCNN. Sentences can thus be treated like images. We have previously applied a CLCNN to verify whether a university’s diploma and/or curriculum policies are well written. In this study, we experimentally confirm the effectiveness of CLCNNs using tweet data. In particular, we focus on the effect of the number of units on performance using two types of data: a real, public tweet dataset on the reputation of a cell phone, and the NTCIR-13 MedWeb task, a well-known test collection for multi-label problems that consists of pseudo-tweet data. Experiments that varied the number of units in the fully connected layer produced results that agree with the theorem introduced in Amari’s book (Amari, Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses. SAIENSU-SHA Co., 2014). Furthermore, for the NTCIR-13 MedWeb task, we analyze two further kinds of experiments: the effect of kernel size and the effect of weight perturbation. The kernel-size results suggest the existence of an optimal kernel size for sentence comprehension. The results of perturbing the convolutional and pooling layers indicate a possible relationship between the number of degrees of freedom and the number of network parameters.
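
The character-to-code step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `encode_chars`, the fixed length `max_len=140`, and the padding value `0` are assumptions made here for illustration only.

```python
def encode_chars(sentence, max_len=140, pad=0):
    """Convert a sentence into a fixed-length list of Unicode code points.

    max_len and pad are illustrative assumptions, not values from the paper.
    """
    # Decompose the sentence into character units and keep at most max_len.
    codes = [ord(ch) for ch in sentence[:max_len]]
    # Right-pad with the pad value so every input has the same length.
    return codes + [pad] * (max_len - len(codes))


# Hypothetical usage: a short Japanese sentence padded to 8 positions.
example = encode_chars("風邪をひいた", max_len=8)
```

A fixed-length sequence of codes such as this is what allows a sentence to be fed to a convolutional network in the same way as a row of pixel values.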

Notes

  1. The three policies are: diploma policy, which concerns graduation certification; curriculum policy, which concerns course contents and their organization; and admission policy, which concerns enrollment acceptance.

  2. These figures are recreated based on Refs. [12, 13].

  3. This figure is recreated based on Ref. [12].

  4. Although it is possible to increase the percentage of exact matches by also performing dropout after batch normalization, it is not used in this paper to ascertain the effect of perturbations more accurately.

  5. This table is reprinted from Ref. [15].

References

  1. Amari S (2014) Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses. SAIENSU-SHA Co., Ltd (in Japanese)

  2. Belkin M, Hsu D, Ma S, Mandal S (2019) Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proc Natl Acad Sci 116(32):15849–15854

  3. Hastie T, Montanari A, Rosset S, Tibshirani RJ (2019) Surprises in high-dimensional least squares interpolation. arXiv:1903.08560

  4. Keskar NS, Nocedal J, Tang PTP, Mudigere D, Smelyanskiy M (2019) On large-batch training for deep learning: generalization gap and sharp minima. In: International Conference on Learning Representations

  5. Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp 1746–1751

  6. Miyazaki K, Ida M (2012) Proposal and evaluation of the active course classification support system with exploitation-oriented learning, the 9th European workshop on reinforcement learning (EWRL-9), Sept. 9, 2011. Athens Royal Olympic Hotel, Lecture Notes in Computer Science 7188:333–344

  7. Miyazaki K, Ida M, Yoshikane F, Nozawa T, Kita H (2005) Development of a course classification support system for the awarding of degrees using syllabus data. IPSJ J 46(3):782–791 (in Japanese)

  8. Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv:1301.3781

  9. Miyazaki K, Takahashi N, Mori R (2019). Research on Consistency between Diploma Policies and Nomenclature of Major Disciplines: Deep Learning Approach, Proc. of 2019 7th International Conference on Information and Education Technology (ICIET2019)

  10. Miyazaki K, Ida M (2020) Construction of Consistency Judgment System of Diploma Policy and Curriculum Policy using Character-level CNN. Electronics and Communications in Japan 102(12):30–39

  11. Miyazaki K (2020). Classification of Medical Data using Character-level CNN, The 3rd International Conference on Information Science and System, pp.43-47

  12. Miyazaki K, Ida M (2021). Evaluation of Character-level CNN using NTCIR-13 MedWeb Task, 2021 Annual Conference on Electronics, Information and Systems Institute of Electrical Engineers of Japan (IEEJ), 6 pages (in Japanese)

  13. Miyazaki K, Ida M (2021). Evaluation of Character-Level CNNs using the NTCIR-13 MedWeb Task, the 22nd International Symposium on Advanced Intelligent Systems (ISIS2021), 6 pages

  14. Miyazaki K, Yamaguchi S, Mori R, Yoshikawa Y, Saito T, Suzuki T (2022). Proposal and evaluation of a course classification support system emphasizing communication with the sub-committees within the Committee of Validation and Examination for Degrees, Preliminary Soft-Proceedings 4th EAI International Conference on Artificial Intelligence for Communications and Networks, pp.122-129

  15. Miyazaki K, Ida M (2023). Effectiveness of Character-level CNN and its Examination of Perturbation for Weights, 28th International Symposium on Artificial Life and Robotics (AROB 28th 2023), 5 pages

  16. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Neural Information Processing Systems (NIPS)

  17. Wakamiya S, Morita M, Kano Y, Ohkuma T, Aramaki E (2017). Overview of the NTCIR-13 MedWeb Task, In Proceedings of the 13th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-13), pp. 40-49

  18. http://research.nii.ac.jp/ntcir/permission/ntcir-13/perm-en-MedWeb.html [accessed: 2023-12-21]

  19. Yanaka H, Mineshima K (2022). Compositional Evaluation on Japanese Textual Entailment and Similarity (arXiv, data), Transactions of the Association for Computational Linguistics (TACL), Vol.10, pp.1266-1284

  20. Yang Y, Zhang Y, Tar C, Baldridge J (2019). PAWS-X: A cross-lingual adversarial dataset for paraphrase identification, In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp.3687-3692

  21. Zhang X, Zhao J, LeCun Y (2015). Character-level Convolutional Networks for Text Classification, arXiv:1509.01626

  22. https://www.db.info.gifu-u.ac.jp/sentiment_analysis/ [accessed: 2023-12-21]

  23. https://retty.me [accessed: 2023-12-21]

Author information

Corresponding author

Correspondence to Kazuteru Miyazaki.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was presented in part at the joint symposium of the 28th International Symposium on Artificial Life and Robotics, the 8th International Symposium on BioComplexity, and the 6th International Symposium on Swarm Behavior and Bio-Inspired Robotics (Beppu, Oita and Online, January 25-27, 2023).

About this article

Cite this article

Miyazaki, K., Ida, M. Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations. Artif Life Robotics 29, 266–273 (2024). https://doi.org/10.1007/s10015-024-00944-9

  • DOI: https://doi.org/10.1007/s10015-024-00944-9
