Syntax Enhanced Research Method of Stylistic Features

  • Conference paper
Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data (CCL 2018, NLP-NABD 2018)

Abstract

Research on stylistic features (SF) currently focuses on two aspects: lexical elements and syntactic structures. Lexical elements supply the content of a sentence, while syntactic structures form its framework. Combining the two and exploiting the strengths of both remains a challenging problem. In this paper, we propose a Principal Stylistic Features Analysis (PSFA) method that combines these two parts and then mines the relations between features. Statistical analysis of the results reveals many interesting linguistic phenomena. Using the PSFA method, we extract a set of representative features that cover different aspects of style. To verify the performance of the selected features, we conduct classification experiments. The results show that the features selected by the PSFA method yield significantly higher classification accuracy than other advanced methods.
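The abstract does not spell out the PSFA procedure, so the following is only an illustrative sketch of one generic way to "mine relations between features" and keep a few representative ones. It is not the authors' method. It assumes that relations are measured with the Pearson correlation coefficient and that a feature is dropped when it is strongly correlated with one already kept. The feature names, the sample values, the `threshold` parameter, and the function `select_representatives` are all hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no correlation signal
    return cov / sqrt(var_x * var_y)


def select_representatives(features, threshold=0.9):
    """Keep a feature only if its |r| with every already-kept feature
    stays below the threshold (a greedy, order-dependent assumption)."""
    representatives = []
    for name, values in features.items():
        if all(abs(pearson(values, features[rep])) < threshold
               for rep in representatives):
            representatives.append(name)
    return representatives


# Hypothetical per-document values: two lexical and two syntactic features.
features = {
    "noun_ratio":      [0.30, 0.25, 0.40, 0.35],
    "np_phrase_ratio": [0.31, 0.26, 0.41, 0.36],  # moves with noun_ratio
    "adverb_ratio":    [0.05, 0.12, 0.04, 0.10],
    "clause_depth":    [2.0, 3.5, 2.5, 1.0],
}
selected = select_representatives(features)
print(selected)
```

In this toy data the second feature tracks the first almost exactly, so it is treated as redundant and only three features are kept. The retained columns could then be passed to any classifier to compare accuracy against the full feature set.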

This work is supported by Beijing Social Science Fund (16YYB021) and Project of Humanities and Social Sciences of Ministry of Education in China (17YJAZH056).

Notes

  1. http://www.cs.brandeis.edu/~clp/ctb/posguide.3rd.ch.pdf.

  2. https://en.wikipedia.org/wiki/Student%27s_t-test.

  3. https://en.wikipedia.org/wiki/Correlation_coefficient.

  4. https://en.wikipedia.org/wiki/Hierarchical_clustering.

  5. https://plg.uwaterloo.ca/~gvcormac/treccorpus06/.

  6. https://github.com/HIT-SCIR/pyltp.

  7. https://nlp.stanford.edu/software/.

  8. http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html.

Corresponding author

Correspondence to Ying Liu.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, H., Liu, Y. (2018). Syntax Enhanced Research Method of Stylistic Features. In: Sun, M., Liu, T., Wang, X., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL 2018, NLP-NABD 2018. Lecture Notes in Computer Science, vol 11221. Springer, Cham. https://doi.org/10.1007/978-3-030-01716-3_2

  • DOI: https://doi.org/10.1007/978-3-030-01716-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01715-6

  • Online ISBN: 978-3-030-01716-3

  • eBook Packages: Computer Science, Computer Science (R0)
