Data Augmentation for Deep Learning of Judgment Documents

Yan, Ge; Li, Yu; Zhang, Shu; Chen, Zhenyu

doi:10.1007/978-3-030-36204-1_19

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11936)

Included in the following conference series:

International Conference on Intelligent Science and Big Data Engineering

Abstract

As machine learning models grow in parameter count, training a good model requires ever more data. The choice of method and the tuning of parameters can improve a model, but the quality and quantity of the data determine the upper limit of its performance. In realistic scenarios, however, obtaining large amounts of labeled data is challenging. A natural remedy is data augmentation: generating additional training examples by transforming the original data. We apply three data augmentation methods to original datasets of different sizes for the task of predicting crimes from case descriptions, and find that the effect of augmentation varies with both the model and the amount of original data.
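
The abstract does not name the three augmentation methods, so the following is only a generic sketch of what "augmentation by transforming the original data" can look like for labeled text. It assumes two common token-level perturbations (random deletion and random swap) that are not confirmed to be the authors' choices; the function names, parameters, and toy case descriptions are all hypothetical.

```python
import random


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def random_swap(tokens, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def augment(dataset, copies=2, p_delete=0.1, n_swaps=1, seed=0):
    """Return the original (tokens, label) pairs plus `copies` perturbed
    variants of each pair. Labels are carried over unchanged."""
    rng = random.Random(seed)
    augmented = list(dataset)
    for tokens, label in dataset:
        for _ in range(copies):
            variant = random_deletion(tokens, p_delete, rng)
            variant = random_swap(variant, n_swaps, rng)
            augmented.append((variant, label))
    return augmented


# Hypothetical toy data: tokenized case descriptions with a charge label.
toy = [
    (["defendant", "took", "the", "wallet", "from", "the", "victim"], "theft"),
    (["defendant", "struck", "the", "victim", "with", "a", "bottle"], "assault"),
]
bigger = augment(toy, copies=2)
print(len(toy), "->", len(bigger))
```

Varying `copies` and the size of `dataset` is one simple way to mimic the comparison the abstract describes, where the same augmentation is applied to original data of different scales.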

Acknowledgment

The work is supported in part by the National Key Research and Development Program of China (2016YFC0800805) and the National Natural Science Foundation of China (61472176, 61772014).

Author information

Authors and Affiliations

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Ge Yan, Yu Li, Shu Zhang & Zhenyu Chen
Software Testing Engineering Laboratory of Jiangsu Province, Nanjing, China
Ge Yan, Yu Li, Shu Zhang & Zhenyu Chen

Corresponding author

Correspondence to Zhenyu Chen.

Editor information

Editors and Affiliations

Nanjing University of Science and Technology, Nanjing, China
Zhen Cui
Nanjing University of Science and Technology, Nanjing, China
Jinshan Pan
Nanjing University of Science and Technology, Nanjing, China
Shanshan Zhang
Nanjing University of Science and Technology, Nanjing, China
Liang Xiao
Nanjing University of Science and Technology, Nanjing, China
Jian Yang

About this paper

Cite this paper

Yan, G., Li, Y., Zhang, S., Chen, Z. (2019). Data Augmentation for Deep Learning of Judgment Documents. In: Cui, Z., Pan, J., Zhang, S., Xiao, L., Yang, J. (eds) Intelligence Science and Big Data Engineering. Big Data and Machine Learning. IScIDE 2019. Lecture Notes in Computer Science, vol 11936. Springer, Cham. https://doi.org/10.1007/978-3-030-36204-1_19

DOI: https://doi.org/10.1007/978-3-030-36204-1_19
Published: 29 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36203-4
Online ISBN: 978-3-030-36204-1
eBook Packages: Computer Science, Computer Science (R0)
