Document Structure Analysis with Syntactic Model and Parsers: Application to Legal Judgments

  • Conference paper
New Frontiers in Artificial Intelligence (JSAI-isAI 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7258)

Abstract

Documents written in a common format, such as legal judgments, have a structure that can be expressed by syntax rules and extracted by applying them. In this paper, we propose a novel method for document structure analysis that combines two parts: a way to describe the syntactic structure of documents with an abstract document model, and a way to implement a document structure parser as a combination of syntactic parsers. A parser implemented with this method is general and extensible, so it works well for a variety of document types that share a common description format, especially legal documents such as judgments and legislation, while achieving high accuracy.
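To make the idea of building a document structure parser from a combination of smaller syntactic parsers concrete, here is a minimal sketch in Python. It is not the authors' implementation: the combinator names (`line`, `seq`, `alt`, `many`, `tag`), the section headings, and the sample text are all hypothetical and chosen only for illustration. Each parser takes a list of lines and a position and returns a value with the next position, or `None` on failure.

```python
import re
from typing import Any, Callable, List, Optional, Tuple

Lines = List[str]
Result = Optional[Tuple[Any, int]]
Parser = Callable[[Lines, int], Result]


def line(pattern: str) -> Parser:
    """Primitive parser: consume one line if it matches the regex."""
    rx = re.compile(pattern)

    def parse(lines: Lines, pos: int) -> Result:
        if pos < len(lines) and rx.match(lines[pos]):
            return lines[pos], pos + 1
        return None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(lines: Lines, pos: int) -> Result:
        values = []
        for p in parsers:
            r = p(lines, pos)
            if r is None:
                return None
            value, pos = r
            values.append(value)
        return values, pos
    return parse


def alt(*parsers: Parser) -> Parser:
    """Ordered choice: return the first alternative that succeeds."""
    def parse(lines: Lines, pos: int) -> Result:
        for p in parsers:
            r = p(lines, pos)
            if r is not None:
                return r
        return None
    return parse


def many(parser: Parser) -> Parser:
    """Apply a parser zero or more times and collect the values."""
    def parse(lines: Lines, pos: int) -> Result:
        values = []
        while True:
            r = parser(lines, pos)
            if r is None or r[1] == pos:  # stop on failure or no progress
                return values, pos
            value, pos = r
            values.append(value)
    return parse


def tag(name: str, parser: Parser) -> Parser:
    """Label a parser's result with a structural element name."""
    def parse(lines: Lines, pos: int) -> Result:
        r = parser(lines, pos)
        return None if r is None else ({name: r[0]}, r[1])
    return parse


# Hypothetical syntax rules for a judgment-like document.
heading = alt(line(r"^Main Text$"), line(r"^Facts and Reasons$"))
body = line(r"^(?!Main Text$|Facts and Reasons$).+")
section = tag("section", seq(heading, many(body)))
judgment = tag("judgment", seq(line(r"^Case No\."), many(section)))

# Made-up sample input, not taken from any real judgment.
sample = [
    "Case No. 123",
    "Main Text",
    "The appeal is dismissed.",
    "Facts and Reasons",
    "The appellant claims damages.",
    "The claim has no basis.",
]
tree, end = judgment(sample, 0)
```

Because every rule is an ordinary value, supporting another document type in this sketch would mean defining new rules from the same combinators rather than changing the parsing machinery, which is one plausible reading of the generality and extensibility the abstract describes.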

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Igari, H., Shimazu, A., Ochimizu, K. (2012). Document Structure Analysis with Syntactic Model and Parsers: Application to Legal Judgments. In: Okumura, M., Bekki, D., Satoh, K. (eds) New Frontiers in Artificial Intelligence. JSAI-isAI 2011. Lecture Notes in Computer Science, vol 7258. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32090-3_12

  • DOI: https://doi.org/10.1007/978-3-642-32090-3_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32089-7

  • Online ISBN: 978-3-642-32090-3

  • eBook Packages: Computer Science, Computer Science (R0)
