Research article. DOI: 10.1145/3594536.3595146

Beyond Readability with RateMyPDF: A Combined Rule-based and Machine Learning Approach to Improving Court Forms

Published: 07 September 2023

ABSTRACT

In this paper, we describe RateMyPDF, a web application that helps authors measure and improve the usability of court forms. It offers a score together with automated suggestions for improving the form, drawn from both traditional machine learning approaches and the general-purpose GPT-3 large language model. We worked with form authors and usability experts to determine the set of features we measure and validated them by gathering a dataset of approximately 24,000 PDF forms from 46 U.S. states and the District of Columbia. Our tool and automated measures allow a form author or court tasked with improving a large library of forms to work at scale.

This paper describes the features that we find improve form usability, the results from our analysis of the large form dataset, details of the tool, and the implications of our tool for access to justice for self-represented litigants. We found that the RateMyPDF score correlates significantly with the scores of expert reviewers.

While the current version of the tool allows automated analysis of Microsoft Word and PDF court forms, the findings of our research apply equally to the growing number of automated wizard-driven interactive legal applications that replace paper forms with interactive websites.


Published in

ICAIL '23: Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law
June 2023
499 pages
ISBN: 9798400701979
DOI: 10.1145/3594536

Copyright © 2023 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
