Automatic Risks Detection and Comparison Techniques for General Conditions of Technical Documents in Purchasing Order

Authors:
Chae-Yeon Kim, So-Won Choi, Jong-Gwan Jeong, Eul-Bum Lee

Graduate Institute of Ferrous and Energy Materials Technology, Pohang University of Science and Technology (POSTECH), South Korea

ICCTA '22: Proceedings of the 2022 8th International Conference on Computer Technology Applications, May 2022, Pages 236–241. https://doi.org/10.1145/3543712.3543721

Published: 20 September 2022

ABSTRACT

This research develops a technique that reads the technical documents of a purchasing order (PO) exchanged between the owner (buyer) and the supplier (seller) in capital investment projects, such as maintenance and replacement of equipment, automatically detects clauses that contain specific potential risks, and shows the comparison results. As a proof of concept (PoC), the research targets two clause types that the plant owner must check and compare with the utmost care when reviewing technical documents: (1) the performance guarantee clauses for the purchased equipment and (2) the delivery schedule requirement clauses. The PoC was implemented in the Python programming language with the spaCy library and was further developed into a cloud-platform-based application for end users. The technique preprocesses the technical documents of a PO in PDF format, converts them into plain text, and then detects and extracts the risk-related sentences using logic created by analyzing the patterns of the PoC clauses. The research also built a database of all units and formats that can appear in PoC clauses and developed knowledge-based rules to normalize PoC clauses that are expressed differently in the two (buyer's and seller's) documents. Finally, the comparison of the PoC clauses, unified to the same unit and format, is output to an Excel or CSV file. The techniques and comparison results were verified with a confusion matrix and an accuracy check. This study is expected to reduce the workload and improve the productivity of practitioners in engineering procurement processes for capital investment projects.
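The detect, normalize, and compare steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper uses spaCy, whereas this sketch substitutes a plain regular expression from the standard library so that it runs without extra dependencies. The pattern, the unit table (including the 30-day month), the sample clauses, and the names `DELIVERY`, `UNIT_TO_DAYS`, `extract_delivery_days`, `compare`, and `to_csv` are all assumptions made for illustration. It also assumes the PDF has already been converted to text.

```python
import csv
import io
import re

# Hypothetical unit table: factor that converts each time unit to days.
# Treating a month as 30 days is an assumption of this sketch.
UNIT_TO_DAYS = {"day": 1, "days": 1, "week": 7, "weeks": 7,
                "month": 30, "months": 30}

# Hypothetical pattern for a delivery schedule clause: a trigger word
# followed, later in the same sentence, by "<number> <time unit>".
DELIVERY = re.compile(
    r"\b(?:deliver\w*|shipment)\b.*?(\d+(?:\.\d+)?)\s*(days?|weeks?|months?)\b",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive splitter on '.' or ';' followed by whitespace (a stand-in
    for a real NLP sentence segmenter)."""
    return [s.strip() for s in re.split(r"(?<=[.;])\s+", text) if s.strip()]


def extract_delivery_days(text):
    """Return the first delivery period found, normalized to days, or None."""
    for sentence in split_sentences(text):
        match = DELIVERY.search(sentence)
        if match:
            value = float(match.group(1))
            unit = match.group(2).lower()
            return value * UNIT_TO_DAYS[unit]
    return None


def compare(buyer_text, seller_text):
    """Compare the normalized delivery periods of the two documents."""
    buyer_days = extract_delivery_days(buyer_text)
    seller_days = extract_delivery_days(seller_text)
    if buyer_days is None or seller_days is None:
        status = "missing clause"
    elif seller_days > buyer_days:
        status = "risk: seller exceeds buyer requirement"
    else:
        status = "ok"
    return {"clause": "delivery schedule", "buyer_days": buyer_days,
            "seller_days": seller_days, "status": status}


def to_csv(rows):
    """Serialize comparison rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["clause", "buyer_days", "seller_days", "status"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented sample clauses, not taken from the paper's data set.
    buyer = ("The equipment shall comply with the specification. "
             "Delivery shall be completed within 12 weeks after PO issue.")
    seller = ("We guarantee rated capacity. "
              "Delivery will be made within 3 months from receipt of order.")
    print(to_csv([compare(buyer, seller)]))
```

Converting both clauses to a single unit before comparing them is what allows "12 weeks" and "3 months" to be checked against each other. A production rule set would need a much larger unit and format database than this two-column table.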

Published in

ICCTA '22: Proceedings of the 2022 8th International Conference on Computer Technology Applications
May 2022
286 pages
ISBN:9781450396226
DOI:10.1145/3543712

Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Detection and Comparison Algorithm
Plant Projects
Purchasing Order (PO)
Scope of Work (SoW)
Technical Documents
Technical Specifications
spaCy
Qualifiers
- research-article
- Research
- Refereed limited