Full length article
Deep learning-based extraction of construction procedural constraints from construction regulations

https://doi.org/10.1016/j.aei.2019.101003

Abstract

Construction procedural constraints are critical for effective construction procedure checking in practice and for various inspection systems. At present, the manual extraction of construction procedural constraints is costly and time-consuming. The automatic extraction of construction procedural constraint knowledge (e.g., knowledge entities and the interlinks/relationships between them) from regulatory documents is a key challenge. Traditionally, natural language processing is implemented using either rule-based or machine learning approaches. The limited efforts on rule-based extraction from construction regulations often rely on pre-defined vocabularies and involve heavy feature engineering. Based on the characteristics of the knowledge expression of construction procedural constraints in Chinese regulations, this paper explores a hybrid deep neural network, combining bidirectional long short-term memory (Bi-LSTM) and a conditional random field (CRF), for the automatic extraction of qualitative construction procedural constraints. Based on the proposed deep neural network, the recognition and extraction of named entities and the relations between them are realized. Unlike existing information extraction research efforts using rule-based methods, the proposed hybrid deep learning approach can be applied without complex handcrafted feature engineering. In addition, the long-distance dependency relationships between different entities in regulations are considered. The model implementation results demonstrate the good performance of the end-to-end deep neural network in the extraction of construction procedural constraints. This study can be considered one of the early explorations of knowledge extraction from construction regulations.

Introduction

Monitoring the construction process according to the procedures specified in regulatory documents is important for assuring construction quality [1]. This process is called regulation-based construction quality compliance checking, and it is domain-knowledge intensive. Prior research has developed regulation-based construction quality management systems to facilitate the checking process and thereby help reduce quality inspection errors and regulatory violations. Such research includes: a proposal for a system for construction quality management (QUALICON), integrating different computer-aided project management functions [2]; a computerized system applying PDAs and wireless internet for quality inspection and defect management (QIDMS) [3]; and a construction quality control system based on 4D BIM and the process, organization, and product (POP) model [4]. For these systems, construction procedural constraint knowledge needs to be extracted from construction regulations and presented in a formal and machine-understandable format [5], [6]. Traditionally, these tasks are completed manually, relying on domain expertise, which is costly and time-consuming. Thus, how to extract construction procedural constraint knowledge automatically from construction regulations remains a challenge.

Limited research efforts have been directed to the information extraction (IE) of construction regulatory clauses, and these mainly focus on rule-based methods. Developing extraction rules is labor-intensive and time-consuming, and the rules are difficult to reuse because they are heavily domain-dependent and defined for specific regulatory documents. Moreover, these efforts involve heavy feature engineering, including exploring lexical, semantic, syntactic, general domain and specific domain ontology features [7], [8], [9], [10]. This heuristic and manual feature extraction process is conducted using human-specified shallow features and is limited by human domain knowledge. Recent developments in deep neural networks, specifically recurrent neural networks (RNNs), have presented new opportunities to model sequential time-series data with recurrent lateral connections. The motivation of this research is to design a variant of the RNN, with only word embedding features and no complex manual feature engineering, for the automatic extraction of construction procedural constraints.

A construction procedural constraint commonly includes construction procedures, construction objects, and the temporal relationships between activities. The knowledge expression of construction procedural constraints in Chinese regulations has some typical characteristics, such as the diverse syntactic constructions of construction regulations and the complicated and ambiguous domain entities involved in a sentence without pause symbols. Given the research objective and these characteristics, the semantic representations of the entities involved in a regulation clause should be considered in sufficient context. In particular, long-distance dependencies between different entities in a regulation clause are critical for effectively identifying the domain-specific entities related to construction procedural constraints and for classifying the relations among entities as stated in the construction regulations.
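To make the three components named above concrete, the following is a minimal sketch of how one extracted constraint might be stored. It is an illustration under stated assumptions, not the representation used in this paper: the class names, the field names, the `BEFORE`/`AFTER` relation labels, and the `normalize` helper are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TemporalRelation(Enum):
    # Hypothetical relation labels, chosen for illustration only.
    BEFORE = "before"
    AFTER = "after"


@dataclass(frozen=True)
class ProceduralConstraint:
    procedure: str                             # the constrained construction procedure
    relation: TemporalRelation                 # how it is ordered relative to `reference`
    reference: str                             # the procedure it is ordered against
    construction_object: Optional[str] = None  # object the procedures act on, if stated
    time_interval: Optional[str] = None        # required interval between them, if stated


def normalize(c: ProceduralConstraint) -> ProceduralConstraint:
    """Rewrite an AFTER constraint as the equivalent BEFORE constraint."""
    if c.relation is TemporalRelation.AFTER:
        return ProceduralConstraint(
            procedure=c.reference,
            relation=TemporalRelation.BEFORE,
            reference=c.procedure,
            construction_object=c.construction_object,
            time_interval=c.time_interval,
        )
    return c
```

Storing every constraint in a single direction, as `normalize` does, is one possible design choice that would let a downstream checker compare constraints without handling both orderings.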

In this research, a hybrid deep learning approach is proposed for extracting construction procedural constraint knowledge automatically from construction regulations. The approach is divided into two steps: named entity recognition and pattern recognition (i.e. the classification of relations between entities). In named entity recognition, the Bi-LSTM-CRF (bidirectional long short-term memory and conditional random field) model is applied to identify and label entities (i.e. construction procedures, construction objects, and time intervals) in clauses. In pattern recognition, the LSTM-MLP (long short-term memory and multilayer perceptron) model is used to identify patterns of construction procedural constraints. Further, the temporal relationships of the specific construction procedures linked to the constraints can be extracted and expressed in a structured form. To implement the proposed hybrid deep learning approach, clauses are prepared from 14 types of national standards defined in the Code for Acceptance of Construction Quality in China. Model testing results support the effectiveness of the proposed approach in the automatic extraction of construction procedural constraints. Additionally, the proposed hybrid deep neural network can be generalized to other regulation information extraction tasks in similar fields.
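The output of a sequence labeling model such as the named entity recognition step is one tag per token, which still has to be grouped into entity spans before relations can be classified. The sketch below shows that grouping step only, assuming the common BIO tagging scheme. The scheme, the tag names `PROC` and `OBJ`, the function name `decode_bio`, and the space-joined English tokens are assumptions made for illustration; the tag set and tokenization actually used for the Chinese clauses are not given in this section.

```python
from typing import List, Tuple

# (label, start index, end index exclusive, surface text)
Entity = Tuple[str, int, int, str]


def decode_bio(tokens: List[str], tags: List[str]) -> List[Entity]:
    """Group BIO-tagged tokens into labeled entity spans."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    entities: List[Entity] = []
    start, label = None, None
    # A trailing "O" sentinel flushes a span that runs to the end of the clause.
    for i, tag in enumerate(tags + ["O"]):
        inside = tag.startswith("I-") and label == tag[2:]
        if not inside:
            if label is not None:
                entities.append((label, start, i, " ".join(tokens[start:i])))
                start, label = None, None
            if tag.startswith("B-") or tag.startswith("I-"):
                # A stray I- tag with no matching B- is treated as a span start.
                start, label = i, tag[2:]
    return entities
```

For a made-up clause tokenized as `["backfill", "the", "foundation", "pit", "after", "waterproofing", "acceptance"]` with tags `["B-PROC", "O", "B-OBJ", "I-OBJ", "O", "B-PROC", "I-PROC"]`, the function returns one `PROC` span for "backfill", one `OBJ` span for "foundation pit", and one `PROC` span for "waterproofing acceptance".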

There are three main contributions of our work. (1) Existing research efforts in regulation information extraction mainly focus on quantitative knowledge, and little attention has been given to qualitative knowledge. This research expands regulation information extraction to qualitative construction procedural constraints. (2) The proposed approach expands current regulation information extraction by incorporating deep learning. Unlike most of the existing IE efforts in the construction domain, which apply rule-based methods, a hybrid deep learning approach is proposed to extract procedural constraint knowledge from construction regulations without complex handcrafted feature engineering. To the best of the authors’ knowledge, this work is the first attempt to apply recurrent neural networks (more specifically, LSTM) to regulation information extraction, considering the long-distance dependency relationships between entities. (3) Unlike rule-based methods, this research treats regulation information extraction as a process of relation classification. This study can be considered one of the early explorations of construction procedure relation classification.

Section snippets

Background and related research

IE is an automatic process for analyzing and identifying specified types of knowledge (e.g., concepts, events, and relations) in natural language texts, and recording them in a structured form [11], [12]. Currently, there are two main IE methods: (1) the rule-based method and (2) the machine learning-based method.

Constraint patterns of construction procedural constraints

In the construction industry, the construction procedure refers to the tasks that make up the whole construction process, carried out in a specific working sequence. In practice, construction activities need to be implemented according to specified procedures, which is one of the key considerations in construction quality compliance checking. Construction procedural constraints are used to regulate the sequence of activities to be performed (e.g. whether a construction activity should (or not) be

The hybrid deep learning approach for the automatic extraction of construction procedural constraints

As a sequential pattern mining task, the extraction of construction procedural constraints is treated in two steps: named entity recognition and pattern recognition (i.e. the classification of relations between entities). (1) As one of the classic NLP tasks, named entity recognition is traditionally conducted based on linear statistical models (e.g., HMM and CRF) [37], [38]. However, the named entity recognition performance of the above models relies heavily on hand-crafted features
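For linear-chain models such as the HMM and CRF mentioned above, the best tag sequence is commonly found with Viterbi decoding. The following is a minimal pure-Python sketch of that algorithm, given per-position tag scores and tag-to-tag transition scores. It is a generic illustration, not the paper's implementation: the function name `viterbi`, the score layout, and the assumption that the scores are additive (log-space) are choices made here.

```python
from typing import List, Sequence


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence for a linear chain.

    emissions[t][j]:   score of tag j at position t (e.g. from an encoder).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    n_tags = len(emissions[0])
    # score[j] is the best score of any path that ends in tag j so far.
    score = list(emissions[0])
    # backptr[t - 1][j] is the best previous tag when position t has tag j.
    backptr: List[List[int]] = []

    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backptr.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

The transition scores are what allow the decoder to rule out invalid sequences. With three tags ordered as O, B, I and a strongly negative score for the O-to-I transition, the decoder prefers B followed by I even when O has the highest score at the first position on its own.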

Implementation of the hybrid deep learning approach

To assess the capability of the hybrid deep learning approach, model training and testing were conducted. The implementation procedure is presented as follows.

Conclusions

Extraction of construction procedural constraints from construction regulations is critical for simplifying regulation lookup and learning, supporting regulation-based construction quality management systems in creating or augmenting structured knowledge bases, and thereby facilitating the quality inspection process. Although existing supervised or unsupervised learning algorithms have offered ways for regulation knowledge modeling, they extensively rely on specific knowledge sources or

Declaration of Competing Interest

We declare that we have no financial or personal relationships with other people or organizations that can inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled “Deep learning-based Extraction of Construction Procedural Constraints from Construction Regulations”.

Acknowledgements

This research is partly supported by the National Natural Science Foundation of China (No. 51878311, No. 71732001, No. 71301059, No. 51978302). In addition, this research is supported by the Department of Building and Real Estate of The Hong Kong Polytechnic University and the General Research Fund (GRF) Grants (BRE/PolyU 152099/18E) and (BRE/PolyU 152047/19E).

References (58)

  • L.Y. Ding et al., A Deep hybrid learning model to detect unsafe behavior: integrating convolution neural networks and long short-term memory, Autom. Constr. (2018)
  • S. Zheng et al., Joint entity and relation extraction based on a hybrid neural network, Neurocomputing (2017)
  • W. Fang et al., Automated detection of workers and heavy equipment on construction sites: a convolutional neural network approach, Adv. Eng. Inform. (2018)
  • P. Wang et al., Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification, Neurocomputing (2016)
  • E.A. Cudney et al., Quality function deployment implementation in construction: a systematic literature review, Front. Eng. Manage. (2016)
  • M.G. Battikha, Qualicon: computer-based system for construction quality management, J. Constr. Eng. Manage. (2002)
  • Y.S. Kim et al., A PDA and wireless web-integrated system for quality inspection and defect management of apartment housing projects, Autom. Constr. (2008)
  • S. Demir et al., A semantic web-based approach for representing and reasoning with vocabulary for computer based standards processing, The Proceedings of the 2010 International Conference on Computing in Civil and Building Engineering (ICCCBE'10), Nottingham, UK, June 30–July 2, 2010 (2010)
  • J.S. Zhang et al., Information transformation and automated reasoning for automated compliance checking in construction
  • J.S. Zhang et al., Semantic NLP-based IE from construction regulatory documents for automated compliance checking, J. Comput. Civil Eng. (2016)
  • J.S. Zhang et al., Integrating semantic NLP and logic reasoning into a unified system for fully-automated code checking, Autom. Constr. (2016)
  • J. Hobbs, E. Riloff, “IE.” Handbook of Natural Language Processing, second ed., Taylor & Francis Group, Boca Raton,...
  • D. Wimalasuriya et al., Ontology-based IE: an introduction and a survey of current approaches, J. Inform. Sci. Eng. (2010)
  • L. Zhang et al., Aspect and Entity Extraction for Opinion Mining. Data Mining and Knowledge Discovery for Big Data (2014)
  • A. Kayed et al., Using ontologies to index conceptual structures for teaching automation, Austr. Comput. Sci. Commun. (2002)
  • J. Makki et al., Ontology population via NLP techniques in risk management, Int. J. Hum. Soc. Sci. (2009)
  • Y. Abuzir et al., Constructing the civil engineering thesaurus (cet) using theswb, Sci. Educ. (2002)
  • A. Lavelli et al., Evaluation of machine learning-based IE algorithms: criticisms and recommendations, Langu. Resour. Eval. (2008)
  • D. Freitag, Machine learning for IE in informal domains, Mach. Learn. (2000)