Full length article
Deep learning-based extraction of construction procedural constraints from construction regulations
Introduction
Monitoring the construction process against the procedures specified in regulatory documents is important for assuring construction quality [1]. This process, called regulation-based construction quality compliance checking, is domain-knowledge intensive. Prior research has developed regulation-based construction quality management systems to facilitate the checking process and thereby help reduce quality inspection errors and regulatory violations. Such research includes: a proposal for a system for construction quality management (QUALICON), integrating different computer-aided project management functions [2]; a computerized system applying PDAs and wireless internet for quality inspection and defect management (QIDMS) [3]; and a construction quality control system based on 4D BIM and the process, organization, and product (POP) model [4]. For these systems, construction procedural constraint knowledge needs to be extracted from construction regulations and presented in a formal, machine-understandable format [5], [6]. Traditionally, these tasks are completed manually and rely on domain expertise, which is costly and time-consuming. Thus, how to extract construction procedural constraint knowledge automatically from construction regulations remains a challenge.
Limited research effort has been directed at the information extraction (IE) of construction regulatory clauses, and the existing work mainly focuses on rule-based methods. Developing extraction rules is labor-intensive and time-consuming, and the rules are difficult to reuse because they are heavily domain-dependent and defined for specific regulatory documents. Moreover, these efforts involve heavy feature engineering, including exploring lexical, semantic, syntactic, general-domain and domain-specific ontology features [7], [8], [9], [10]. This heuristic, manual feature extraction process uses human-specified shallow features and is limited by human domain knowledge. Recent developments in deep neural networks, specifically recurrent neural networks (RNNs), have presented new opportunities to model sequential time-series data with recurrent lateral connections. The motivation of this research is to design a variant of RNN that uses only word embedding features, with no complex manual feature engineering, for the automatic extraction of construction procedural constraints.
A construction procedural constraint commonly includes construction procedures, construction objects, and the temporal relationships between activities. The knowledge expression of construction procedural constraints in Chinese regulations has some typical characteristics, such as the diverse syntactic constructions of construction regulations and the complicated and ambiguous domain entities involved in a sentence without pause symbols. Given the research objective and these characteristics, the semantic representations of the entities involved in a regulation clause should be considered in sufficient context. In particular, long-distance dependencies between entities in a regulation clause are critical for identifying the domain-specific entities related to construction procedural constraints and for classifying the relations among them as stated in the construction regulations.
In this research, a hybrid deep learning approach is proposed for extracting construction procedural constraint knowledge automatically from construction regulations. The approach is divided into two steps: named entity recognition and pattern recognition (i.e., the classification of relations between entities). In named entity recognition, the Bi-LSTM-CRF (bidirectional long short-term memory and conditional random field) model is applied to identify and label entities (i.e., construction procedures, construction objects, and time intervals) in clauses. In pattern recognition, the LSTM-MLP (long short-term memory and multilayer perceptron) model is used to identify patterns of construction procedural constraints. Further, the temporal relationships of specific construction procedures linked to the constraints can be extracted and expressed in a structured form. To implement the proposed hybrid deep learning approach, clauses are prepared from 14 types of national standards defined in the Code for Acceptance of Construction Quality in China. Model testing results support the effectiveness of the proposed approach in the automatic extraction of construction procedural constraints. Additionally, the proposed hybrid deep neural network can be generalized to other regulation information extraction tasks in similar fields.
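To make the two-step output concrete, the following is a minimal sketch of the post-processing around the two models, not the authors' implementation. It assumes a BIO tagging scheme with hypothetical labels `PROC` (construction procedure), `OBJ` (construction object), and `TIME` (time interval), and a hypothetical pattern label `AFTER`; the excerpt does not show the paper's actual label set or structured form. The tag sequence stands in for the output of a sequence labeller such as Bi-LSTM-CRF, and the pattern label stands in for the output of a relation classifier such as LSTM-MLP. The clause is an invented English example; for Chinese text the tokens would be characters or segmented words and would be joined without spaces.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    label: str   # hypothetical labels: PROC, OBJ, TIME
    text: str
    start: int   # index of the first token in the span
    end: int     # index one past the last token in the span


@dataclass
class Constraint:
    pattern: str                  # hypothetical relation label, e.g. "AFTER"
    procedures: List[str]
    time_interval: Optional[str]


def decode_bio(tokens: List[str], tags: List[str]) -> List[Entity]:
    """Step 1 post-processing: group BIO tags into entity spans."""
    entities: List[Entity] = []
    start, label = None, None
    # A trailing "O" sentinel closes a span that runs to the end of the clause.
    for i, tag in enumerate(tags + ["O"]):
        continues_span = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues_span:
            entities.append(Entity(label, " ".join(tokens[start:i]), start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return entities


def to_constraint(entities: List[Entity], pattern: str) -> Constraint:
    """Step 2 post-processing: combine entities and a predicted pattern."""
    procedures = [e.text for e in entities if e.label == "PROC"]
    times = [e.text for e in entities if e.label == "TIME"]
    return Constraint(pattern, procedures, times[0] if times else None)


# Invented example clause; the tags imitate a sequence labeller's output.
tokens = ["formwork", "removal", "after", "concrete", "curing", "for", "7", "days"]
tags = ["B-PROC", "I-PROC", "O", "B-PROC", "I-PROC", "O", "B-TIME", "I-TIME"]

entities = decode_bio(tokens, tags)
constraint = to_constraint(entities, pattern="AFTER")
print(constraint)
```

Keeping span decoding separate from constraint assembly mirrors the two-step split described above: the entity step can be evaluated on its own, and the pattern label can be swapped without touching the tagging logic.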
There are three main contributions of our work. (1) Existing research efforts in regulation information extraction mainly focus on quantitative knowledge, and little attention has been given to qualitative knowledge. This research expands regulation information extraction to qualitative construction procedural constraints. (2) The proposed approach extends current regulation information extraction by incorporating deep learning. Unlike most existing IE efforts in the construction domain, which apply rule-based methods, a hybrid deep learning approach is proposed to extract procedural constraint knowledge from construction regulations without complex handcrafted feature engineering. To the best of the authors’ knowledge, this work is the first attempt to apply recurrent neural networks (more specifically, LSTM) to regulation information extraction, considering the long-distance dependency relationships between entities. (3) Unlike rule-based methods, this research treats regulation information extraction as a process of relation classification. This study can be considered one of the early explorations of construction procedure relation classification.
Section snippets
Background and related research
IE is an automatic process for analyzing and identifying specified types of knowledge (e.g., concepts, events, and relations) in natural language texts, and recording them in a structured form [11], [12]. Currently, there are two main IE methods: (1) the rule-based method and (2) the machine learning-based method.
Constraint patterns of construction procedural constraints
In the construction industry, the construction procedure refers to the tasks that make up the whole construction process, performed in a specific working sequence. In practice, construction activities need to be implemented according to specified procedures, which is one of the key considerations in construction quality compliance checking. Construction procedural constraints are used to regulate the sequence of activities to be performed (e.g. whether a construction activity should (or not) be
The hybrid deep learning approach for the automatic extraction of construction procedural constraints
As a sequential pattern mining task, the extraction of construction procedural constraints is treated in two steps: named entity recognition and pattern recognition (i.e., the classification of relations between entities). (1) As one of the classic NLP tasks, named entity recognition is traditionally conducted with linear statistical models (e.g., HMM and CRF) [37], [38]. However, the named entity recognition performance of these models relies heavily on hand-crafted features
Implementation of the hybrid deep learning approach
To assess the capability of the hybrid deep learning approach, model training and testing were conducted. The implementation procedure is presented as follows.
Conclusions
Extraction of construction procedural constraints from construction regulations is critical for simplifying regulation lookup and learning, supporting regulation-based construction quality management systems in creating or augmenting structured knowledge bases, and thereby facilitating the quality inspection process. Although existing supervised or unsupervised learning algorithms have offered ways for regulation knowledge modeling, they extensively rely on specific knowledge sources or
Declaration of Competing Interest
We declare that we have no financial or personal relationships with other people or organizations that can inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled “Deep learning-based Extraction of Construction Procedural Constraints from Construction Regulations”.
Acknowledgements
This research is partly supported by the National Natural Science Foundation of China (No. 51878311, No. 71732001, No. 71301059, No. 51978302). In addition, this research is supported by the Department of Building and Real Estate of The Hong Kong Polytechnic University and by the General Research Fund (GRF) Grants (BRE/PolyU 152099/18E) and (BRE/PolyU 152047/19E).
References (58)
- et al., A BIM-based construction quality management model and its applications, Autom. Constr. (2014)
- et al., Ontology-based semantic modeling of regulation constraint for automated construction quality compliance checking, Autom. Constr. (2012)
- et al., Ontology-based automated IE from building energy conservation codes, Autom. Constr. (2017)
- et al., Using linguistic features to automatically extract web page title, Expert Syst. Appl. (2017)
- et al., Ontology-based semi-supervised conditional random fields for automated IE from bridge inspection reports, Autom. Constr. (2017)
- Recurrent neural networks for classifying relations in clinical notes, J. Biomed. Inform. (2017)
- et al., Convolutional neural network: Deep learning-based classification of building quality problems, Adv. Eng. Inf. (2019)
- et al., A deep learning-based approach for mitigating falls from height with computer vision: Convolutional neural network, Adv. Eng. Inf. (2019)
- et al., Falls from heights: a computer vision-based approach for safety harness detection, Autom. Constr. (2018)
- et al., Convolutional neural networks: computer vision-based workforce activity assessment in construction, Autom. Constr. (2018)
- A deep hybrid learning model to detect unsafe behavior: integrating convolution neural networks and long short-term memory, Autom. Constr.
- Joint entity and relation extraction based on a hybrid neural network, Neurocomputing
- Automated detection of workers and heavy equipment on construction sites: a convolutional neural network approach, Adv. Eng. Inform.
- Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification, Neurocomputing
- Quality function deployment implementation in construction: a systematic literature review, Front. Eng. Manage.
- Qualicon: computer-based system for construction quality management, J. Constr. Eng. Manage.
- A PDA and wireless web-integrated system for quality inspection and defect management of apartment housing projects, Autom. Constr.
- A semantic web-based approach for representing and reasoning with vocabulary for computer based standards processing, The Proceedings of the 2010 International Conference on Computing in Civil and Building Engineering (ICCCBE'10), Nottingham, UK, June 30–July 2, 2010
- Information transformation and automated reasoning for automated compliance checking in construction
- Semantic NLP-based IE from construction regulatory documents for automated compliance checking, J. Comput. Civil Eng.
- Integrating semantic NLP and logic reasoning into a unified system for fully-automated code checking, Autom. Constr.
- Ontology-based IE: an introduction and a survey of current approaches, J. Inform. Sci. Eng.
- Aspect and Entity Extraction for Opinion Mining. Data Mining and Knowledge Discovery for Big Data
- Using ontologies to index conceptual structures for teaching automation, Austr. Comput. Sci. Commun.
- Ontology population via NLP techniques in risk management, Int. J. Hum. Soc. Sci.
- Constructing the civil engineering thesaurus (CET) using theswb, Sci. Educ.
- Evaluation of machine learning-based IE algorithms: criticisms and recommendations, Langu. Resour. Eval.
- Machine learning for IE in informal domains, Mach. Learn.