
Identifying business information through deep learning: analyzing the tender documents of an Internet-based logistics bidding platform

Ying Yu (Nanjing University of Aeronautics and Astronautics College of Economics and Management, Nanjing, China) (Department of Economics and Management, Keyi College of Zhejiang Sci-Tech University, Shaoxing, China)
Jing Ma (Nanjing University of Aeronautics and Astronautics College of Economics and Management, Nanjing, China)

Data Technologies and Applications

ISSN: 2514-9288

Article publication date: 5 May 2023

Issue publication date: 29 January 2024


Abstract

Purpose

Tender documents, an essential data source for internet-based logistics tendering platforms, contain large volumes of fine-grained data, including information on the tenderee, shipping locations and shipping items. Automated information extraction in this area is, however, under-researched, which leaves the extraction process time-consuming and labor-intensive. For Chinese logistics tender entities in particular, existing named entity recognition (NER) solutions are mostly unsuitable because these entities involve domain-specific terminology and have distinct semantic features.

Design/methodology/approach

To tackle this problem, this paper proposes a novel lattice long short-term memory (LSTM) model that combines a variant contextual feature representation with a conditional random field (CRF) layer to identify valuable entities in logistics tender documents. Instead of traditional word embeddings, the proposed model takes representations from the pretrained Bidirectional Encoder Representations from Transformers (BERT) model as input to augment the contextual feature representation. The Lattice-LSTM model then uses both character and word information to avoid segmentation errors.
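As a rough illustration of the word information a lattice model works with, the sketch below finds every lexicon word that matches a span of a character sequence, so that overlapping candidate words can be considered alongside individual characters instead of committing to one segmentation. It is a minimal sketch, not the authors' implementation: the function name `match_lattice_words`, the `max_len` parameter, the toy lexicon and the example sentence are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def match_lattice_words(
    chars: str, lexicon: Set[str], max_len: int = 4
) -> List[Tuple[int, int, str]]:
    """Return (start, end, word) for each lexicon word found in chars.

    `end` is exclusive. Hypothetical helper for illustration only.
    """
    spans = []
    for start in range(len(chars)):
        # Single characters are already covered by the character sequence,
        # so candidate words start at length 2.
        for length in range(2, max_len + 1):
            end = start + length
            if end > len(chars):
                break
            word = chars[start:end]
            if word in lexicon:
                spans.append((start, end, word))
    return spans


# Toy lexicon and sentence (assumed, not taken from the tender corpus).
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
sentence = "南京市长江大桥"
spans = match_lattice_words(sentence, lexicon)
for start, end, word in spans:
    print(start, end, word)
```

The overlapping matches (for example, a word ending at a character where another word begins mid-span) are the ambiguity that a single hard segmentation would have to resolve up front. Keeping all of them as candidates is what lets a lattice-style model defer that choice to the learned network.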

Findings

The proposed model is verified on a Chinese logistics tender named entity corpus. The results suggest that it outperforms other mainstream NER models on this corpus. The proposed model underpins the automatic extraction of logistics tender information, enabling logistics companies to track ever-changing market trends and make far-sighted logistics decisions.

Originality/value

(1) A practical model for logistics tender NER is proposed. By fine-tuning BERT for the downstream task with a small amount of data, the model performs better than other existing models in the experiments. This is the first study, to the best of the authors' knowledge, to extract named entities from Chinese logistics tender documents. (2) A real logistics tender corpus is constructed for practical use, and a program implementing the model is developed for online processing of real logistics tender documents. The authors believe that the model will help logistics companies convert unstructured documents into structured data and, in turn, track ever-changing market trends to make far-sighted logistics decisions.
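To make the step from tagged text to structured data concrete, the sketch below groups per-character NER tags into typed entities. It assumes a BIO tagging scheme and the label names `ITEM` and `LOC`, neither of which is stated in the abstract. The function name `bio_to_entities` and the example sentence are likewise hypothetical.

```python
from typing import List, Tuple


def bio_to_entities(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged characters into (entity_type, text) pairs.

    Assumes tags of the form "B-X", "I-X" or "O" (an assumed scheme).
    """
    entities = []
    current_type, current_chars = None, []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity that is still open.
            if current_type is not None:
                entities.append((current_type, "".join(current_chars)))
            current_type, current_chars = tag[2:], [ch]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open entity.
            current_chars.append(ch)
        else:
            # "O" or an inconsistent "I-" tag: close the open entity, if any,
            # and drop the current character.
            if current_type is not None:
                entities.append((current_type, "".join(current_chars)))
            current_type, current_chars = None, []
    if current_type is not None:
        entities.append((current_type, "".join(current_chars)))
    return entities


# Hypothetical tagged sentence: "steel shipped to Shanghai".
chars = list("钢材运往上海")
tags = ["B-ITEM", "I-ITEM", "O", "O", "B-LOC", "I-LOC"]
entities = bio_to_entities(chars, tags)
print(entities)
```

Each resulting pair can then be written to a database row or a JSON record, which is one plausible form the structured output could take.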


Acknowledgements

The authors thank the anonymous reviewers for their helpful suggestions.

Funding: This research was supported by the National Natural Science Foundation of China (No. 72174086).

Conflict of interest: The corresponding author declares on behalf of all authors that there is no conflict of interest.

Citation

Yu, Y. and Ma, J. (2024), "Identifying business information through deep learning: analyzing the tender documents of an Internet-based logistics bidding platform", Data Technologies and Applications, Vol. 58 No. 1, pp. 42-61. https://doi.org/10.1108/DTA-08-2022-0308

Publisher: Emerald Publishing Limited

Copyright © 2023, Emerald Publishing Limited
