poster

A transfer learning approach to predict shipment description quality

Published: 06 May 2022

Abstract

International shipments always have a Harmonized System code (HSCode) associated with them, which determines the tariff for the customs declaration. The HSCode is derived from the shipment description that the customer provides, so the quality of that description matters for assigning the correct code. When the description is too generic or incomplete, the logistics company has to contact the customer to find out the content of the shipment. Because there is no effective way to assess description quality, we developed a description quality evaluation model based on deep learning combined with domain knowledge. On a data set of 2000 shipments scored by experts from 0 to 4, where 4 represents the best possible quality, the developed model classifies 45.17% of the data correctly and a further 43.95% with a difference of one score from the human-annotated ground truth (e.g., predicting label 1 as 2 or 0). The model can be used for historical data analysis and, potentially, for giving customers on-site feedback when they provide a poor description of the shipment content.
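
The two figures reported above add up to 45.17% + 43.95% = 89.12% of descriptions scored within one point of the expert label. As a minimal sketch of how such exact and off-by-one rates could be computed for ordinal labels, the snippet below uses a hypothetical helper, `exact_and_off_by_one`, and made-up toy labels; neither comes from the paper, and the authors' actual evaluation code may differ.

```python
from typing import Sequence, Tuple


def exact_and_off_by_one(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (share predicted exactly, share off by exactly one score)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    # Absolute distance between expert score and predicted score.
    diffs = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(diffs)
    exact = sum(d == 0 for d in diffs) / n
    off_by_one = sum(d == 1 for d in diffs) / n
    return exact, off_by_one


# Hypothetical toy labels on the 0-4 scale (not data from the paper).
expert = [0, 1, 2, 3, 4, 2, 1, 4]
model = [0, 2, 2, 1, 4, 3, 1, 3]
exact, off_by_one = exact_and_off_by_one(expert, model)
print(f"exact: {exact:.2%}, off by one: {off_by_one:.2%}")
```

Reporting the off-by-one share separately, rather than folding it into a single accuracy number, keeps the ordinal nature of the 0 to 4 scale visible: a prediction of 2 for a true 1 is a smaller error than a prediction of 4.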

Published In
SAC '22: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing
April 2022
2099 pages
ISBN: 9781450387132
DOI: 10.1145/3477314
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. automated text scoring
  2. description quality
  3. logistics

Qualifiers

  • Poster

Conference

SAC '22

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
