
What Is the Intended Usage Context of This Model? An Exploratory Study of Pre-Trained Models on Various Model Repositories

Published: 03 May 2023

Abstract

Researchers and practitioners increasingly apply pre-trained models directly to solve their specific tasks. For example, researchers in software engineering (SE) have successfully exploited pre-trained language models to automatically generate source code and comments. However, domain gaps exist between benchmark datasets: a data-driven (or machine learning based) model trained on one benchmark may not operate smoothly on another. The reuse of pre-trained models therefore incurs large costs and the additional problem of checking whether an arbitrary pre-trained model is suitable for task-specific reuse. In SE, engineers leverage code contracts to maximize the reuse of existing software components and services. By analogy with software reuse in the SE field, contract-based reuse can be extended to pre-trained models. Therefore, following the guidance that model cards and FactSheets give suppliers of pre-trained models on what information they should publish, we propose model contracts, comprising the pre- and post-conditions of pre-trained models, to enable better model reuse. Furthermore, although many pre-trained models are readily available on model repositories, many non-trivial yet challenging issues around their reuse have not been fully investigated. Based on our model contract, we conduct an exploratory study of 1,908 pre-trained models on six mainstream model repositories (i.e., the TensorFlow Hub, PyTorch Hub, Model Zoo, Wolfram Neural Net Repository, Nvidia, and Hugging Face) to investigate the gap between the necessary pre/post-condition information and the actual specifications.
Our results clearly show that (1) the model repositories tend to provide confusing information about pre-trained models, especially about the task type, model, and training set, and (2) the model repositories cannot provide all of our proposed pre/post-condition information, especially the intended use, limitations, performance, and quantitative analysis. On the basis of these findings, we suggest that (1) the developers of model repositories provide the necessary options (e.g., training dataset, model algorithm, and performance measures) for each pre/post-condition of pre-trained models in each task type, (2) future researchers and practitioners develop more effective metrics to recommend suitable pre-trained models, and (3) the suppliers of pre-trained models report their models in strict accordance with our proposed pre/post-conditions and with the characteristics of each condition as reported in the model repositories.
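The model-contract idea above can be pictured as a lightweight pre/post-condition check around model reuse. The following is a minimal Python sketch of that idea, under our own assumptions: the `ModelContract` class, its field names, and the example values are illustrative only, not an API of the paper or of any model repository.

```python
from dataclasses import dataclass, field

@dataclass
class ModelContract:
    """Hypothetical contract for a pre-trained model.

    Pre-conditions describe what a reuser must match before adopting the
    model; post-conditions describe what the supplier promises after reuse.
    The fields mirror the pre/post-condition information discussed in the
    abstract (task type, training set, intended use, limitations,
    performance), not any repository's real metadata schema.
    """
    # Pre-conditions: is this model applicable to the reuser's context?
    task_type: str
    training_dataset: str
    intended_use: str
    limitations: list = field(default_factory=list)
    # Post-conditions: what performance does the supplier promise?
    promised_metrics: dict = field(default_factory=dict)

def precondition_holds(contract, task_type, data_domain):
    """A model is a reuse candidate only if the task matches and the
    reuser's data domain is not listed as a known limitation."""
    return (contract.task_type == task_type
            and data_domain not in contract.limitations)

def postcondition_holds(contract, measured, tolerance=0.05):
    """After reuse, every measured metric should stay within `tolerance`
    of the value the supplier promised."""
    return all(measured.get(name, 0.0) >= promised - tolerance
               for name, promised in contract.promised_metrics.items())

# Example: a fictional image-classification model card turned into a contract.
contract = ModelContract(
    task_type="image-classification",
    training_dataset="ImageNet-1k",
    intended_use="natural-image classification",
    limitations=["medical-imaging"],
    promised_metrics={"top1_accuracy": 0.76},
)
assert precondition_holds(contract, "image-classification", "natural-images")
assert not precondition_holds(contract, "image-classification", "medical-imaging")
assert postcondition_holds(contract, {"top1_accuracy": 0.74})
```

Under this framing, the study's gap analysis amounts to asking which of these fields a repository's model pages actually let a supplier fill in.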


Cited By

  • (2024) Enhancing Software Effort Estimation with Pre-Trained Word Embeddings: A Small-Dataset Solution for Accurate Story Point Prediction. Electronics 13, 23 (2024), Article 4843. DOI: 10.3390/electronics13234843
  • (2024) Large Language Models for Software Engineering: A Systematic Literature Review. ACM Transactions on Software Engineering and Methodology 33, 8 (2024), 1–79. DOI: 10.1145/3695988
  • (2024) What Do We Know about Hugging Face? A Systematic Literature Review and Quantitative Validation of Qualitative Claims. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. 13–24. DOI: 10.1145/3674805.3686665
  • (2024) Generative Information Systems Are Great If You Can Read. In Proceedings of the 2024 Conference on Human Information Interaction and Retrieval. 165–177. DOI: 10.1145/3627508.3638345
  • (2023) BTLink: Automatic Link Recovery between Issues and Commits Based on Pre-Trained BERT Model. Empirical Software Engineering 28, 4 (2023). DOI: 10.1007/s10664-023-10342-7


      Published In

ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 3
      May 2023, 937 pages
      ISSN: 1049-331X
      EISSN: 1557-7392
      DOI: 10.1145/3594533
      Editor: Mauro Pezzè

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 03 May 2023
      Online AM: 29 October 2022
      Accepted: 06 October 2022
      Revised: 02 October 2022
      Received: 01 March 2022
      Published in TOSEM Volume 32, Issue 3

      Author Tags

      1. Software engineering for artificial intelligence
      2. pre-trained models
      3. model reuse
      4. model contract

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • Natural Science Foundation of Jiangsu Province, China
      • Foundation of the Key National Laboratory of New Technology in Computer Software (Nanjing University)
      • Foundation of the Key Laboratory of Safety-Critical Software

