
What Is the Intended Usage Context of This Model? An Exploratory Study of Pre-Trained Models on Various Model Repositories

Published: 03 May 2023

Abstract

Researchers and practitioners increasingly apply pre-trained models directly to solve their specific tasks. For example, researchers in software engineering (SE) have successfully exploited pre-trained language models to automatically generate source code and comments. However, domain gaps exist between benchmark datasets: a data-driven (or machine learning based) model trained on one benchmark may not operate smoothly on another. The reuse of pre-trained models therefore incurs large costs and the additional problem of checking whether an arbitrary pre-trained model is suitable for task-specific reuse. In SE, engineers leverage code contracts to maximize the reuse of existing software components and services. By analogy with software reuse in the SE field, contract-based reuse can be extended to pre-trained models. Therefore, following the guidance that model cards and FactSheets give suppliers of pre-trained models on what information they should publish, we propose model contracts, comprising the pre- and post-conditions of pre-trained models, to enable better model reuse. Furthermore, although many pre-trained models are readily available on model repositories, many non-trivial yet challenging issues around their reuse have not been fully investigated. Based on our model contract, we conduct an exploratory study of 1,908 pre-trained models on six mainstream model repositories (i.e., the TensorFlow Hub, PyTorch Hub, Model Zoo, Wolfram Neural Net Repository, Nvidia, and Hugging Face) to investigate the gap between the necessary pre/post-condition information and the actual specifications.
Our results clearly show that (1) the model repositories tend to provide confusing information about pre-trained models, especially about the task type, model, and training set, and (2) the model repositories cannot provide all of our proposed pre/post-condition information, especially the intended use, limitations, performance, and quantitative analysis. On the basis of these findings, we suggest that (1) the developers of model repositories provide the necessary options (e.g., training dataset, model algorithm, and performance measures) for each pre/post-condition of pre-trained models in each task type, (2) future researchers and practitioners develop more effective metrics to recommend suitable pre-trained models, and (3) the suppliers of pre-trained models report their models in strict accordance with our proposed pre/post-conditions and with the characteristics of each condition as reported in the model repositories.
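The model-contract idea above can be pictured as a lightweight pre/post-condition check around model reuse. The following is a minimal Python sketch of that idea, under our own assumptions: the `ModelContract` class, its field names, and the example values are illustrative only, not an API of the paper or of any model repository.

```python
from dataclasses import dataclass, field

@dataclass
class ModelContract:
    """Hypothetical contract for a pre-trained model.

    Pre-conditions describe what a reuser must match before adopting the
    model; post-conditions describe what the supplier promises after reuse.
    The fields mirror the pre/post-condition information discussed in the
    abstract (task type, training set, intended use, limitations,
    performance), not any repository's real metadata schema.
    """
    # Pre-conditions: is this model applicable to the reuser's context?
    task_type: str
    training_dataset: str
    intended_use: str
    limitations: list = field(default_factory=list)
    # Post-conditions: what performance does the supplier promise?
    promised_metrics: dict = field(default_factory=dict)

def precondition_holds(contract, task_type, data_domain):
    """A model is a reuse candidate only if the task matches and the
    reuser's data domain is not listed as a known limitation."""
    return (contract.task_type == task_type
            and data_domain not in contract.limitations)

def postcondition_holds(contract, measured, tolerance=0.05):
    """After reuse, every measured metric should stay within `tolerance`
    of the value the supplier promised."""
    return all(measured.get(name, 0.0) >= promised - tolerance
               for name, promised in contract.promised_metrics.items())

# Example: a fictional image-classification model card turned into a contract.
contract = ModelContract(
    task_type="image-classification",
    training_dataset="ImageNet-1k",
    intended_use="natural-image classification",
    limitations=["medical-imaging"],
    promised_metrics={"top1_accuracy": 0.76},
)
assert precondition_holds(contract, "image-classification", "natural-images")
assert not precondition_holds(contract, "image-classification", "medical-imaging")
assert postcondition_holds(contract, {"top1_accuracy": 0.74})
```

Under this framing, the study's gap analysis amounts to asking which of these fields a repository's model pages actually let a supplier fill in.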


Cited By

  • (2024) Enhancing Software Effort Estimation with Pre-Trained Word Embeddings: A Small-Dataset Solution for Accurate Story Point Prediction. Electronics 13, 23 (2024), Article 4843. DOI: 10.3390/electronics13234843
  • (2024) Large Language Models for Software Engineering: A Systematic Literature Review. ACM Transactions on Software Engineering and Methodology 33, 8 (2024), 1–79. DOI: 10.1145/3695988
  • (2024) What Do We Know about Hugging Face? A Systematic Literature Review and Quantitative Validation of Qualitative Claims. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. 13–24. DOI: 10.1145/3674805.3686665
  • (2024) Generative Information Systems Are Great If You Can Read. In Proceedings of the 2024 Conference on Human Information Interaction and Retrieval. 165–177. DOI: 10.1145/3627508.3638345
  • (2023) BTLink: Automatic Link Recovery between Issues and Commits Based on Pre-Trained BERT Model. Empirical Software Engineering 28, 4 (2023). DOI: 10.1007/s10664-023-10342-7


      Published In

ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 3
      May 2023, 937 pages
      ISSN: 1049-331X
      EISSN: 1557-7392
      DOI: 10.1145/3594533
      Editor: Mauro Pezzè

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 03 May 2023
      Online AM: 29 October 2022
      Accepted: 06 October 2022
      Revised: 02 October 2022
      Received: 01 March 2022
      Published in TOSEM Volume 32, Issue 3

      Author Tags

      1. Software engineering for artificial intelligence
      2. pre-trained models
      3. model reuse
      4. model contract

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • Natural Science Foundation of Jiangsu Province, China
      • Foundation of the Key National Laboratory of New Technology in Computer Software (Nanjing University)
      • Foundation of the Key Laboratory of Safety-Critical Software

