
AutoSrh: An Embedding Dimensionality Search Framework for Tabular Data Prediction


Abstract:

Prediction over tabular data is a crucial task in many real-life applications. Recent advances in deep learning have given rise to various deep models for tabular data prediction. A common and essential step in these models is to vectorize the raw input features of tabular data into dense embeddings. Choosing a suitable dimension for each feature is challenging, yet necessary to improve the model's performance and reduce the memory cost of its parameters. Existing solutions to embedding dimensionality search always choose dimensions from a restricted candidate set. This restriction improves search efficiency but can produce suboptimal embedding dimensions that hurt the model's predictive performance. In this paper, we develop AutoSrh, a flexible embedding dimensionality search framework that can select varying dimensions for different features through differentiable optimization. The key idea of AutoSrh is to relax the search space to be continuous and optimize the selection of embedding dimensions via gradient descent. After optimization, AutoSrh performs embedding pruning to derive the mixed embedding dimensions and retrains the model to further improve performance. Extensive experiments on five real-world tabular datasets demonstrate that AutoSrh achieves better predictive performance than existing approaches with 1.1x to 1.6x lower training time cost, and preserves the model's predictive performance while reducing embedding parameters by 50% to 95%.
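The relax-then-prune idea described in the abstract can be illustrated with a toy sketch. The abstract does not specify how the continuous relaxation or the pruning rule is defined, so everything below is an assumption made for illustration: one soft weight per embedding dimension, a magnitude threshold for pruning, and hypothetical names (`apply_mask`, `prune_dimensions`, `parameter_reduction`) and values. It is not the authors' implementation, and the gradient-descent step that would learn the weights is omitted.

```python
from typing import Dict, List


def apply_mask(embedding: List[float], weights: List[float]) -> List[float]:
    """Assumed relaxation: scale each embedding dimension by a continuous weight."""
    return [e * w for e, w in zip(embedding, weights)]


def prune_dimensions(
    soft_weights: Dict[str, List[float]], threshold: float
) -> Dict[str, List[int]]:
    """Keep, per feature, the dimensions whose weight magnitude reaches the threshold."""
    return {
        feature: [i for i, w in enumerate(weights) if abs(w) >= threshold]
        for feature, weights in soft_weights.items()
    }


def parameter_reduction(
    kept: Dict[str, List[int]], full_dim: int, vocab_sizes: Dict[str, int]
) -> float:
    """Fraction of embedding parameters removed relative to a uniform full_dim."""
    full = sum(vocab_sizes[f] * full_dim for f in kept)
    pruned = sum(vocab_sizes[f] * len(kept[f]) for f in kept)
    return 1.0 - pruned / full


# Hypothetical soft weights, standing in for values learned by gradient descent.
soft_weights = {
    "user_id": [0.9, 0.7, 0.02, 0.5],
    "city": [0.8, 0.01, 0.0, 0.03],
}
vocab_sizes = {"user_id": 1000, "city": 50}  # hypothetical vocabulary sizes

kept = prune_dimensions(soft_weights, threshold=0.1)
mixed_dims = {feature: len(indices) for feature, indices in kept.items()}
reduction = parameter_reduction(kept, full_dim=4, vocab_sizes=vocab_sizes)

print(mixed_dims)  # {'user_id': 3, 'city': 1}
print(f"embedding parameters removed: {reduction:.1%}")
```

In this toy setting each feature ends up with its own dimension rather than one picked from a shared candidate set, which is the flexibility the abstract emphasizes; a real pipeline would then retrain the model with the pruned embeddings.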
Published in: IEEE Transactions on Knowledge and Data Engineering ( Volume: 35, Issue: 7, 01 July 2023)
Page(s): 6673 - 6686
Date of Publication: 27 June 2022
