Cross-Situational Word Learning in Disentangled Latent Space | IEEE Conference Publication | IEEE Xplore

Cross-Situational Word Learning in Disentangled Latent Space


Abstract:

Cross-situational word learning (CSL) is a fast and efficient method by which humans acquire word meanings. Many studies have replicated human CSL using computational models. Among these, cross-situational learning with a Bayesian probabilistic generative model (CSL-PGM) can estimate word meanings from observations that include multiple attributes, such as color and shape. However, because CSL-PGM receives observations for each attribute on a separate channel, it cannot perform CSL on images containing multiple attributes. Therefore, we introduce a disentangled representation that captures the attributes within an image. We propose CSL+VAE, which integrates CSL-PGM and a β-VAE to obtain a disentangled representation in an unsupervised manner. CSL+VAE can discover attributes hidden in images and word sequences and infer the meanings of words. It can also obtain a more disentangled representation through a learning framework in which both models share parameters. In experiments, the model was trained on a set of images comprising five attributes, each paired with one to five words describing them. The results showed that the model correctly estimated the attribute of 99.9% of the words, as well as the correspondence between image features and words. The proposed model also outperformed existing multimodal models in inferring images from word sequences, achieving an accuracy of 0.870.
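The disentanglement pressure in the β-VAE component comes from up-weighting the KL term of the variational objective. The sketch below is a minimal, hypothetical illustration of that objective for a diagonal Gaussian posterior; the function name, the β value, and the closed-form KL expression are standard β-VAE ingredients, not details taken from the paper's implementation.

```python
import numpy as np

def beta_vae_loss(recon_error, mu, logvar, beta=4.0):
    """Illustrative beta-VAE objective: reconstruction error plus a
    beta-weighted KL term.

    For a diagonal Gaussian posterior N(mu, sigma^2), the KL divergence
    from the standard normal prior N(0, I) has the closed form
        0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1).
    Setting beta > 1 penalizes this term more heavily, pressuring the
    latent dimensions toward independence -- the disentangled
    representation the abstract refers to.
    """
    kl = 0.5 * np.sum(mu**2 + np.exp(logvar) - logvar - 1.0)
    return recon_error + beta * kl

# When the posterior equals the prior (mu = 0, logvar = 0),
# the KL term vanishes and only the reconstruction error remains:
print(beta_vae_loss(10.0, np.zeros(5), np.zeros(5)))  # -> 10.0
```

With β = 1 this reduces to the ordinary VAE evidence lower bound; larger β trades reconstruction fidelity for more factorized latents.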
Date of Conference: 09-11 November 2023
Date Added to IEEE Xplore: 25 December 2023
Conference Location: Macau, China
