Recognizing users’ facial expressions and reflecting them on the faces of their virtual avatars is a key technology for realizing immersive virtual reality (VR)-based metaverse applications. As one way to realize this technology, a facial electromyogram (fEMG)-based facial expression recognition (FER) system, with fEMG electrodes attached to the pad of a VR headset, has recently been proposed. However, the performance of such FER systems deteriorates severely when the locations of the fEMG electrodes change as the VR headset is re-worn, requiring long and tedious calibration sessions every time the user wears the headset. In this study, we developed an fEMG-based FER system that is robust against electrode shifts by employing new signal processing techniques, namely covariate shift adaptation in the feature and classifier domains. To verify the feasibility of the proposed method, fEMG data were recorded while participants repeatedly made 11 facial expressions in four sessions, between which they detached and reattached the fEMG electrodes on their faces. Our experiments showed that classification accuracy dropped from 88% to 79% owing to the change in electrode locations when the proposed method was not applied, whereas the accuracy improved significantly, up to 86%, when the proposed covariate shift adaptation method was employed. The proposed method is expected to enhance the practicality of fEMG-based FER and to promote its use in VR-based metaverse applications.

Accessible. The dataset (BDF format) is available at https://figshare.com/s/bed96c783b4328d4ad1d, together with a data description document (DOC format).
Code availability
Not applicable.
This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MIST) (Nos. 2017-0-00432 & 2020-0-01373).
H. Cha conducted overall data analyses and wrote a major part of the paper. C. Im provided important insight for the design of the paper and revised the manuscript. All authors listed have contributed considerably to this paper and approved the submitted version.
Appendix A
A geodesic is the shortest path between two symmetric positive definite (SPD) matrices on a Riemannian manifold (Yger et al. 2017). The geodesic between \({{\varvec{C}}}_{1}\) and \({{\varvec{C}}}_{2}\) is defined as
\[\gamma \left({{\varvec{C}}}_{1},{{\varvec{C}}}_{2},c\right)={{\varvec{C}}}_{1}^{\frac{1}{2}}{\left({{\varvec{C}}}_{1}^{-\frac{1}{2}}{{\varvec{C}}}_{2}{{\varvec{C}}}_{1}^{-\frac{1}{2}}\right)}^{c}{{\varvec{C}}}_{1}^{\frac{1}{2}} \tag{6}\]
where \(c\in \left[0, 1\right]\). Note that the output of \(\gamma \left({{\varvec{C}}}_{1},{{\varvec{C}}}_{2}, c\right)\) lies between \({{\varvec{C}}}_{1}\) and \({{\varvec{C}}}_{2}\), at a position determined by the constant \(c\). For example, \(\gamma \left({{\varvec{C}}}_{1},{{\varvec{C}}}_{2},c\right)\) is \({{\varvec{C}}}_{1}\) if \(c=0\) and \({{\varvec{C}}}_{2}\) if \(c=1\). If \(c=0.5\), \(\gamma \left({{\varvec{C}}}_{1},{{\varvec{C}}}_{2},c\right)\) is the midpoint between \({{\varvec{C}}}_{1}\) and \({{\varvec{C}}}_{2}\) along the geodesic.
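The endpoint and midpoint behaviour described above can be checked numerically. The following is a minimal NumPy sketch, assuming the standard affine-invariant form of the geodesic; the function names `spd_power` and `geodesic` and the two example matrices are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np

def spd_power(C, p):
    """Raise a symmetric positive definite matrix to a real power p
    using its eigendecomposition C = V diag(w) V^T."""
    w, V = np.linalg.eigh((C + C.T) / 2)
    return (V * w**p) @ V.T

def geodesic(C1, C2, c):
    """Point at fraction c along the geodesic from C1 (c=0) to C2 (c=1)."""
    C1_half = spd_power(C1, 0.5)
    C1_inv_half = spd_power(C1, -0.5)
    inner = C1_inv_half @ C2 @ C1_inv_half  # C2 expressed relative to C1
    return C1_half @ spd_power(inner, c) @ C1_half

# Two arbitrary SPD matrices used only for illustration.
C1 = np.array([[2.0, 0.5], [0.5, 1.0]])
C2 = np.array([[1.0, -0.2], [-0.2, 3.0]])
mid = geodesic(C1, C2, 0.5)
print(mid)
```

Swapping the two arguments and using \(1-c\) in place of \(c\) should return the same point, which is a convenient sanity check for any implementation.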
Appendix B
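The distance between two SPD matrices (Yger et al. 2017) on the Riemannian manifold can be defined as
\[{\delta }_{R}\left({{\varvec{C}}}_{1},{{\varvec{C}}}_{2}\right)=\Vert \mathrm{logm}\left({{\varvec{C}}}_{1}^{-\frac{1}{2}}{{\varvec{C}}}_{2}{{\varvec{C}}}_{1}^{-\frac{1}{2}}\right)\Vert \tag{7}\]
where \(\mathrm{logm}\) is the matrix logarithm and \(\Vert \cdot \Vert\) is the Frobenius norm of a matrix. Equation (7) can be easily computed as \({\left[\sum_{i=1}^{n}{\mathrm{log}}^{2}{\lambda }_{i}\right]}^\frac{1}{2}\), where the \({\lambda }_{i}\) are the real positive eigenvalues of \({{\varvec{C}}}_{1}^{-1}{{\varvec{C}}}_{2}\), which are the same as those of \({{\varvec{C}}}_{1}^{-\frac{1}{2}}{{\varvec{C}}}_{2}{{\varvec{C}}}_{1}^{-\frac{1}{2}}\).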
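As an illustration of the eigenvalue shortcut, the sketch below computes the distance from the eigenvalues of the whitened matrix \({{\varvec{C}}}_{1}^{-\frac{1}{2}}{{\varvec{C}}}_{2}{{\varvec{C}}}_{1}^{-\frac{1}{2}}\), which shares its eigenvalues with \({{\varvec{C}}}_{1}^{-1}{{\varvec{C}}}_{2}\) but is symmetric and therefore numerically better behaved. The function name `riemannian_distance` and the example matrices are assumptions made for this sketch.

```python
import numpy as np

def riemannian_distance(C1, C2):
    """Affine-invariant Riemannian distance between two SPD matrices,
    computed as sqrt(sum_i log(lambda_i)^2)."""
    w, V = np.linalg.eigh((C1 + C1.T) / 2)
    C1_inv_half = (V * w**-0.5) @ V.T            # C1^(-1/2)
    M = C1_inv_half @ C2 @ C1_inv_half           # same eigenvalues as C1^-1 C2
    lam = np.linalg.eigvalsh((M + M.T) / 2)      # real and positive for SPD inputs
    return np.sqrt(np.sum(np.log(lam) ** 2))

# Two arbitrary SPD matrices used only for illustration.
C1 = np.array([[2.0, 0.5], [0.5, 1.0]])
C2 = np.array([[1.0, -0.2], [-0.2, 3.0]])
print(riemannian_distance(C1, C2))
```

The distance is symmetric in its arguments and unchanged when both matrices are transformed as \({\varvec{A}}{\varvec{C}}{{\varvec{A}}}^{T}\) with the same invertible \({\varvec{A}}\).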
Appendix C
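The geometric mean of \(m\) SPD matrices \({{\varvec{C}}}_{1},\dots ,{{\varvec{C}}}_{m}\) is defined as
\[\mathfrak{G}\left({{\varvec{C}}}_{1},\dots ,{{\varvec{C}}}_{m}\right)=\underset{{\varvec{C}}\in C\left(n\right)}{\mathrm{argmin}}\sum_{i=1}^{m}{\delta }_{R}^{2}\left({\varvec{C}},{{\varvec{C}}}_{i}\right) \tag{8}\]
where \(C\left(n\right)\) is the set of all \(n\times n\) SPD matrices and \({\delta }_{R}\) is the Riemannian distance of Appendix B. Equation (8) has no closed-form solution; therefore, an iterative algorithm (Barachant et al. 2010) can be employed instead, which repeatedly averages the matrices in the tangent space at the current estimate and maps the average back onto the manifold until the update becomes negligible.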

Appendix D
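The statistical derivation of LDA classification is as follows. LDA assumes that the data within each class label follow a multivariate normal distribution. The probability density function (pdf) of a feature vector \({{\varvec{x}}}_{i}\), given that its label \({y}_{i}\) is \(k\), can be defined as
\[p\left({\varvec{x}}={{\varvec{x}}}_{i}|y=k\right)=\frac{1}{{\left(2\pi \right)}^{36/2}{\left|\boldsymbol{\Sigma }\right|}^{1/2}}\mathrm{exp}\left(-\frac{1}{2}{\left({{\varvec{x}}}_{i}-{{\varvec{\mu}}}_{k}\right)}^{T}{\boldsymbol{\Sigma }}^{-1}\left({{\varvec{x}}}_{i}-{{\varvec{\mu}}}_{k}\right)\right) \tag{9}\]
where \({{\varvec{\mu}}}_{k}\in {R}^{36}\) is the mean of the feature vectors with label \(k\) and \(\boldsymbol{\Sigma }\in {R}^{36\times 36}\) is a pooled covariance matrix (PCM). \({{\varvec{\mu}}}_{k}\) and \(\boldsymbol{\Sigma }\) can be estimated as
\[{{\varvec{\mu}}}_{k}=\frac{1}{{N}_{k}}\sum_{i:{y}_{i}=k}{{\varvec{x}}}_{i} \tag{10}\]
\[\boldsymbol{\Sigma }=\frac{1}{N-K}\sum_{k=1}^{K}\sum_{i:{y}_{i}=k}\left({{\varvec{x}}}_{i}-{{\varvec{\mu}}}_{k}\right){\left({{\varvec{x}}}_{i}-{{\varvec{\mu}}}_{k}\right)}^{T} \tag{11}\]
where \(N\), \({N}_{k}\), and \(K\) are the total number of feature vectors, the number of feature vectors with label \(k\), and the total number of labels, respectively.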
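The posterior probability of label \(k\) given a feature vector \({{\varvec{x}}}_{i}\) can be written using Bayes’ rule as
\[p\left(y=k|{\varvec{x}}={{\varvec{x}}}_{i}\right)=\frac{p\left({\varvec{x}}={{\varvec{x}}}_{i}|y=k\right)p\left(y=k\right)}{\sum_{l=1}^{K}p\left({\varvec{x}}={{\varvec{x}}}_{i}|y=l\right)p\left(y=l\right)} \tag{12}\]
\(p\left(y=k\right)\) is the prior probability and can be estimated as
\[p\left(y=k\right)=\frac{1}{N}\sum_{i=1}^{N}\delta \left({y}_{i},k\right) \tag{13}\]
where \(\delta \left(i,j\right)=1\) if \(i=j\) and 0 if \(i\ne j\). Let \({\pi }_{k}\) and \({f}_{k}\left({\varvec{x}}\right)\) denote \(p\left(y=k\right)\) and \(p({\varvec{x}}={{\varvec{x}}}_{i}|y=k)\), respectively; then \(p\left(y=k|{\varvec{x}}={{\varvec{x}}}_{i}\right)\propto {\pi }_{k}{f}_{k}({\varvec{x}})\), because the denominator of (12) is the same for every label. Because the logarithm is a monotonically increasing function, the label that maximizes \({\pi }_{k}{f}_{k}({\varvec{x}})\) also maximizes \(\mathrm{log}({\pi }_{k}{f}_{k}\left({\varvec{x}}\right))\). Let \(\mathrm{log}({\pi }_{k}{f}_{k}\left({\varvec{x}}\right))\) be the decision function \({\varphi }_{k}({\varvec{x}})\); then, after dropping the terms that do not depend on \(k\), \({\varphi }_{k}({\varvec{x}})\) can be represented by
\[{\varphi }_{k}\left({\varvec{x}}\right)={{\varvec{x}}}^{T}{\boldsymbol{\Sigma }}^{-1}{{\varvec{\mu}}}_{k}-\frac{1}{2}{{\varvec{\mu}}}_{k}^{T}{\boldsymbol{\Sigma }}^{-1}{{\varvec{\mu}}}_{k}+\mathrm{log}{\pi }_{k} \tag{14}\]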
Finally, the predicted label \({\widehat{y}}_{j}\) for the test data \({{\varvec{x}}}_{j}\) can be estimated as follows:
\[{\widehat{y}}_{j}=\underset{k}{\mathrm{argmax}}\,{\varphi }_{k}\left({{\varvec{x}}}_{j}\right) \tag{15}\]
In short, in the training stage, the mean vector \({{\varvec{\mu}}}_{k}\) for every class label (\(y=1, 2, 3, \dots , K\)) and \(\boldsymbol{\Sigma }\) are estimated from the training dataset using (10) and (11). In the test stage, the label of a test feature vector \({{\varvec{x}}}_{j}\) not seen in the training dataset is predicted using (15).
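The training and test stages can be illustrated with a minimal NumPy sketch, assuming class means, a pooled covariance with an \(N-K\) denominator, empirical priors, and a linear decision function maximized over labels. The function names `lda_fit` and `lda_predict` and the synthetic three-dimensional data are hypothetical and stand in for the 36-dimensional fEMG features.

```python
import numpy as np

def lda_fit(X, y):
    """Estimate class means, priors, and the pooled covariance matrix."""
    classes = np.unique(y)
    N, d = X.shape
    K = len(classes)
    means = np.array([X[y == k].mean(axis=0) for k in classes])
    priors = np.array([np.mean(y == k) for k in classes])
    Sigma = np.zeros((d, d))
    for k, mu in zip(classes, means):
        D = X[y == k] - mu
        Sigma += D.T @ D                 # within-class scatter
    Sigma /= (N - K)                     # pooled covariance
    return classes, means, priors, Sigma

def lda_predict(X, classes, means, priors, Sigma):
    """Assign each row of X to the label with the largest decision value."""
    A = np.linalg.solve(Sigma, means.T)  # column k holds Sigma^-1 mu_k
    scores = X @ A - 0.5 * np.sum(means.T * A, axis=0) + np.log(priors)
    return classes[np.argmax(scores, axis=1)]

# Synthetic, well-separated two-class data used only for illustration.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, (50, 3)),
                     rng.normal(6.0, 1.0, (50, 3))])
y_train = np.array([0] * 50 + [1] * 50)
model = lda_fit(X_train, y_train)
X_test = np.array([[0.0, 0.0, 0.0], [6.0, 6.0, 6.0]])
print(lda_predict(X_test, *model))
```

Solving the linear system with `np.linalg.solve` avoids forming \({\boldsymbol{\Sigma }}^{-1}\) explicitly, which is usually preferable numerically when the covariance matrix is poorly conditioned.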
