Elsevier

Information Sciences

Volume 571, September 2021, Pages 262-278
Separation and recovery Markov boundary discovery and its application in EEG-based emotion recognition

https://doi.org/10.1016/j.ins.2021.04.071

Abstract

In a Bayesian network (BN), the Markov boundary (MB) represents the local causal structure around a target. Owing to its interpretability and robustness, it has been widely applied to feature selection and BN structure learning. However, existing MB discovery algorithms may fail to identify some true positives, leading to poor performance in real-world applications. To tackle this issue, we introduce a two-phase-discovery strategy that searches for more true positives. Based on this strategy, we propose a more accurate and data-efficient algorithm, the separation and recovery MB discovery algorithm (SRMB). SRMB first discovers an incomplete parent–child set and spouse set via an MB separation process, and then retrieves the ignored true positives via an MB recovery process, which further exploits a symmetry test to improve accuracy in unfaithful cases. Experiments on standard BN and real-world data sets demonstrate the effectiveness and superiority of SRMB in terms of MB discovery, BN structure learning, and feature selection. To demonstrate the superiority of SRMB on data with distribution shift, we further apply SRMB to EEG-based emotion recognition tasks, where distribution shift exists across multiple unstable sessions. The results indicate that the most predictive features come from the Gamma/Beta frequency bands and are distributed over the lateral temporal area.

Introduction

Recent years have witnessed a proliferation of applications of the Markov boundary (MB) in machine learning [1], [2] and data mining [3], [4]. The MB explicitly captures the local causal relationships around a target variable [5] and can thus be used to improve the interpretability and robustness of learning models [6]. In a faithful [5] Bayesian network (BN), the MB of a target consists of its direct causes (parents), its direct effects (children), and the other direct causes of its direct effects (spouses) [5]. These variables provide a complete picture of the local causal structure around the target [7] and can reveal the underlying mechanism of the target. Thus, MB discovery algorithms are widely applied to real-world tasks. For example, MB discovery is the first step in causal learning and BN structure learning, where the undirected skeleton of the BN is constructed by learning the MB (or a subset of the MB) of each variable [8]. Another important application is feature selection [9], since all other features are independent of the class attribute conditioned on its MB [9]. Some studies [10], [11] have proved that the MB is the theoretically optimal feature subset for learning and inference tasks. Numerous algorithms have been proposed to search for the MB; a recent review [6] divides them into simultaneous MB learning algorithms and divide-and-conquer MB learning algorithms.
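The definition above (parents, children, and spouses of the target) can be made concrete with a minimal sketch. It assumes the true DAG is known and stored as a parent map; the function name `markov_boundary` and the toy network are hypothetical illustrations and do not come from the paper, whose algorithms must instead learn the MB from data.

```python
from typing import Dict, Set


def markov_boundary(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Read the MB of `target` off a known DAG.

    `parents` maps every node to the set of its direct causes.
    """
    # Direct causes of the target.
    pa = set(parents.get(target, set()))
    # Direct effects: nodes that list the target among their parents.
    children = {v for v, ps in parents.items() if target in ps}
    # Spouses: the other direct causes of the target's children.
    spouses: Set[str] = set()
    for child in children:
        spouses |= parents[child]
    spouses.discard(target)
    return pa | children | spouses


# Hypothetical toy network: A -> T, T -> C, S -> C, C -> D
toy_dag = {"A": set(), "S": set(), "T": {"A"}, "C": {"T", "S"}, "D": {"C"}}
mb_of_t = markov_boundary(toy_dag, "T")
```

In this toy network, D is a descendant of T but is not a parent, child, or spouse of T, so it is left out of the returned set.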

Several recent studies [12], [13], [14], [15] have targeted improvements in the accuracy of MB discovery. However, some true positives still fail to be identified, especially when the training samples are few. The main reason is that these studies focus on theoretical guarantees and employ a strict criterion to filter false positives out of a discovered MB set. As a result, some true positives are easily discarded by mistake, so the improvement in accuracy is negligible. The conditional independence (CI) test between variables is another primary cause of this problem, since its reliability is limited by the size of its conditioning set. More precisely, when the conditioning set in a CI test is large, the test might fail, and the tested pair of variables is then directly judged to be independent. Therefore, MB discovery algorithms based on the CI test will discard some true positives when the target has a relatively large MB. Nevertheless, these algorithms guarantee the absence of false positives in a discovered MB under the faithfulness assumption, which is the basis of the idea in this article.
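The effect of the conditioning-set size can be illustrated with a rough sketch for discrete data. It assumes a reliability heuristic that is common in the MB literature, namely requiring a minimum average number of samples per cell of the contingency table; the function `ci_test_is_reliable`, the threshold of 5, and the sample size of 500 are illustrative assumptions and are not taken from the paper.

```python
from math import prod
from typing import Sequence


def ci_test_is_reliable(n_samples: int, card_x: int, card_y: int,
                        card_z: Sequence[int], min_per_cell: int = 5) -> bool:
    """Heuristic check: enough samples per cell of the (X, Y, Z) table?

    The number of cells grows exponentially with the size of the
    conditioning set Z, so the check fails quickly for large Z.
    """
    cells = card_x * card_y * prod(card_z)  # prod of an empty Z is 1
    return n_samples >= min_per_cell * cells


# All variables binary, 500 samples (assumed values for illustration).
largest_reliable = max(
    k for k in range(10)
    if ci_test_is_reliable(500, 2, 2, [2] * k)
)
```

Under these assumed numbers, the check passes for conditioning sets of up to four binary variables and fails from five onwards. An algorithm that treats an unreliable test as "independent" would then drop a variable regardless of its true relationship to the target.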

The above analysis inspires us to design a two-phase-discovery algorithm that targets the true positives which may be ignored during MB discovery because of the issues described above. More specifically, the first phase, called the separation phase, uses a time-efficient MB discovery algorithm to obtain an initial MB, which is incomplete but includes very few false positives, especially when the faithfulness assumption is satisfied. Based on the initial MB, the second phase, called the recovery phase, retrieves the discarded parent–child (PC) and spouse (SP) variables through a divide-and-conquer search. Owing to the complementarity between the two phases, the proposed algorithm discovers a more accurate MB.

However, the two-phase-discovery strategy gives rise to another problem: how can PC variables be distinguished from SP variables in the MB discovered in the first phase? To solve this problem, we analyze the differences between the statistical properties of PC and SP variables and design a score function that ranks the variables in an MB so that PC and SP variables receive distinct scores. Based on the score function, we propose an algorithm called SeparateMB that identifies the spouses in a ranked list of MB variables and thereby separates the PC and SP sets effectively and efficiently. To maintain high accuracy in real-world applications that violate the faithfulness assumption, a symmetry test is added to the recovery phase; it avoids many errors by testing the symmetry property between parent and child variables.
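The symmetry property mentioned above states that, in a BN, X is a parent or child of T exactly when T is a parent or child of X. The excerpt does not spell out how SRMB implements its symmetry test, so the following is only a sketch of the generic AND-rule check used by local learners in general; `symmetric_pc`, `find_pc`, and the learned sets are hypothetical.

```python
from typing import Callable, Dict, Set


def symmetric_pc(target: str, find_pc: Callable[[str], Set[str]]) -> Set[str]:
    """Keep X in PC(target) only if target also appears in PC(X)."""
    return {x for x in find_pc(target) if target in find_pc(x)}


# Hypothetical output of a PC learner that wrongly places D next to T.
learned_pc: Dict[str, Set[str]] = {
    "T": {"A", "C", "D"},
    "A": {"T"},
    "C": {"T", "S", "D"},
    "D": {"C"},
    "S": {"C"},
}
pc_t = symmetric_pc("T", lambda v: learned_pc.get(v, set()))
```

Here D is removed from the candidate set of T because T does not appear in the set learned for D. The price of the check is one extra local search per candidate.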

Motivated by these ideas, we propose a two-phase-discovery algorithm called the separation and recovery MB discovery (SRMB) algorithm. The main contributions of this article are summarized as follows:

  1. Based on the proposed two-phase-discovery strategy, SRMB can detect more true positives. Thus, SRMB is a more accurate and more data-efficient MB discovery algorithm than existing algorithms. Furthermore, SRMB is robust in real-world applications owing to the symmetry test in its recovery process, which avoids most of the errors introduced by unfaithfulness.

  2. For the first time, the statistical properties of PC and SP variables are analyzed in order to differentiate between PC and SP variables in a discovered MB. This analysis not only underpins the design of SRMB, but could also facilitate other research in this domain.

  3. We theoretically prove the correctness of the proposed SRMB, and conduct extensive experiments on BN data sets and real-world data sets to validate its superiority over other algorithms in terms of MB discovery, BN structure learning, and causal feature selection. Moreover, we employ SRMB to select signal features in an EEG-based emotion recognition data set. The results demonstrate the effectiveness of SRMB for EEG-based emotion recognition and indicate that the critical features belong to the Gamma and Beta frequency bands and are distributed over the lateral temporal area.

Section snippets

Brief review of MB discovery algorithms

The concept of the MB was first proposed and discussed by Judea Pearl [5]; the MB provides a complete picture of the local causal relationships around its corresponding target. The MB naturally bridges causation and predictivity and thus has broad application prospects in many real-world tasks, such as causal discovery, causal inference, and feature selection. Hence, the MB has attracted much attention, and numerous MB discovery algorithms have been proposed recently. A recent

The proposed algorithm

In this section, the novel MB discovery algorithm is presented. We first give an overview of the proposed SRMB in Section 3.1. Afterwards, we explain the MB separation process in Section 3.2, and the recovery process in Section 3.3. Finally, we present theoretical analyses of the proposed SRMB in Section 3.4.

Experimental studies

In this section, we compare the proposed SRMB with four state-of-the-art MB discovery algorithms on different tasks. The compared algorithms include one simultaneous MB learning algorithm (IAMB [18]) and three divide-and-conquer MB learning algorithms (PCMB [12], MBOR [13], and STMB [15]). In Section 4.1, we first use standard BN data sets to demonstrate the superiority of SRMB for MB discovery. Furthermore, we apply SRMB to BN structure learning and feature selection tasks with 5 BN data sets

Applied to EEG-based emotion recognition

In this section, we apply SRMB to an emotion recognition task based on electroencephalography (EEG) data. EEG records the electrical activity of the brain through several electrodes placed on the scalp. EEG data are known to be unstable over multiple sessions [41], [42], since the recordings are made in different sessions and a distribution shift may exist between the data from these sessions. This shift is induced by changes in parameters such as [43], [44]: the

Conclusion: Discussion, limitation, and future work

This research study focuses on the problem that existing MB discovery algorithms discard many true positives, and designs a new strategy to address this problem, called the two-phase-discovery strategy. Based on this strategy, we propose a novel MB discovery algorithm, SRMB, which is a more accurate and more data-efficient method.

Compared with state-of-the-art algorithms [21], [13], [15], [22], SRMB combines the advantages of simultaneous MB learning algorithms and divide-and-conquer MB learning

CRediT authorship contribution statement

Xingyu Wu: Investigation, Methodology, Software, Writing - original draft. Bingbing Jiang: Investigation, Writing - original draft, Writing - review & editing. Kui Yu: Resources, Writing - review & editing. Huanhuan Chen: Resources, Writing - review & editing, Supervision.

Declaration of Competing Interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Organizations: University of Science and Technology of China, Hefei, 230027, China. Hangzhou Normal University, Hangzhou, 311121, China. Hefei University of Technology, Hefei, 230601, China.

Acknowledgments

We thank the native speaker Muhammad Usman for his help in improving the language. This research is supported in part by the National Natural Science Foundation of China under Grant No. 91746209, 62006065, and 61876206, the Open Project Foundation of Intelligent Information Processing Key Laboratory of Shanxi Province under Grant No. CICIP2020003, the Scientific Research Foundation of HZNU under Grant No. 4115C50220204003, and the Fundamental Research Funds for the Central Universities.

References (48)

  • Ioannis Tsamardinos, Constantin F. Aliferis, Towards principled feature selection: Relevance, filters, and wrappers,...
  • Jean-Philippe Pellet et al., Using Markov blankets for causal structure learning, Journal of Machine Learning Research (2008)
  • Sergio Rodrigues De Morais et al., A novel scalable and data efficient feature subset selection algorithm
  • Constantin F. Aliferis et al., HITON: a novel Markov blanket algorithm for optimal variable selection
  • Tian Gao et al., Efficient Markov blanket discovery and its application, IEEE Transactions on Cybernetics (2017)
  • Daphne Koller et al., Toward optimal feature selection
  • Dimitris Margaritis et al., Bayesian network induction via local neighborhoods
  • Ioannis Tsamardinos et al., Algorithms for large scale Markov blanket discovery
  • Sandeep Yaramakala et al., Speculative Markov blanket discovery for optimal feature selection
  • Ioannis Tsamardinos et al., Time and sample efficient discovery of Markov blankets and direct causal relations
  • Shunkai Fu et al., Fast Markov blanket discovery algorithm via local learning within single pass
  • Xingyu Wu et al., Accurate Markov boundary discovery for causal feature selection, IEEE Transactions on Cybernetics (2020)
  • Xingyu Wu et al., Multi-label causal feature selection
  • Sangmin Lee et al., Parallel simulated annealing with a greedy algorithm for Bayesian network structure learning, IEEE Transactions on Knowledge and Data Engineering (2019)