EP-Based Infinite Inverted Dirichlet Mixture Learning: Application to Image Spam Detection

  • Conference paper
Recent Trends and Future Technology in Applied Intelligence (IEA/AIE 2018)

Abstract

We propose in this paper a new fully unsupervised model, based on a Dirichlet process prior and the inverted Dirichlet distribution, that allows the number of clusters to be inferred automatically from the data. The main idea is to let the number of mixture components grow as new vectors arrive. This answers the model selection problem in an elegant way, since the resulting model can be viewed as an infinite inverted Dirichlet mixture. An expectation propagation (EP) inference methodology is developed to learn this model by obtaining a full posterior distribution over its parameters. We validate the model on a challenging application, namely image spam filtering, to show the merits of the framework.
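The growing number of components comes from the standard stick-breaking representation of the Dirichlet process [18], whose weights \(\bar{\lambda }_j\prod _{s<j}(1-\bar{\lambda }_s)\) appear in Eq. (30) of the appendix. A minimal NumPy sketch (our illustration, not the authors' code) of truncated stick-breaking weights:

```python
# A minimal sketch (our illustration, not the authors' code) of truncated
# stick-breaking weights for a Dirichlet process mixture.
import numpy as np

def stick_breaking_weights(lam):
    """Map stick proportions lam_j in (0,1) to mixture weights
    pi_j = lam_j * prod_{s<j} (1 - lam_s)."""
    lam = np.asarray(lam, dtype=float)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - lam[:-1])))
    return lam * remaining

# With lam = 0.5 everywhere, the weights halve at each step:
print(stick_breaking_weights([0.5, 0.5, 0.5, 0.5]))  # [0.5 0.25 0.125 0.0625]
```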


Notes

  1. http://www.cs.jhu.edu/~mdredze/datasets/image_spam.

References

  1. McLachlan, G., Peel, D.: Finite Mixture Models. Wiley, New York (2000)

  2. Bdiri, T., Bouguila, N.: Positive vectors clustering using inverted Dirichlet finite mixture models. Expert Syst. Appl. 39(2), 1869–1882 (2012)

  3. Bouguila, N., Ziou, D.: High-dimensional unsupervised selection and estimation of a finite generalized Dirichlet mixture model based on minimum message length. IEEE Trans. Pattern Anal. Mach. Intell. 29(10), 1716–1731 (2007)

  4. Rasmussen, C.E.: The infinite Gaussian mixture model. In: Proceedings of Advances in Neural Information Processing Systems (NIPS), pp. 554–560. MIT Press (2000)

  5. Blackwell, D., MacQueen, J.: Ferguson distributions via Pólya urn schemes. Ann. Stat. 1(2), 353–355 (1973)

  6. Korwar, R.M., Hollander, M.: Contributions to the theory of Dirichlet processes. Ann. Prob. 1, 705–711 (1973)

  7. Blei, D.M., Jordan, M.I.: Variational inference for Dirichlet process mixtures. Bayesian Anal. 1, 121–144 (2005)

  8. Bouguila, N., Ziou, D.: A Dirichlet process mixture of Dirichlet distributions for classification and prediction. In: Proceedings of the IEEE Workshop on Machine Learning for Signal Processing (MLSP), pp. 297–302 (2008)

  9. Zhang, X., Chen, B., Liu, H., Zuo, L., Feng, B.: Infinite max-margin factor analysis via data augmentation. Pattern Recogn. 52(Suppl. C), 17–32 (2016)

  10. Bertrand, A., Al-Osaimi, F.R., Bouguila, N.: View-based 3D objects recognition with expectation propagation learning. In: Bebis, G., Boyle, R., Parvin, B., Koracin, D., Porikli, F., Skaff, S., Entezari, A., Min, J., Iwai, D., Sadagic, A., Scheidegger, C., Isenberg, T. (eds.) ISVC 2016. LNCS, vol. 10073, pp. 359–369. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50832-0_35

  11. Minka, T., Ghahramani, Z.: Expectation propagation for infinite mixtures. In: NIPS 2003 Workshop on Nonparametric Bayesian Methods and Infinite Models (2003)

  12. Bouguila, N.: Infinite Liouville mixture models with application to text and texture categorization. Pattern Recogn. Lett. 33(2), 103–110 (2012)

  13. Bouguila, N., Ziou, D.: A Dirichlet process mixture of generalized Dirichlet distributions for proportional data modeling. IEEE Trans. Neural Netw. 21(1), 107–122 (2010)

  14. Minka, T.: Expectation propagation for approximate Bayesian inference. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), pp. 362–369 (2001)

  15. Minka, T., Lafferty, J.: Expectation-propagation for the generative aspect model. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), pp. 352–359 (2002)

  16. Chang, S., Dasgupta, N., Carin, L.: A Bayesian approach to unsupervised feature selection and density estimation using expectation propagation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1043–1050 (2005)

  17. Maybeck, P.S.: Stochastic Models, Estimation and Control. Academic Press, New York (1982)

  18. Ishwaran, H., James, L.F.: Gibbs sampling methods for stick-breaking priors. J. Am. Stat. Assoc. 96, 161–173 (2001)

  19. Ma, Z., Leijon, A.: Expectation propagation for estimating the parameters of the beta distribution. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 2082–2085 (2010)

  20. Zhang, Y., Wang, S., Phillips, P., Ji, G.: Binary PSO with mutation operator for feature selection using decision tree applied to spam detection. Knowl. Based Syst. 64, 22–31 (2014)

  21. Amayri, O., Bouguila, N.: Improved online support vector machines spam filtering using string kernels. In: Bayro-Corrochano, E., Eklundh, J.-O. (eds.) CIARP 2009. LNCS, vol. 5856, pp. 621–628. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10268-4_73

  22. Amayri, O., Bouguila, N.: Online spam filtering using support vector machines. In: Proceedings of the 14th IEEE Symposium on Computers and Communications (ISCC 2009), 5–8 July Sousse, Tunisia, pp. 337–340. IEEE Computer Society (2009)

  23. Biggio, B., Fumera, G., Pillai, I., Roli, F.: A survey and experimental evaluation of image spam filtering techniques. Pattern Recogn. Lett. 32, 1436–1446 (2011)

  24. Fumera, G., Pillai, I., Roli, F.: Spam filtering based on the analysis of text information embedded into images. J. Mach. Learn. Res. 7, 2699–2720 (2006)

  25. Biggio, B., Fumera, G., Pillai, I., Roli, F.: Image spam filtering using visual information. In: Proceedings of the 14th International Conference on Image Analysis and Processing (ICIAP), pp. 105–110 (2007)

  26. Mehta, B., Nangia, S., Gupta, M., Nejdl, W.: Detecting image spam using visual features and near duplicate detection. In: Proceedings of the 17th International Conference on World Wide Web, pp. 497–506 (2008)

  27. Hofmann, T.: Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn. 42(1/2), 177–196 (2001)

  28. Csurka, G., Dance, C.R., Fan, L., Willamowski, J., Bray, C.: Visual categorization with bags of keypoints. In: Workshop on Statistical Learning in Computer Vision, 8th European Conference on Computer Vision (ECCV), pp. 1–22 (2004)

  29. Dredze, M., Gevaryahu, R., Elias-Bachrach, A.: Learning fast classifiers for image spam. In: Proceedings of the Conference on Email and Anti-Spam (CEAS), pp. 487–493 (2007)

  30. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

  31. Bdiri, T., Bouguila, N.: An infinite mixture of inverted Dirichlet distributions. In: Lu, B.-L., Zhang, L., Kwok, J. (eds.) ICONIP 2011. LNCS, vol. 7063, pp. 71–78. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24958-7_9

  32. Fan, W., Bouguila, N.: Topic novelty detection using infinite variational inverted Dirichlet mixture models. In: 14th IEEE International Conference on Machine Learning and Applications, ICMLA 2015, Miami, FL, USA, 9–11 December 2015, pp. 70–75 (2015)

Acknowledgements

The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for its continuous support. This work was supported financially by the Deanship of Scientific Research at Umm Al-Qura University under grant number 15-COM-3-1-0006. The first author was supported by the National Natural Science Foundation of China (61502183).

Corresponding author

Correspondence to Nizar Bouguila.

A The calculation of \(Z_i\) in Eq. (21)

The normalizing constant \(Z_i\) in Eq. (21) can be calculated as

$$\begin{aligned} Z_i = \int f_i(\varTheta )q^{\setminus i}(\varTheta )d\varTheta =\sum _{j=1}^J \bar{\lambda }_j\prod _{s=1}^{j-1}(1-\bar{\lambda }_s) \int \mathcal {ID}(\varvec{X}_i|\varvec{\alpha }_j) \mathcal {N}(\varvec{\alpha }_j|\varvec{\mu }_j^{\setminus i},A_j^{\setminus i})\mathrm {d}\varvec{\alpha }_j \end{aligned}$$
(30)

where \(\bar{\lambda }_j\) is the expected value of \(\lambda _j\). Since the integration involved in Eq. (30) is analytically intractable, we tackle this problem with the Laplace method, which approximates the integrand by a Gaussian distribution [19]. First, we define \(h(\varvec{\alpha }_j)\) as the integrand in Eq. (30):

$$\begin{aligned} h(\varvec{\alpha }_j) =\mathcal {ID}(\varvec{X}_i|\varvec{\alpha }_j)\mathcal {N}(\varvec{\alpha }_j|\varvec{\mu }_j^{\setminus i},A^{\setminus i}_{j}) \end{aligned}$$
(31)

Then, the normalized distribution for this integrand, which is the product of an inverted Dirichlet distribution and a Gaussian distribution, is given by

$$\begin{aligned} \mathcal {H}(\varvec{\alpha }_j) =\frac{h(\varvec{\alpha }_j)}{\int h(\varvec{\alpha }_j)d\varvec{\alpha }_j} \end{aligned}$$
(32)

The goal of the Laplace method is to find a Gaussian approximation centered on the mode of the distribution \(\mathcal {H}(\varvec{\alpha }_j)\). We obtain the mode \(\varvec{\alpha }_j^*\) numerically by setting the first derivative of \(\ln h(\varvec{\alpha }_j)\) to 0, where

$$\begin{aligned} \ln h(\varvec{\alpha }_j) =&\ln \frac{\varGamma (\sum _{l=1}^{D+1}\alpha _{jl})}{\prod _{l=1}^{D+1}\varGamma (\alpha _{jl})} + \sum _{l=1}^D(\alpha _{jl}-1)\ln X_{il} - \Big (\sum _{l=1}^{D+1}\alpha _{jl}\Big )\ln \Big (1+\sum _{l=1}^DX_{il}\Big ) \nonumber \\&- \frac{1}{2}(\varvec{\alpha }_j - \varvec{\mu }^{\setminus i}_{j})^T A^{\setminus i}_j (\varvec{\alpha }_j - \varvec{\mu }^{\setminus i}_{j})+\text{ const. } \end{aligned}$$
(33)
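For concreteness, here is a minimal NumPy/SciPy sketch (our illustration, not the paper's code) of the log-integrand in Eq. (33), up to the additive constant, where \(A^{\setminus i}_j\) is the precision matrix of the Gaussian factor as the quadratic form above indicates:

```python
# A minimal sketch of ln h(alpha_j) in Eq. (33), dropping the additive
# constant. alpha has D+1 components; x = X_i has D positive components.
import numpy as np
from scipy.special import gammaln

def log_h(alpha, x, mu, A):
    """alpha: (D+1,) positive inverted Dirichlet parameters;
    x: (D,) positive observation; mu, A: mean and precision of the
    Gaussian factor N(alpha | mu, A) from the cavity distribution."""
    log_id = (gammaln(alpha.sum()) - gammaln(alpha).sum()  # log Gamma ratio
              + ((alpha[:-1] - 1.0) * np.log(x)).sum()     # sum (a_l - 1) ln X_il
              - alpha.sum() * np.log1p(x.sum()))           # -(sum a_l) ln(1 + sum X_il)
    diff = alpha - mu
    return log_id - 0.5 * diff @ A @ diff                  # Gaussian quadratic term
```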

We can calculate the first and second derivatives with respect to \(\varvec{\alpha }_j\) as

$$\begin{aligned} \frac{\partial \ln h(\varvec{\alpha }_j)}{\partial \varvec{\alpha }_j} =\begin{bmatrix} \varPsi (\sum _{l=1}^{D+1}\alpha _{jl}) - \varPsi (\alpha _{j1}) + \ln X_{i1}-\ln (1+\sum _{l=1}^DX_{il})\\ \vdots \\ \varPsi (\sum _{l=1}^{D+1}\alpha _{jl}) - \varPsi (\alpha _{jD}) + \ln X_{iD}-\ln (1+\sum _{l=1}^DX_{il}) \end{bmatrix}-A_j^{\setminus i}(\varvec{\alpha }_j- \varvec{\mu }^{\setminus i}_j) \end{aligned}$$
(34)
$$\begin{aligned} \frac{\partial ^2\ln h(\varvec{\alpha }_j)}{\partial \varvec{\alpha }_j^2} = \begin{bmatrix} \varPsi '(\sum _{l=1}^{D+1}\alpha _{jl}) - \varPsi '(\alpha _{j1})&\cdots&\varPsi '(\sum _{l=1}^{D+1}\alpha _{jl})\\ \vdots&\ddots&\vdots \\ \varPsi '(\sum _{l=1}^{D+1}\alpha _{jl})&\cdots&\varPsi '(\sum _{l=1}^{D+1}\alpha _{jl}) - \varPsi '(\alpha _{jD}) \end{bmatrix}-A^{\setminus i}_{j} \end{aligned}$$
(35)

where \(\varPsi (\cdot )\) is the digamma function and \(\varPsi '(\cdot )\) the trigamma function. Then, we can approximate \(h(\varvec{\alpha }_j)\) as

$$\begin{aligned} h(\varvec{\alpha }_j)\simeq h(\varvec{\alpha }_j^*)\exp \bigg (-\frac{1}{2}(\varvec{\alpha }_j-\varvec{\alpha }_j^*)^T\widehat{A}_{j}(\varvec{\alpha }_j-\varvec{\alpha }_j^*)\bigg ) \end{aligned}$$
(36)

where the precision matrix \(\widehat{A}_{j}\) is given by

$$\begin{aligned} \widehat{A}_{j} = - \left. \frac{\partial ^2\ln h(\varvec{\alpha }_j)}{\partial \varvec{\alpha }_j^2} \right| _{\varvec{\alpha }_j =\varvec{\alpha }_j^*} \end{aligned}$$
(37)
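Continuing the sketch above (and reusing log_h), the gradient and Hessian of Eqs. (34)–(35) translate directly into digamma/trigamma calls; find_mode is a hypothetical helper we introduce to illustrate the numerical mode search the text describes:

```python
# Gradient (Eq. 34) and Hessian (Eq. 35) of ln h, plus a numerical mode
# search. Reuses log_h from the previous sketch.
import numpy as np
from scipy.special import digamma, polygamma
from scipy.optimize import minimize

def grad_log_h(alpha, x, mu, A):
    # psi(sum alpha) - psi(alpha_l) + ln X_il - ln(1 + sum X_il) for l <= D;
    # the (D+1)-th component has no ln X_il term.
    log_terms = np.append(np.log(x), 0.0) - np.log1p(x.sum())
    return digamma(alpha.sum()) - digamma(alpha) + log_terms - A @ (alpha - mu)

def hess_log_h(alpha, x, mu, A):
    # psi'(sum alpha) in every entry, psi'(alpha_l) subtracted on the
    # diagonal, minus the Gaussian precision A.
    d = len(alpha)
    return (polygamma(1, alpha.sum()) * np.ones((d, d))
            - np.diag(polygamma(1, alpha)) - A)

def find_mode(x, mu, A):
    # Maximize ln h (i.e. minimize its negative), starting from the
    # Gaussian mean, with alpha constrained to stay positive.
    res = minimize(lambda a: -log_h(a, x, mu, A),
                   x0=np.maximum(mu, 1e-3),
                   jac=lambda a: -grad_log_h(a, x, mu, A),
                   bounds=[(1e-6, None)] * len(mu),
                   method="L-BFGS-B")
    return res.x
```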

Therefore, the integration of \(h(\varvec{\alpha }_j)\) can be approximated by using Eq. (36) as

$$\begin{aligned} \int h(\varvec{\alpha }_j)d\varvec{\alpha }_j \simeq h(\varvec{\alpha }_j^*)\int \exp \bigg (-\frac{1}{2}(\varvec{\alpha }_j-\varvec{\alpha }_j^*)^T\widehat{A}_{j}(\varvec{\alpha }_j-\varvec{\alpha }_j^*)\bigg )d\varvec{\alpha }_j= h(\varvec{\alpha }_j^*) \frac{(2\pi )^{(D+1)/2}}{|\widehat{A}_j|^{1/2}} \end{aligned}$$
(38)

Finally, we can rewrite Eq. (30) as follows:

$$\begin{aligned} Z_i=\sum _{j=1}^J \bar{\lambda }_j\prod _{s=1}^{j-1}(1-\bar{\lambda }_s)h(\varvec{\alpha }_j^*)\frac{(2\pi )^{{(D+1)}/2}}{|\widehat{A}_j|^{1/2}} \end{aligned}$$
(39)
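Putting the pieces together, Eq. (39) is a weighted sum of per-component Laplace factors. The sketch below (again our illustration, reusing the helpers above; since log_h drops the additive constant of Eq. (33), the result matches \(Z_i\) only up to that constant) shows the computation:

```python
# Sketch of Eq. (39): stick-breaking weights times per-component Laplace
# factors. Reuses stick_breaking_weights, log_h, hess_log_h, find_mode.
import numpy as np

def approx_Z_i(x, lam_bar, mus, As):
    """lam_bar: (J,) expected stick proportions; mus[j], As[j]: mean and
    precision of the Gaussian factor for component j in q^{\\setminus i}."""
    pi = stick_breaking_weights(lam_bar)               # weights in Eq. (30)
    total = 0.0
    for j in range(len(lam_bar)):
        a_star = find_mode(x, mus[j], As[j])           # mode of h(alpha_j)
        A_hat = -hess_log_h(a_star, x, mus[j], As[j])  # Eq. (37)
        d = len(a_star)                                # d = D + 1
        # Laplace approximation of the integral, Eq. (38), in log form:
        log_int = (log_h(a_star, x, mus[j], As[j])
                   + 0.5 * d * np.log(2.0 * np.pi)
                   - 0.5 * np.linalg.slogdet(A_hat)[1])
        total += pi[j] * np.exp(log_int)
    return total  # Z_i up to the constant dropped in Eq. (33)
```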

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Fan, W., Bourouis, S., Bouguila, N., Aldosari, F., Sallay, H., Khayyat, K.M.J. (2018). EP-Based Infinite Inverted Dirichlet Mixture Learning: Application to Image Spam Detection. In: Mouhoub, M., Sadaoui, S., Ait Mohamed, O., Ali, M. (eds) Recent Trends and Future Technology in Applied Intelligence. IEA/AIE 2018. Lecture Notes in Computer Science, vol 10868. Springer, Cham. https://doi.org/10.1007/978-3-319-92058-0_33

  • DOI: https://doi.org/10.1007/978-3-319-92058-0_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92057-3

  • Online ISBN: 978-3-319-92058-0
