Temperature controlled PSO on optimizing the DBN parameters for phoneme classification

Laxmi Sree, B. R.; Vijaya, M. S.

doi:10.1007/s10772-018-09586-2

Published: 10 January 2019

Volume 22, pages 143–156 (2019)

International Journal of Speech Technology

Abstract

Speech recognition has become an essential means of communicating with modern gadgets and machines. This work builds a phoneme classification model for Tamil continuous speech using a deep belief network (DBN), a neural network architecture capable of learning complex problems. Building an efficient DBN, however, depends heavily on several parameters, such as the number of layers, the number of neurons, and the connection weights and biases. The effect of increasing the number of layers in a DBN for phoneme recognition was studied in our previous experiments. Our earlier work also studied a methodology that employed particle swarm optimization (PSO) or its variants, second generation PSO (SGPSO) and new method PSO (NMPSO), to optimize the connection weights and biases of the DBN for phoneme classification. Pre-training the DBN with PSO suffered from particle stagnation and took longer to converge, whereas the DBN with SGPSO or NMPSO converged faster but still suffered from particle stagnation, which prevented it from reaching an optimal solution. Here we aim to reduce the stagnation of particles in the population while also achieving faster convergence by proposing an improved PSO, named temperature controlled PSO (TPSO), to optimize the initial connection weights and bias parameters that control the DBN's efficiency. Compared with the existing methods, TPSO appears to converge faster and to optimize the DBN connection weights and bias parameters better, with reduced stagnation of the population. The TPSO–DBN is designed and applied to a phoneme classification problem for Tamil continuous speech and is found to classify phonemes comparatively better, with a classification accuracy of 89.2%.
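The abstract does not give the TPSO update rule, so the following is only a minimal sketch of the general idea: a standard global-best PSO loop with a temperature that cools over time and scales a random perturbation applied to particles that have stopped improving. The perturbation rule, the names `pso_with_temperature`, `patience`, `t0` and `cooling`, and the toy `sphere` objective (standing in for the DBN training error) are all assumptions made for illustration and are not taken from the article.

```python
import random


def sphere(x):
    """Toy objective; a stand-in (assumption) for the DBN training error."""
    return sum(v * v for v in x)


def pso_with_temperature(objective, dim, n_particles=20, iters=200,
                         w=0.7, c1=1.5, c2=1.5, t0=1.0, cooling=0.97,
                         patience=5, seed=0):
    """Global-best PSO plus a hypothetical temperature-scaled kick for stalled particles."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    stalled = [0] * n_particles  # iterations since each particle last improved
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    temperature = t0

    for _ in range(iters):
        for i in range(n_particles):
            # Standard PSO velocity and position update.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            # Hypothetical anti-stagnation step (not from the article):
            # jitter a stalled particle with Gaussian noise scaled by temperature.
            if stalled[i] >= patience:
                pos[i] = [x + rng.gauss(0.0, temperature) for x in pos[i]]
                stalled[i] = 0
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                stalled[i] = 0
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
            else:
                stalled[i] += 1
        temperature *= cooling  # geometric cooling schedule (assumption)
    return gbest, gbest_val


best, best_val = pso_with_temperature(sphere, dim=5)
print(best_val)
```

In a DBN setting, each particle position would encode a flattened vector of connection weights and biases, and `objective` would return a training or validation error for the network initialized with that vector. Because the temperature decays, the perturbations are large early on, when escaping stagnation matters most, and become small as the swarm settles.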

Author information

Authors and Affiliations

Department of Computer Science, Dr. G.R. Damodaran College of Science, Avanashi Road, Civil Aerodrome Post, Coimbatore, Tamil Nadu, India
B. R. Laxmi Sree
Department of Computer Science, PSGR Krishnammal College for Women, Avinashi Road, Peelamedu, Coimbatore, Tamil Nadu, 641004, India
M. S. Vijaya

Corresponding author

Correspondence to B. R. Laxmi Sree.

Appendix

See Table 8.

Table 8 A corpus Kazhangiyam: phoneme distribution

About this article

Cite this article

Laxmi Sree, B.R., Vijaya, M.S. Temperature controlled PSO on optimizing the DBN parameters for phoneme classification. Int J Speech Technol 22, 143–156 (2019). https://doi.org/10.1007/s10772-018-09586-2

Received: 15 July 2018
Accepted: 22 December 2018
Published: 10 January 2019
Issue Date: 15 March 2019
DOI: https://doi.org/10.1007/s10772-018-09586-2
