Low-delay code-excited linear prediction (LD-CELP) is an attractive algorithm for implementing vocoders in voice over Internet protocol (VoIP) networks. It was proposed for coding speech at 16 kbps with toll quality. However, operation at transmission rates below 16 kbps is desirable so that traffic can still be accommodated during system overload conditions. In this paper, an array of self-organizing maps (SOMs) replaces the traditional codebook search module recommended in ITU-T G.728 to determine the optimum index of the shape codebook. The SOMs are trained with a modified supervised training algorithm in which some of the training parameters are optimized using the particle swarm optimization (PSO) algorithm. Based on the occurrence frequency characteristics of the codevectors, six bits for the shape codebook and two bits for the gain codebook are used in this work, which yields a vocoder with a lower bit rate than the traditional ITU-T G.728 vocoder. A comparison of the proposed SOM array, trained by the PSO-optimized supervised algorithm and used as the codebook search module of LD-CELP, with a conventional implementation of the LD-CELP coder shows that the execution time of the algorithm is reduced by up to 44%, while the degradation of voice quality in terms of mean opinion score (MOS), perceptual evaluation of speech quality (PESQ), and segmental signal-to-noise ratio (SNRseg) remains acceptable.
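The bit-rate effect of the reduced codebook indices can be checked with a short calculation. The sketch below is an illustration, not part of the paper. It assumes the standard G.728 framing of one codebook index per five-sample vector at an 8 kHz sampling rate, with a 7-bit shape index and a 3-bit gain index in the unmodified coder. None of these values is restated in the abstract, and the resulting 12.8 kbps figure is inferred rather than quoted.

```python
# Assumed G.728 framing (not stated in the abstract): 8 kHz sampling,
# one codebook index transmitted for every 5-sample excitation vector.
SAMPLE_RATE_HZ = 8000
VECTOR_SAMPLES = 5


def bit_rate_bps(shape_bits, gain_bits,
                 sample_rate_hz=SAMPLE_RATE_HZ,
                 vector_samples=VECTOR_SAMPLES):
    """Return the coder bit rate in bits per second for the given index widths."""
    vectors_per_second = sample_rate_hz // vector_samples
    return vectors_per_second * (shape_bits + gain_bits)


# Standard G.728 split (assumed): 7-bit shape index + 3-bit gain index.
standard_rate = bit_rate_bps(shape_bits=7, gain_bits=3)
# Split described in the abstract: 6-bit shape index + 2-bit gain index.
reduced_rate = bit_rate_bps(shape_bits=6, gain_bits=2)

print(f"standard: {standard_rate} bps, reduced: {reduced_rate} bps")
```

Under these assumptions, dropping one shape bit and one gain bit per vector lowers the rate from 16 kbps to 12.8 kbps, a 20% reduction.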

Cite this article
Sheikhan, M., Garoucy, S. Substitution of G.728 vocoder’s codebook search module with SOM array trained by PSO-optimized supervised algorithm. Neural Comput & Applic 23, 2309–2321 (2013). https://doi.org/10.1007/s00521-012-1183-z
DOI: https://doi.org/10.1007/s00521-012-1183-z