Skip to main content
Log in

Time-frequency visual representation and texture features for audio applications: a comprehensive review, recent trends, and challenges

  • Published:
Multimedia Tools and Applications Aims and scope Submit manuscript

Abstract

The conventional audio feature extraction methods employed in the audio analysis are categorized into time-domain and frequency-domain. Recently, a new audio feature extraction approach using time-frequency texture image is developed and utilized for different applications. In this approach, the input audio signal is first converted into a time-frequency image, and then textural features are extracted from the visual representation. The distinctive two-dimensional time-frequency visualization textural descriptors can produce better features for improved audio detection and classification. In this article, a comprehensive review of state-of-the-art techniques used for audio detection and classification is presented. The generalized architecture of time-frequency texture feature extraction approaches in audio classification algorithms is presented first. Based on a review of over 70 papers, the key contributions in the area of time-frequency representations of various researchers are highlighted in addition to the textural features. This survey also compares and analyzes the existing experimental algorithms proposed for various audio classification tasks. Finally, the critical challenges and limitations with different visual representations are highlighted, along with potential future research directions.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15

Similar content being viewed by others

Data Availability

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study

References

  1. Abidin S, Togneri R, Sohel F (2017) Enhanced lbp texture features from time frequency representations for acoustic scene classification. In: 2017 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 626–630. https://doi.org/10.1109/ICASSP.2017.7952231

  2. Abidin S, Togneri R, Sohel F (2018) Acoustic scene classification using joint time-frequency image-based feature representations. In: 2018 15th IEEE international conference on advanced video and signal based surveillance (AVSS), pp 1–6. https://doi.org/10.1109/AVSS.2018.8639164

  3. Abidin S, Togneri R, Sohel F (2018) Spectrotemporal analysis using local binary pattern variants for acoustic scene classification. IEEE/ACM Trans Audio Speech Lang Process 26(11):2112–2121. https://doi.org/10.1109/TASLP.2018.2854861

    Article  Google Scholar 

  4. Abidin S, Xia X, Togneri R, Sohel F (2018) Local binary pattern with random forest for acoustic scene classification. In: 2018 IEEE international conference on multimedia and expo, ICME 2018. IEEE, institute of electrical and electronics engineers, United States, vol 2018-July. https://doi.org/10.1109/ICME.2018.8486578

  5. Agera N, Chapaneri S, Jayaswal D (2015) Exploring textural features for automatic music genre classification. In: 2015 International conference on computing communication control and automation, pp 822–826. https://doi.org/10.1109/ICCUBEA.2015.164

  6. Ahmed F, Paul PP, Gavrilova M (2016) Music genre classification using a gradient-based local texture descriptor. In: Czarnowski I, Caballero AM, Howlett RJ, Jain LC (eds) Intelligent decision technologies 2016. Springer international publishing, Cham, pp 455–464. https://doi.org/10.1007/978-3-319-39627-9-40

  7. Alam MS, Jassim WA, Zilany MSA (2018) Radon transform of auditory neurograms: a robust feature set for phoneme classification. IET Sig Process 12(3):260–268. https://doi.org/10.1049/iet-spr.2017.0170

    Article  Google Scholar 

  8. Ashfaque Mostafa T, Soltaninejad S, McIsaac TL, Cheng I (2021) A comparative study of time frequency representation techniques for freeze of gait detection and prediction. Sensors, vol 21(19). https://doi.org/10.3390/s21196446

  9. Battaglino D, Lepauloux L, Pilati L, Evans N (2015) Acoustic context recognition using local binary pattern codebooks. In: 2015 IEEE workshop on applications of signal processing to audio and acoustics (WASPAA), pp 1–5. https://doi.org/10.1109/WASPAA.2015.7336886

  10. Bhattacharjee M, Prasanna SRM, Guha P (2018) Time-frequency audio features for speech-music classification

  11. Bhatti UA, Huang M, Wu D, Zhang Y, Mehmood A, Han H (2019) Recommendation system using feature extraction and pattern recognition in clinical care systems. Enterpr Inf Syst 13(3):329–351. https://doi.org/10.1080/17517575.2018.1557256

    Article  Google Scholar 

  12. Bhatti UA, Ming-Quan Z, Qing-Song H, Ali S, Hussain A, Yuhuan Y, Yu Z, Yuan L, Nawaz SA (2021) Advanced color edge detection using clifford algebra in satellite images. IEEE Photon J 13(2):1–20. https://doi.org/10.1109/JPHOT.2021.3059703

    Article  Google Scholar 

  13. Bhatti UA, Zhaoyuan Y, Linwang Y, Zeeshan Z, Ali NS, Mughair B, Anum M, Ul AQ, Luo W (2020) Geometric algebra applications in geospatial artificial intelligence and remote sensing image processing. IEEE Access 8:155783–155796. https://doi.org/10.1109/ACCESS.2020.3018544

    Article  Google Scholar 

  14. Birajdar GK, Patil MD (2019) Speech and music classification using spectrogram based statistical descriptors and extreme learning machine. Multimed Tools Appl 78(11):15141–15168. https://doi.org/10.1007/s11042-018-6899-z

    Article  Google Scholar 

  15. Birajdar GK, Patil MD (2020) Speech/music classification using visual and spectral chromagram features. J Ambient Intell Humanized Comput 11:329–347. https://doi.org/10.1007/s12652-019-01303-4

    Article  Google Scholar 

  16. Birajdar GK, Raveendran S (2022) Indian language identification using time-frequency texture features and kernel ELM. J Ambient Intell Humanized Comput:1–12. https://doi.org/10.1007/s12652-022-03781-5

  17. Bisot V, Essid S, Richard G (2015) HOG and subband power distribution image features for acoustic scene classification. In: 2015 23rd European signal processing conference (EUSIPCO), pp 719–723. https://doi.org/10.1109/EUSIPCO.2015.7362477

  18. Breve B, Cirillo S, Cuofano M, Desiato D (2020) Perceiving space through sound: mapping human movements into MIDI. In: 26th International conference on distributed multimedia systems, virtual conference center, USA, pp 49–56. https://doi.org/10.18293/DMSVIVA20-011

  19. Breve B, Cirillo S, Cuofano M, Desiato D (2022) Enhancing spatial perception through sound: mapping human movements into MIDI. Multimed Tools Appl 81(1):73–94. https://doi.org/10.1007/s11042-021-11077-7

    Article  Google Scholar 

  20. Chen Y, Li H, Hou L, Bu X (2019) Feature extraction using dominant frequency bands and time-frequency image analysis for chatter detection in milling. Precis Eng 56:235–245. https://doi.org/10.1016/j.precisioneng.2018.12.004

    Article  Google Scholar 

  21. Chowdhury AA, Borkar VS, Birajdar GK (2020) Indian language identification using time-frequency image textural descriptors and gwo-based feature selection. J Exp Theor Artif Intell 32(1):111–132. https://doi.org/10.1080/0952813X.2019.1631392

    Article  Google Scholar 

  22. Connolly J, Edmonds E, Guzy J, Johnson S, Woodcock A (1986) Automatic speech recognition based on spectrogram reading. Int J Man-Mach Stud 24(6):611–621. https://doi.org/10.1016/S0020-7373(86)80012-8 . http://www.sciencedirect.com/science/article/pii/S0020737386800128

    Article  Google Scholar 

  23. Costa Y, Oliveira L, Koerich A, Gouyon F (2013) Music genre recognition based on visual features with dynamic ensemble of classifiers selection. In: 2013 20th International conference on systems, signals and image processing (IWSSIP), pp 55–58

  24. Costa Y, Oliveira L, Koerich A, Gouyon F (2013) Music genre recognition using gabor filters and LPQ texture descriptors. In: Ruiz-Shulcloper J, Sanniti di Baja G (eds) Progress in pattern recognition, image analysis, computer vision, and applications. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 67–74. https://doi.org/10.1007/978-3-642-41827-3-9

  25. Costa Y, Oliveira L, Koerich A, Gouyon F, Martins J (2012) Music genre classification using LBP textural features. Sig Process 92(11):2723–2737. https://doi.org/10.1016/j.sigpro.2012.04.023

    Article  Google Scholar 

  26. Costa YMG, Oliveira LS, Koericb AL, Gouyon F (2011) Music genre recognition using spectrograms. In: 2011 18th International conference on systems, signals and image processing, pp 1–4

  27. Costa YMG, Oliveira LS, Koerich AL, Gouyon F (2012) Comparing textural features for music genre classification. In: The 2012 international joint conference on neural networks (IJCNN), pp 1–6. https://doi.org/10.1109/IJCNN.2012.6252626

  28. Demir F, Sengür A, Cummins N, Amiriparian S, Schuller BW (2018) Low level texture features for snore sound discrimination. In: 2018 40th Annual international conference of the IEEE engineering in medicine and biology society (EMBC), pp 413–416. https://doi.org/10.1109/EMBC.2018.8512459

  29. Dennis J, Tran HD, Chng ES (2013) Image feature representation of the subband power distribution for robust sound event classification. IEEE Trans Audio Speech Lang Process 21(2):367–377. https://doi.org/10.1109/TASL.2012.2226160

    Article  Google Scholar 

  30. Dennis J, Tran HD, Li H (2011) Spectrogram image feature for sound event classification in mismatched conditions. IEEE Sig Process Lett 18 (2):130–133. https://doi.org/10.1109/LSP.2010.2100380

    Article  Google Scholar 

  31. Dutta A, Sil D, Chandra A, Palit S (2022) Cnn based musical instrument identification using time-frequency localized features. Int Technol Lett 5 (1):e191. https://doi.org/10.1002/itl2.191

    Article  Google Scholar 

  32. Felipe GZ, Aguiar RL, Costa YMG, Silla C, Brahnam S, Nanni L, McMurtrey S (2019) Identification of infants’ cry motivation using spectrograms. In: 2019 International conference on systems, signals and image processing (IWSSIP), pp 181–186. https://doi.org/10.1109/IWSSIP.2019.8787318

  33. Felipe GZ, Maldonado Y, Costa DG, Helal LG (2017) Acoustic scene classification using spectrograms. In: 2017 36th International conference of the chilean computer science society (SCCC), pp 1–7. https://doi.org/10.1109/SCCC.2017.8405119

  34. Ghosal A, Chakraborty R, Dhara BC, Saha SK (2012) Song/instrumental classification using spectrogram based contextual features. In: Proceedings of the CUBE international information technology conference, CUBE ’12. Association for computing machinery, New York, NY, USA, pp 21–25. https://doi.org/10.1145/2381716.2381722

  35. Godbole S, Jadhav V, Birajdar G (2020) Indian language identification using deep learning. ITM Web Conf 32:01010. https://doi.org/10.1051/itmconf/20203201010

    Article  Google Scholar 

  36. Jassim WA, Harte N (2018) Voice activity detection using neurograms. In: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 5524–5528. https://doi.org/10.1109/ICASSP.2018.8461952

  37. Jog AH, Jugade OA, Kadegaonkar AS, Birajdar GK (2018) Indian language identification using cochleagram based texture descriptors and ann classifier. In: 2018 15th IEEE India council international conference (INDICON), pp 1–6. https://doi.org/10.1109/INDICON45594.2018.8987167

  38. Klatt D, Stevens K (1973) On the automatic recognition of continuous speech:implications from a spectrogram-reading experiment. IEEE Trans Audio Electroacoustics 21(3):210–217. https://doi.org/10.1109/TAU.1973.1162453

    Article  Google Scholar 

  39. Kobayashi T, Ye J (2014) Acoustic feature extraction by statistics based local binary pattern for environmental sound classification. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 3052–3056. https://doi.org/10.1109/ICASSP.2014.6854161

  40. Lacerda EB, Mello CA (2017) Automatic classification of laryngeal mechanisms in singing based on the audio signal. Procedia Comput Sci 112:2204–2212. https://doi.org/10.1109/ICASSP.2014.6854161

    Article  Google Scholar 

  41. Li Y, Huang H, Wu Z (2019) Animal sound recognition based on double feature of spectrogram. Chinese J Electron 28(4):667–673. https://doi.org/10.1049/cje.2019.04.005

    Article  Google Scholar 

  42. Lim H, Kim MJ, Kim H (2015) Robust sound event classification using LBP-HOG based bag-of-audio-words feature representation. In: INTERSPEECH, pp 3325–3329

  43. Matsui T, Goto M, Vert J, Uchiyama Y (2011) Gradient-based musical feature extraction based on scale-invariant feature transform. In: 2011 19th European signal processing conference, pp 724–728

  44. McLoughlin IV, Xie Z, Song Y, Phan H, Palaniappan R (2020) Time-frequency feature fusion for noise-robust audio event classification. Circ Syst Sig Process 39:1672–1687. https://doi.org/10.1007/s00034-019-01203-0

    Article  Google Scholar 

  45. Montalvo A, Costa YMG, Calvo JR (2015) Language identification using spectrogram texture. In: Pardo A, Kittler J (eds) Progress in pattern recognition, image analysis, computer vision, and applications. Springer international publishing, Cham, pp 543–550. https://doi.org/10.1007/978-3-319-25751-8-65

  46. Mulimani M, Koolagudi SG (2019) Robust acoustic event classification using fusion fisher vector features. Appl Acoust 155:130–138. https://doi.org/10.1016/j.apacoust.2019.05.020

    Article  Google Scholar 

  47. Nanni L, Aguiar RL, Costa YMG, Brahnam S, Silla CN, Brattin RL, Zhao Z (2018) Bird and whale species identification using sound images. IET Comput Vis 12(2):178–184. https://doi.org/10.1049/iet-cvi.2017.0075

    Article  Google Scholar 

  48. Nanni L, Costa Y, Brahnam S (2014) Set of texture descriptors for music genre classification

  49. Nanni L, Costa Y, Lucio D, Silla C, Brahnam S (2017) Combining visual and acoustic features for audio classification tasks. Pattern Recog Lett 88:49–56. https://doi.org/10.1016/j.patrec.2017.01.013

    Article  Google Scholar 

  50. Nanni L, Costa YM, Lumini A, Kim MY, Baek SR (2016) Combining visual and acoustic features for music genre classification. Expert Syst Appl 45:108–117. https://doi.org/10.1016/j.eswa.2015.09.018

    Article  Google Scholar 

  51. Nanni L, Costa YMG, Aguiar RL, Jr CNS, Brahnam S (2018) Ensemble of deep learning, visual and acoustic features for music genre classification. J New Music Res 47(4):383–397. https://doi.org/10.1080/09298215.2018.1438476

  52. Nanni L, Costa YMG, Lucio DR, Silla C, Brahnam S (2016) Combining visual and acoustic features for bird species classification. In: 2016 IEEE 28th international conference on tools with artificial intelligence (ICTAI), pp 396–401. https://doi.org/10.1109/ICTAI.2016.0067

  53. Oo MM, Oo LL (2020) Fusion of Log-Mel spectrogram and GLCM feature in acoustic scene classification. Springer international publishing, Cham, pp 175–187. https://doi.org/10.1007/978-3-030-24344-9-11

  54. Özseven T (2018) Investigation of the effect of spectrogram images and different texture analysis methods on speech emotion recognition. Appl Acoust 142:70–77. https://doi.org/10.1016/j.apacoust.2018.08.003

    Article  Google Scholar 

  55. Rahmeni R, Ben Aicha A, Ben Ayed Y (2019) On the contribution of the voice texture for speech spoofing detection. In: 2019 19th International conference on sciences and techniques of automatic control and computer engineering (STA), pp 501–505

  56. Rakotomamonjy A, Gasso G (2015) Histogram of gradients of time-frequency representations for audio scene classification. IEEE/ACM Trans Audio Speech Lang Process 23(1):142–153. https://doi.org/10.1109/TASLP.2014.2375575

    Google Scholar 

  57. Ren J, Jiang X, Yuan J, Magnenat-Thalmann N (2017) Sound-event classification using robust texture features for robot hearing. IEEE Trans Multimed 19(3):447–458. https://doi.org/10.1109/TMM.2016.2618218

    Article  Google Scholar 

  58. Sell G, Clark P (2014) Music tonality features for speech/music discrimination. 2014. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP) pp 2489–2493. https://doi.org/10.1109/ICASSP.2014.6854048

  59. Sharan RV, Abeyratne UR, Swarnkar VR, Porter P (2019) Automatic croup diagnosis using cough sound recognition. IEEE Trans Biomed Eng 66(2):485–495. https://doi.org/10.1109/TBME.2018.2849502

    Article  Google Scholar 

  60. Sharan RV, Moir TJ (2014) Audio surveillance under noisy conditions using time-frequency image feature. In: 2014 19th International conference on digital signal processing, pp 130–135. https://doi.org/10.1109/ICDSP.2014.6900815

  61. Sharan RV, Moir TJ (2015) Cochleagram image feature for improved robustness in sound recognition. In: 2015 IEEE international conference on digital signal processing (DSP), pp 441–444. https://doi.org/10.1109/ICDSP.2015.7251910

  62. Sharan RV, Moir TJ (2015) Noise robust audio surveillance using reduced spectrogram image feature and one-against-all SVM. Neurocomputing 158:90–99. https://doi.org/10.1016/j.neucom.2015.02.001

    Article  Google Scholar 

  63. Sharan RV, Moir TJ (2015) Robust audio surveillance using spectrogram image texture feature. In: 2015 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 1956–1960. https://doi.org/10.1109/ICASSP.2015.7178312

  64. Sharan RV, Moir TJ (2015) Subband spectral histogram feature for improved sound recognition in low SNR conditions. In: 2015 IEEE international conference on digital signal processing (DSP), pp 432–435. https://doi.org/10.1109/ICDSP.2015.7251908

  65. Sharan RV, Moir TJ (2018) Pseudo-color cochleagram image feature and sequential feature selection for robust acoustic event recognition. Appl Acoust 140:198–204. https://doi.org/10.1016/j.apacoust.2018.05.030

    Article  Google Scholar 

  66. Sharma G, Umapathy K, Krishnan S (2020) Trends in audio signal feature extraction methods. Appl Acoust 158:107020. https://doi.org/10.1016/j.apacoust.2019.107020

    Article  Google Scholar 

  67. Shi X, Zhou F, Liu L, Zhao B, Zhang Z (2015) Textural feature extraction based on time-frequency spectrograms of humans and vehicles. IET Radar Sonar Navig 9(9):1251–1259. https://doi.org/10.1049/iet-rsn.2014.0432

    Article  Google Scholar 

  68. Spyrou E, Nikopoulou R, Vernikos I, Mylonas P (2019) Emotion recognition from speech using the bag-of-visual words on audio segment spectrograms. Technologies, vol 7(1). https://doi.org/10.3390/technologies7010020

  69. Valerio VD, Pereira RM, Costa YMG, Bertolini D, Silla CN (2018) A resampling approach for imbalanceness on music genre classification using spectrograms. In: Thirty-first international florida artificial intelligence research society conference (FLAIRS), pp 500–505

  70. Vyas S, Patil MD, Birajdar GK (2021) Classification of heart sound signals using time-frequency image texture features, Chapter 5, Wiley, pp 81–101. https://doi.org/10.1002/9781119818717.ch5

  71. Wakefield GH (1999) Mathematical representation of joint time-chroma distributions. pp 3807–3807-9. https://doi.org/10.1117/12.367679

  72. Wu H, Zhang M (2012) Gabor-lbp features and combined classifiers for music genre classification. In: Proceedings of the 2012 2nd international conference on computer and information application (ICCIA 2012), pp 419–423. Atlantis Press. https://doi.org/10.2991/iccia.2012.101

  73. Wu HQ, Zhang M (2013) Gabor-lbp features and combined classifiers for music genre classification. In: Information technology applications in industry, computer engineering and materials science, advanced materials research, vol 756, pp 4407-4411. Trans Tech Publications Ltd. https://doi.org/10.4028/www.scientific.net/AMR.756-759.4407

  74. Wu M, Chen Z, Jang JR, Ren J, Li Y, Lu C (2011) Combining visual and acoustic features for music genre classification. In: 2011 10th International conference on machine learning and applications and workshops, vol 2, pp 124–129. https://doi.org/10.1109/ICMLA.2011.48

  75. Wu MJ, Jang JSR (2015) Combining acoustic and multilevel visual features for music genre classification. ACM Trans Multimed Comput Commun Appl, vol 12(1). https://doi.org/10.1145/2801127

  76. Xie J, Zhu M (2019) Investigation of acoustic and visual features for acoustic scene classification. Expert Syst Appl 126:20–29. https://doi.org/10.1016/j.eswa.2019.01.085

    Article  Google Scholar 

  77. Yang W, Krishnan S, Yang W, Krishnan S (2017) Combining temporal features by local binary pattern for acoustic scene classification. IEEE/ACM Trans Audio Speech Lang Proc 25(6):1315–1321. https://doi.org/10.1109/TASLP.2017.2690558

    Article  Google Scholar 

  78. Yang X, Luo J, Wang Y, Zhao X, Li J (2018) Combining auditory perception and visual features for regional recognition of chinese folk songs. In: Proceedings of the 2018 10th international conference on computer and automation engineering, ICCAE 2018. Association for computing machinery, New York, NY, USA, pp 75–81. https://doi.org/10.1145/3192975.3193006

  79. Yasmin G, Das AK (2019) Speech and non-speech audio files discrimination extracting textural and acoustic features. In: Bhattacharyya S, Mukherjee A, Bhaumik H, Das S, Yoshida K (eds) Recent trends in signal and image processing. Springer Singapore, Singapore, pp 197–206. https://doi.org/10.1007/978-981-10-8863-6_20

  80. Ye J, Kobayashi T, Murakawa M, Higuchi T (2015) Acoustic scene classification based on sound textures and events. In: Proceedings of the 23rd ACM international conference on multimedia. Association for computing machinery, New York, NY, USA, pp 1291–1294. https://doi.org/10.1145/2733373.2806389

  81. Yu G, Slotine JJE (2009) Audio classification from time-frequency texture. In: 2009 IEEE international conference on acoustics, speech and signal processing pp 1677–1680. https://doi.org/10.1109/ICASSP.2009.4959924

  82. Zhang S, Zhao Z, Xu Z, Bellisario K, Pijanowski BC (2018) Automatic bird vocalization identification based on fusion of spectral pattern and texture features. In: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 271–275. https://doi.org/10.1109/ICASSP.2018.8462156

  83. Zhang Y, Dai S, Song W, Zhang L, Li D (2020) Exposing speech resampling manipulation by local texture analysis on spectrogram images. Electronics 9(1):1–23. https://doi.org/10.3390/electronics9010023

    Article  Google Scholar 

  84. Zhang Y, Zhang K, Wang J, Su Y (2021) Robust acoustic event recognition using AVMD-PWVD time-frequency image. Appl Acoust 178:107970. https://doi.org/10.1016/j.apacoust.2021.107970

    Article  Google Scholar 

  85. Zottesso RH, Costa Y, Bertolini D, Oliveira L (2018) Bird species identification using spectrogram and dissimilarity approach. Ecol Inform 48:187–197. https://doi.org/10.1109/ICASSP.1979.1170735

    Article  Google Scholar 

  86. Zue V, Cole R (1979) Experiments on spectrogram reading. In: ICASSP ’79. IEEE international conference on acoustics, speech, and signal processing, vol 4, pp 116–119. https://doi.org/10.1109/ICASSP.1979.1170735

  87. Zue V, Lamel L (1986) An expert spectrogram reader: a knowledge-based approach to speech recognition. In: ICASSP ’86. IEEE international conference on acoustics, speech, and signal processing, vol 11, pp 1197–1200. https://doi.org/10.1109/ICASSP.1986.1168798

Download references

Funding

Funding No funding was received to assist with the preparation of this manuscript.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Gajanan K. Birajdar.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Mistry, Y.D., Birajdar, G.K. & Khodke, A.M. Time-frequency visual representation and texture features for audio applications: a comprehensive review, recent trends, and challenges. Multimed Tools Appl 82, 36143–36177 (2023). https://doi.org/10.1007/s11042-023-14734-1

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11042-023-14734-1

Keywords

Navigation