Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 12630)

Abstract

This chapter highlights some of the most prominent advances in smart technologies that make up the smart city ecosystem, with a focus on automation built on the extraction and analysis of digital media, specifically speech and images. Many practical systems already exist for personalizing and recommending different kinds of media, and services in a variety of areas benefit directly from these advances. Most of them were designed around human-machine interaction, in which people interact with machines in various ways. In the past, this interaction relied on conventional interfaces such as a mouse and keyboard: the user typed a response manually, and the machine recorded it for subsequent analysis. To simplify these interactions and improve services such as recommendation and personalization, new methodologies must be studied, discovered, and developed.

Author information

Corresponding author

Correspondence to Alexander I. Iliev.

Copyright information

© 2021 Springer-Verlag GmbH Germany, part of Springer Nature

About this chapter

Cite this chapter

Iliev, A.I., Stanchev, P.L. (2021). Smart Services Using Voice and Images. In: Hameurlain, A., Tjoa, A.M., Chbeir, R. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XLVII. Lecture Notes in Computer Science, vol 12630. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-62919-2_6

  • DOI: https://doi.org/10.1007/978-3-662-62919-2_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-62918-5

  • Online ISBN: 978-3-662-62919-2

  • eBook Packages: Computer Science, Computer Science (R0)
