Distributed Platform for the Extraction and Analysis of Information

  • Conference paper
Sustainable Smart Cities and Territories (SSCTIC 2021)

Abstract

Information analysis has become a key tool today. Most large companies use generic or custom-developed applications that allow them to extract knowledge from data and translate that knowledge into greater benefit. However, in the area of information intake and processing, few tools are both generic in purpose and scalable enough to process larger amounts of information or to perform the processing tasks faster. In this article, we present a tool designed to perform all kinds of personalized searches and then apply different transformations and analyses to the information retrieved from the Internet. The platform that supports the tool is based on a distributed architecture capable of adapting automatically to the available computing resources and ensuring optimal performance for those resources, allowing it to scale to multiple machines with relative ease. The system has been designed, deployed and evaluated successfully, and is presented throughout this document.

Acknowledgments

This work has been supported by the project “XAI - XAI - Sistemas Inteligentes Auto Explicativos creados con Módulos de Mezcla de Expertos”, ID SA082P20, financed by Junta Castilla y León, Consejería de Educación, and FEDER funds.

Author information

Corresponding author

Correspondence to Francisco Pinto-Santos.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Pinto-Santos, F., Shoeibi, N., Rivas, A., Hernández, G., Chamoso, P., De La Prieta, F. (2022). Distributed Platform for the Extraction and Analysis of Information. In: Corchado, J.M., Trabelsi, S. (eds) Sustainable Smart Cities and Territories. SSCTIC 2021. Lecture Notes in Networks and Systems, vol 253. Springer, Cham. https://doi.org/10.1007/978-3-030-78901-5_18
