Data Extraction Based on Web Scrapy

Alshammari, Reem; Alrashed, Rwan; Almutiri, Atheer; Alwalah, Mathail; Al-marrai, Wadha; Alqahtani, Dana; Alzahrani, Amani

doi:10.1007/978-3-030-73603-3_47

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1372)

Included in the following conference series:

International Conference on Innovations in Bio-Inspired Computing and Applications

Abstract

In recent years, as big data has grown, the amount of information on the Internet has increased rapidly, and technologies that help users obtain information efficiently and easily have become important. Many undergraduate students buy books and school supplies through online stores such as Amazon and other websites. The question is how to obtain information from these websites effectively. This paper implements a Python web scraper that extracts data from the target websites and stores the collected data in a comma-separated values (CSV) file. The system aims to collect information about the products that undergraduate students need from the target websites and return it to users on a simple page. Users can search for a product they want information about, and the crawler collects the product name, product price, and product URL from the following online stores: Amazon, eBay, Jarir, and Extra. It then stores these fields in the CSV file for analysis. The crawler scraped and saved 9083 records to the CSV file. The importance of the system lies in its ability to reach products quickly and efficiently and to enable price comparison of similar products across different online stores, thereby saving search time.
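
The pipeline described in the abstract (extract product name, price, and URL, store the records in a CSV file, then compare prices across stores) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no code, and to keep the example runnable without dependencies it uses Python's standard `html.parser` and `csv` modules in place of Scrapy. The HTML snippets, the CSS class names (`product`, `title`, `price`), the store labels, and the helper names `scrape`, `write_csv`, and `cheapest` are all hypothetical; real store pages use different markup.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical listing markup for two made-up stores (an assumption, not real pages).
STORE_A_HTML = """
<ul>
  <li class="product"><a class="title" href="https://a.example/p/1">Graph Notebook</a><span class="price">12.50</span></li>
  <li class="product"><a class="title" href="https://a.example/p/2">Scientific Calculator</a><span class="price">59.00</span></li>
</ul>
"""
STORE_B_HTML = """
<ul>
  <li class="product"><a class="title" href="https://b.example/p/9">Scientific Calculator</a><span class="price">54.75</span></li>
</ul>
"""

FIELDS = ["store", "name", "price", "url"]


class ProductParser(HTMLParser):
    """Collects name, price, and URL for each <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._current = None  # record being built, or None outside a product
        self._field = None    # field that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class")
        if tag == "li" and css_class == "product":
            self._current = {"name": "", "price": "", "url": ""}
        elif self._current is not None and tag == "a" and css_class == "title":
            self._current["url"] = attrs.get("href") or ""
            self._field = "name"
        elif self._current is not None and tag == "span" and css_class == "price":
            self._field = "price"

    def handle_data(self, data):
        if self._current is not None and self._field is not None:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None
        elif tag == "li" and self._current is not None:
            self.records.append(self._current)
            self._current = None


def scrape(html, store):
    """Parse one page and return records tagged with the store label."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return [
        {"store": store, "name": r["name"], "price": float(r["price"]), "url": r["url"]}
        for r in parser.records
    ]


def write_csv(records, handle):
    """Write records to an open text handle as comma-separated values."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


def cheapest(records, keyword):
    """Return the lowest-priced record whose name contains keyword, or None."""
    matches = [r for r in records if keyword.lower() in r["name"].lower()]
    return min(matches, key=lambda r: r["price"]) if matches else None


records = scrape(STORE_A_HTML, "store-a") + scrape(STORE_B_HTML, "store-b")
buffer = io.StringIO()  # stands in for a file opened with newline=""
write_csv(records, buffer)
best = cheapest(records, "calculator")
```

In a real deployment the HTML would come from HTTP responses rather than inline strings, and the in-memory buffer would be replaced by a file on disk. The keyword match in `cheapest` is a deliberately naive stand-in for whatever product-matching rule a full system would apply.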

Author information

Authors and Affiliations

Computer Department, College of Science and Humanities in Jubail, Imam Abdulrahman Bin Faisal University, Jubail Industrial City, Kingdom of Saudi Arabia
Reem Alshammari, Rwan Alrashed, Atheer Almutiri, Mathail Alwalah, Wadha Al-marrai, Dana Alqahtani & Amani Alzahrani

Corresponding author

Correspondence to Reem Alshammari.

Editor information

Editors and Affiliations

Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs (MIR Labs), Auburn, WA, USA
Ajith Abraham
National Institute of Information and Communications Technology (NICT), Koganei, Tokyo, Japan
Hideyasu Sasaki
Universidade Federal da Bahia, Salvador, Brazil
Ricardo Rios
Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs (MIR Labs), Auburn, WA, USA
Niketa Gandhi
Institute of Technology and Science, Ghaziabad, Uttar Pradesh, India
Umang Singh
School of Information Science and Engineering, University of Jinan, Jinan, Shandong, China
Kun Ma

About this paper

Cite this paper

Alshammari, R. et al. (2021). Data Extraction Based on Web Scrapy. In: Abraham, A., Sasaki, H., Rios, R., Gandhi, N., Singh, U., Ma, K. (eds) Innovations in Bio-Inspired Computing and Applications. IBICA 2020. Advances in Intelligent Systems and Computing, vol 1372. Springer, Cham. https://doi.org/10.1007/978-3-030-73603-3_47

DOI: https://doi.org/10.1007/978-3-030-73603-3_47
Published: 10 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73602-6
Online ISBN: 978-3-030-73603-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
