research-article

Web Scraper Application for Extracting Scientific Journals Data

Authors:
Wahaj Salem Alkaberi, University of Jeddah, Saudi Arabia
Reem Hamed Aljuhani, University of Jeddah, Saudi Arabia
Huda Mohamed Alamoudi, University of Jeddah, Saudi Arabia

ICFNDS '21: Proceedings of the 5th International Conference on Future Networks and Distributed Systems, December 2021, Pages 220–224, https://doi.org/10.1145/3508072.3508106

Published: 13 April 2022

ABSTRACT

Searching for articles on a particular subject across scientific journals is time-consuming, because it requires scouring many digital libraries and journal websites. Web scraping can make this process more efficient: a scraper extracts web page content into organized, structured datasets. This paper proposes a customized web scraper called "Research Scraper" that extracts content from scientific journal websites and lets users access all results from a single search interface. The proposed tool is simple to use and can support the analysis of publications in a specific field. The paper presents and explains the development steps, system design, and technologies that will be used in the implementation phase.
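The abstract does not describe how Research Scraper is implemented, so the following is only a minimal sketch of the general idea it names: turning page markup from several journal sites into structured records that one search function can query. Everything in it is an assumption made for illustration, including the `Article` record, the `article-title` CSS class, and the `scrape` and `search` helpers. It uses Python's standard `html.parser` module and works on HTML strings that have already been downloaded.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Article:
    """One structured record extracted from a journal page (hypothetical schema)."""
    title: str
    url: str
    source: str


class ArticleLinkParser(HTMLParser):
    """Collects links marked with the class "article-title".

    The class name is an assumed convention for this sketch; a real
    journal site would need its own selector.
    """

    def __init__(self, source):
        super().__init__()
        self.source = source
        self.articles = []
        self._href = None   # href of the link being read, or None
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "a" and "article-title" in classes:
            self._href = attr.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace left over from the markup.
            title = " ".join("".join(self._text).split())
            self.articles.append(Article(title, self._href, self.source))
            self._href = None


def scrape(pages):
    """Extract records from a mapping of source name -> HTML text."""
    records = []
    for source, html in pages.items():
        parser = ArticleLinkParser(source)
        parser.feed(html)
        parser.close()
        records.extend(parser.articles)
    return records


def search(articles, keyword):
    """Case-insensitive title search across all scraped sources."""
    needle = keyword.lower()
    return [a for a in articles if needle in a.title.lower()]
```

Calling `scrape` on a dictionary of downloaded pages yields one combined list, and `search(records, "scraping")` then filters it by title regardless of which site a record came from. Fetching the pages, respecting each site's terms of use, and handling site-specific markup are deliberately left out of the sketch.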

Published in

ICFNDS '21: Proceedings of the 5th International Conference on Future Networks and Distributed Systems
December 2021
847 pages
ISBN: 9781450387347
DOI: 10.1145/3508072

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data extraction
scraper
web crawling
web scraping
Qualifiers
- research-article
- Research
- Refereed limited