An Efficient Method for Scientific Data Retrieval Service

Authors:
Lixin Du, Mingyue Li, Jiangying Xu

School of Information Science and Engineering, University of Jinan, Jinan, China

ICBDT '20: Proceedings of the 3rd International Conference on Big Data Technologies, September 2020, Pages 6–10. https://doi.org/10.1145/3422713.3422731

Published: 23 October 2020

ABSTRACT

Sharing scientific research data on the Internet has become an established trend in academia, and more and more data are being published openly on the web. As data volumes grow rapidly and expectations of service quality rise, the efficiency of data retrieval has become an important factor in that quality. Based on the characteristics of scientific data and the practical requirements of the Pharmaceutical Information Center (PIC, http://pharmdata.ncmi.cn), we propose an efficient retrieval method for scientific data services that can greatly improve retrieval speed and service quality. The method has two phases. The first phase uses semantic analysis to extract meaningful search keywords from scientific data, which includes constructing the set of effective keywords and eliminating the impact of invalid search keywords. The second phase constructs a Hash Index Tree (HI-Tree) over the valid keywords. The retrieval service then traverses the cached HI-Tree rather than the entire database, which minimizes database query operations. Experimental results show that, compared with traditional database retrieval methods, our method greatly improves retrieval efficiency and provides a better user experience for data services.
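
The two phases described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the internal layout of the HI-Tree, so everything below is an assumption made for illustration: the stop list `INVALID_KEYWORDS`, the whitespace tokenizer standing in for semantic analysis, the class name `HashIndexTree`, and the two-level layout (hash bucket, then keyword, then record ids). It is not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical list of invalid keywords; the paper derives these via semantic analysis.
INVALID_KEYWORDS = {"the", "of", "and", "a", "in", "for"}


def extract_keywords(text):
    """Phase 1 stand-in: lowercase, split on whitespace, drop invalid keywords."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return {w for w in words if w and w not in INVALID_KEYWORDS}


class HashIndexTree:
    """Assumed two-level in-memory index: hash bucket -> keyword -> record ids."""

    def __init__(self, bucket_count=16):
        self.bucket_count = bucket_count
        self.buckets = defaultdict(dict)

    def _bucket(self, keyword):
        # A simple deterministic byte sum; Python's str hash() varies per process.
        return sum(keyword.encode("utf-8")) % self.bucket_count

    def add(self, record_id, text):
        """Phase 2: register every valid keyword of a record in the index."""
        for kw in extract_keywords(text):
            self.buckets[self._bucket(kw)].setdefault(kw, set()).add(record_id)

    def search(self, query):
        """Return ids of records that contain every valid keyword in the query."""
        keywords = extract_keywords(query)
        if not keywords:
            return set()
        hits = [
            self.buckets.get(self._bucket(kw), {}).get(kw, set())
            for kw in keywords
        ]
        return set.intersection(*hits)


# Made-up records, used only to exercise the sketch.
index = HashIndexTree()
index.add(1, "Clinical trial data for aspirin")
index.add(2, "Aspirin dosage in pediatric patients")
index.add(3, "Genome sequencing data")
print(index.search("aspirin data"))  # {1}
```

In this sketch a query is answered entirely from the in-memory index, so the database would only be consulted afterwards to fetch the matching records by id, which is the effect the abstract attributes to traversing the cached HI-Tree.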


Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Novelty in information retrieval

Published in

ICBDT '20: Proceedings of the 3rd International Conference on Big Data Technologies
September 2020
250 pages
ISBN: 9781450387859
DOI: 10.1145/3422713

Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Data Retrieval
Data Service
Scientific Data
Semantic Analysis
Qualifiers
- research-article
- Research
- Refereed limited