Abstract
Several atomic and molecular data centers around the world maintain research databases for use in fusion plasma simulations, hadron therapy, modelling of the universe and other areas. Among the activities of these data centers, the worldwide collection of experimental and theoretical results has been of major importance. This includes the identification, relevance assessment and retrieval of journal articles, followed by data extraction, data mining, format conversion and data input. The process still relies largely on working groups of specialists and part-time human labor, in spite of the recent modernization of journal publishing, especially the electronic journals newly available by subscription and the free-access online abstract databases. This work focuses on automating the above procedure to the maximum extent possible. In particular, we design a download robot that, in the first stage, performs query searches and abstract retrieval over the internet for candidate relevant articles, followed by full-text retrieval (PDF format), text extraction and a deterministic relevance judgement. As a demonstration, we have also developed a bibliography database for electron-molecule collisions that automatically updates its contents over the internet at regular time intervals. The present work is part of the project for an evolutional data collecting system, supported by a JSPS project that involves several research institutes.
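The abstract does not say which rules the deterministic relevance judgement applies. The sketch below is only one possible reading, assuming a fixed keyword rule applied to the extracted text of an article on electron-molecule collisions. The term lists, the threshold and the function names (normalize, count_hits, is_relevant) are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical term lists: the abstract does not specify the actual rules.
PROJECTILE_TERMS = ("electron",)
TARGET_TERMS = ("molecule", "molecular")
PROCESS_TERMS = ("collision", "scattering", "cross section",
                 "ionization", "excitation")


def normalize(text: str) -> str:
    """Lower-case the text and collapse whitespace, so that line breaks
    left over from PDF text extraction do not split multi-word terms."""
    return re.sub(r"\s+", " ", text.lower())


def count_hits(text: str, terms) -> int:
    """Count occurrences of the given terms by plain substring matching."""
    return sum(text.count(term) for term in terms)


def is_relevant(text: str, min_process_hits: int = 2) -> bool:
    """Deterministic rule (an assumption): the text must mention the
    projectile, the target, and at least min_process_hits process terms."""
    norm = normalize(text)
    return (
        count_hits(norm, PROJECTILE_TERMS) >= 1
        and count_hits(norm, TARGET_TERMS) >= 1
        and count_hits(norm, PROCESS_TERMS) >= min_process_hits
    )


if __name__ == "__main__":
    sample = ("Cross sections for electron\ncollisions with water "
              "molecules: excitation and ionization.")
    print(is_relevant(sample))
```

Because the rule is a fixed function of the text, the same article always receives the same verdict, which is the sense of "deterministic" assumed here. Plain substring matching is crude (for example, "electronic" also matches "electron"), so a practical rule set would need stricter matching than this sketch.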
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pichl, L., Suzuki, M., Joe, K., Sasaki, A. (2005). Networked Mining of Atomic and Molecular Data from Electronic Journal Databases on the Internet. In: Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2005. Lecture Notes in Computer Science, vol 3433. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31970-2_13
DOI: https://doi.org/10.1007/978-3-540-31970-2_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25361-7
Online ISBN: 978-3-540-31970-2
eBook Packages: Computer Science, Computer Science (R0)