Abstract
In scientific research, much of the data obtained and generated by instruments is in text form. How to make the best use of these data has become an issue for both natural science researchers and computer professionals. Many of these data have an internal logical structure, but they differ from self-describing semi-structured data because the data are kept separate from their schema. As data volumes grow rapidly, traditional ways of studying these data cannot meet the need for high-performance, flexible access. A relational DBMS is a well-established technology for organizing and managing data. In this paper, we propose STRIPE, a mapping model between scientific text and relational databases. Using STRIPE, we design and implement a Web-based transformation system for massive scientific data, which provides a solution to the problems of managing, querying, and exchanging such data. An evaluation of the system shows that it can greatly improve the efficiency of scientific data transformation and offers scientists a novel platform for studying the data.
This work is supported by the National Natural Science Foundation of China (No. 60573090).
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Feng, S., Song, J., Bai, X., Wang, D., Yu, G. (2006). A Web-Based Transformation System for Massive Scientific Data. In: Feng, L., Wang, G., Zeng, C., Huang, R. (eds) Web Information Systems – WISE 2006 Workshops. WISE 2006. Lecture Notes in Computer Science, vol 4256. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11906070_10
DOI: https://doi.org/10.1007/11906070_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-47663-4
Online ISBN: 978-3-540-47664-1
eBook Packages: Computer Science, Computer Science (R0)