Abstract
Hybrid XML storage offers a large number of alternative shredding choices. To determine optimal shredding strategies automatically, it is crucial to understand how the structure of an XML data set affects performance. Since the structure can take many forms and the number of possible mappings is huge, it is important to gain insight into the relation between structure and performance for formats that are actually used. By taking real-world data sets and modifying their structure in steps, one can observe how performance and other measurable properties change. We describe how a data generator can produce a synthetic data set based on an existing data set, using four different models. We compare the performance on the original data set with the performance on the synthetic data sets produced by the different models.
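The abstract does not say what the four models are, so the following is only an illustrative sketch of the general idea of deriving a synthetic data set from an existing one. It assumes one very simple, hypothetical model: record the child-tag sequences observed under each element tag in the source document, then build a new tree by sampling from those sequences. The function names `collect_child_stats` and `generate`, the `max_depth` guard, and the sample document are all assumptions made for illustration, not the authors' generator.

```python
import random
import xml.etree.ElementTree as ET
from collections import defaultdict


def collect_child_stats(root):
    """Map each element tag to the list of child-tag sequences seen in the source."""
    stats = defaultdict(list)
    for elem in root.iter():
        stats[elem.tag].append([child.tag for child in elem])
    return stats


def generate(tag, stats, rng, depth=0, max_depth=10):
    """Build a synthetic element by sampling one observed child sequence for its tag."""
    elem = ET.Element(tag)
    if depth >= max_depth:
        # Hypothetical guard so recursive structures cannot grow without bound.
        return elem
    for child_tag in rng.choice(stats[tag]):
        elem.append(generate(child_tag, stats, rng, depth + 1, max_depth))
    return elem


# Tiny stand-in for a real-world data set (text and attributes are ignored here).
source = ET.fromstring(
    "<db><entry><name/><ref/><ref/></entry><entry><name/></entry></db>"
)
stats = collect_child_stats(source)
synthetic = generate(source.tag, stats, random.Random(0))
print(ET.tostring(synthetic, encoding="unicode"))
```

A model like this preserves only local parent-child structure. Varying what is preserved (for example element counts, nesting depth, or text content) is one way to modify the structure in steps and see how measured properties change.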
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hall, D., Strömbäck, L. (2010). Generation of Synthetic XML for Evaluation of Hybrid XML Systems. In: Yoshikawa, M., Meng, X., Yumoto, T., Ma, Q., Sun, L., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 6193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14589-6_20
DOI: https://doi.org/10.1007/978-3-642-14589-6_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14588-9
Online ISBN: 978-3-642-14589-6
eBook Packages: Computer Science, Computer Science (R0)