Abstract
Archives of distributed workloads acquired at the infrastructure level reputedly lack information about users and application-level middleware. Science gateways provide consistent access points to the infrastructure and are therefore a useful source of information for addressing this issue. In this paper, we describe a workload archive acquired at the science-gateway level, and we show its added value in several case studies related to user accounting, pilot jobs, fine-grained task analysis, bags of tasks, and workflows. Results show that science-gateway workload archives can detect workloads wrapped in pilot jobs, improve user identification, provide information on the distributions of data transfer times, make bag-of-task detection accurate, and retrieve characteristics of workflow executions. Some limitations are also identified.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
da Silva, R.F., Glatard, T. (2013). A Science-Gateway Workload Archive to Study Pilot Jobs, User Activity, Bag of Tasks, Task Sub-steps, and Workflow Executions. In: Caragiannis, I., et al. Euro-Par 2012: Parallel Processing Workshops. Euro-Par 2012. Lecture Notes in Computer Science, vol 7640. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36949-0_10
DOI: https://doi.org/10.1007/978-3-642-36949-0_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36948-3
Online ISBN: 978-3-642-36949-0
eBook Packages: Computer Science (R0)