Experience with BXGrid: a data repository and computing grid for biometrics research

Cluster Computing

Abstract

Research in biometrics depends on the effective management and analysis of many terabytes of digital data. The quality of an experimental result often depends heavily on the sheer amount of data marshalled to support it. However, the current state of the art requires researchers to have a heroic level of expertise in systems software to perform large-scale experiments. To address this, we have designed and implemented BXGrid, a data repository and workflow abstraction for biometrics research. The system is composed of a relational database, an active storage cluster, and a campus computing grid. End users interact with the system through a high-level abstraction of four stages: Select, Transform, AllPairs, and Analyze. A high degree of availability and reliability is achieved through transparent failover, three-phase operations, and independent auditing. BXGrid is currently in daily production use by an active biometrics research group at the University of Notre Dame. We discuss our experience in constructing and using the system and offer lessons learned in conducting collaborative research in e-Science.
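
To make the four-stage abstraction concrete, the following is a minimal in-memory sketch of how Select, Transform, AllPairs, and Analyze might compose. It is not BXGrid's actual interface: the function names, the record fields (`id`, `kind`, `value`), the distance function, and the summary produced by `analyze` are all hypothetical and chosen only for illustration. The one structural point it relies on is that an AllPairs stage compares every element of one set against every element of another and yields a matrix of results.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]  # hypothetical metadata record, not BXGrid's schema


def select(records: List[Record], predicate: Callable[[Record], bool]) -> List[Record]:
    """Select: keep only the records that satisfy a query predicate."""
    return [r for r in records if predicate(r)]


def transform(records: List[Record], extract: Callable[[Record], Any]) -> List[Any]:
    """Transform: reduce each selected record to a comparable feature."""
    return [extract(r) for r in records]


def all_pairs(set_a: List[Any], set_b: List[Any],
              compare: Callable[[Any, Any], Any]) -> List[List[Any]]:
    """AllPairs: matrix[i][j] = compare(set_a[i], set_b[j])."""
    return [[compare(a, b) for b in set_b] for a in set_a]


def analyze(matrix: List[List[int]], max_distance: int) -> Dict[str, int]:
    """Analyze: a hypothetical summary counting pairs within a distance bound."""
    scores = [s for row in matrix for s in row]
    return {"pairs": len(scores),
            "matches": sum(1 for s in scores if s <= max_distance)}


# Toy data standing in for repository contents (assumed, not from the paper).
records = [
    {"id": 1, "kind": "iris", "value": 10},
    {"id": 2, "kind": "face", "value": 99},
    {"id": 3, "kind": "iris", "value": 12},
    {"id": 4, "kind": "iris", "value": 30},
]

selected = select(records, lambda r: r["kind"] == "iris")
features = transform(selected, lambda r: r["value"])
matrix = all_pairs(features, features, lambda a, b: abs(a - b))
summary = analyze(matrix, max_distance=5)
```

In a real deployment each stage would presumably run against the database, storage cluster, and computing grid described above rather than Python lists; the sketch only shows how the output of one stage feeds the next.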

Author information

Corresponding author

Correspondence to Douglas Thain.

About this article

Cite this article

Bui, H., Kelly, M., Lyon, C. et al. Experience with BXGrid: a data repository and computing grid for biometrics research. Cluster Comput 12, 373–386 (2009). https://doi.org/10.1007/s10586-009-0098-7

  • DOI: https://doi.org/10.1007/s10586-009-0098-7
