Component-based end-user database design for ecologists

Journal of Intelligent Information Systems

Abstract

To solve today’s ecological problems, scientists need well-documented, validated, and coherent data archives. Historically, however, ecologists have collected and stored data idiosyncratically, making data integration difficult even among close collaborators. Further, effective ecology data warehouses and subsequent data mining require that individual databases be accurately described with metadata against which the data themselves have been validated. Using database technology would make documenting data sets for archiving, integration, and data mining easier, but few ecologists have the expertise to use database technology, and they cannot afford to hire programmers. In this paper, we identify the benefits that would accrue from ecologists’ use of modern information technology and the obstacles that prevent that use. We describe our prototype, the Canopy DataBank, through which we aim to enable individual ecologists in the forest canopy research community to be their own database programmers. The key feature that makes this possible is domain-specific database components, which we call templates. We also show how additional tools that reuse these components, such as visualization tools, could provide gains in productivity and motivate the use of new technology. Finally, we suggest ways in which communities might share database components and how components might be used to foster easier data integration to solve new ecological problems.

Correspondence to Judith Bayard Cushing.

Cite this article

Cushing, J.B., Nadkarni, N., Finch, M. et al. Component-based end-user database design for ecologists. J Intell Inf Syst 29, 7–24 (2007). https://doi.org/10.1007/s10844-006-0028-6

  • DOI: https://doi.org/10.1007/s10844-006-0028-6
