Airborne data processing and analysis software package

  • Software Article
  • Earth Science Informatics

Abstract

The practice of conducting quality control and quality assurance in the construction of data sets is often an overlooked and underestimated task of many research projects in the Earth Sciences. The development of software to effectively process and quickly analyze measurements is a critical aspect of a research project. An evolutionary approach has been used at the University of North Dakota to develop and implement software to process and analyze airborne measurements. Development over the past eight years has resulted in a collection of software named the Airborne Data Processing and Analysis (ADPAA) package which has been published as an open source project on Source Forge. The ADPAA package is intended to fully automate data processing while incorporating the concept of missing value codes and levels of data processing. At each data level, ADPAA utilizes a standard ASCII file format to store measurements from individual instruments into separate files. After all data levels have been processed, a summary file containing parameters of scientific interest for the field project is created for each aircraft flight. All project information is organized into a standard directory structure. ADPAA contains several tools that facilitate quality control procedures conducted on instruments during field projects and laboratory testing. Each quality control procedure is designed to ensure proper instrument performance and hence the validity of the instrument’s measurement. Data processing by ADPAA allows edit files to be created that are automatically used to insert missing value codes into a time period that had instrument problems. The creation of edit files is typically done after the completion of a field project when scientists are performing quality assurance of the data set. 
Since data processing is automatic, preliminary data can be created and analyzed within hours of an aircraft flight and a complete field project data set can be reprocessed many times during the quality assurance process. Once a final data set has been created, ADPAA provides several tools for visualization and analysis. In addition to aircraft data, ADPAA can be used on any data set that is based on time series measurements. The concepts illustrated by ADPAA and components of ADPAA, such as the Cplot visualization tool, are applicable to areas of Earth Science that work with time series measurements.

References

  • Cplot (2010) Download ADPAA Files Now. https://sourceforge.net/projects/adpaa/files/. Accessed May 2010

  • Gaines SE, Hipskind RS (2009) Format Specification for Data Exchange. http://aerosol.atmos.und.edu/ADPAA/formatspec.txt. Accessed July 2009

  • Gancarz M (2003) Linux and UNIX philosophy. Digital Press, Bedford, MA

  • General Public License (2009) GNU General Public License Version 3, 29 June 2007. http://www.gnu.org/copyleft/gpl.html. Accessed August 2009

  • Healy M, Westmacott M (1956) Missing values in experiments analysed on automatic computer. J R Stat Soc, Ser C, Appl Stat 5(3):203–206

  • Heard DE (2006) Field measurements of atmospheric composition. In: Analytical techniques for atmospheric measurement. Blackwell Publishing, Oxford, UK

  • Holzwarth S, Freer M, Bachmann M, Wang X (2010) FP7-N6SP-DN6.2.1-List of Existing Data Pre-processing Software. http://www.eufar.net/search/doc/doc_pres.php?id_doc=4343&all=1. Accessed May 2010

  • Horton NJ, Lipsitz SR (2001) Multiple imputation in practice: comparison of software packages for regression models with missing variables. Am Stat 55(3):244–254

  • Lee X, Massman W, Law B (2004) Post-field data quality control. In: Handbook of micrometeorology: a guide for surface flux measurement and analysis. Kluwer Academic Publishers, Dordrecht

  • Lerner J, Tirole J (2005) The scope of open source licensing. J Law Econ Organ 21(1):20–56. doi:10.1093/jleo/ewi002

  • Masked Arrays (2010) NumPy v1.5.dev8106 Manual (DRAFT). http://docs.scipy.org/doc/numpy/reference/maskedarray.html. Accessed May 2010

  • Matplotlib (2010) Python Plotting. http://matplotlib.sourceforge.net/. Accessed May 2010

  • Mayavi (2010) 3D Scientific Data Visualization and Plotting. http://code.enthought.com/projects/mayavi/. Accessed May 2010

  • Murray JJ, Nguyen LA, Daniels TS, Minnis P, Schaffner PR, Cagle MF, Nordeen ML, Wolff CA, Anderson MV, Mulally DJ, Jensen KR, Grainger CA, Delene DJ (2005) Tropospheric Airborne Meteorological Data and Reporting (TAMDAR) icing sensor performance during the 2003/2004 Alliance Icing Research Study (AIRS II). 43rd AIAA Aerospace Sciences Meeting and Exhibit - Meeting Papers, pp 11935–11945

  • National Aeronautics and Space Administration (1986) Report of the EOS Data Panel, Vol IIa Earth Observing System Data and Information System. Technical Memorandum 87777, National Aeronautics and Space Administration (NASA), Washington, DC

  • Noble CA, Vanderpool RW, Peters TM, McElroy FF, Gemmill DB, Wiener RW (2001) Federal reference and equivalent methods for measuring fine particulate matter. Aerosol Sci Technol 34(5):457–464

  • NumPy (2010) Scientific Computing Tools for Python. http://numpy.scipy.org/. Accessed May 2010

  • Pinch T (1985) Towards an analysis of scientific observation: the externality and evidential significance of observational reports in physics. Soc Stud Sci 15(1):3–36

  • Prenni AJ, Harrington JY, Tjernström M, DeMott PJ, Avramov A, Long CN, Kreidenweis SM, Olsson PQ, Verlinde J (2007) Can ice-nucleating aerosols affect arctic seasonal climate? Bull Am Meteorol Soc 88(4):541–550. doi:10.1175/BAMS-88-4-541

  • Pressman RS (2005) Software testing techniques. In: Software engineering: a practitioner’s approach, 6th edn. McGraw Hill, New York, pp 389–428

  • Rpy (2010) Low-level Interface to R. http://rpy.sourceforge.net/rpy2.html. Accessed May 2010

  • Rubin DB (1976) Noniterative least squares estimates, standard errors and F-Tests for analyses of variance with missing data. J R Stat Soc B Methodol 38(3):270–274

  • Science Engineering Associates (2009) M300 Data Acquisition System. http://www.scieng.com/support/m300.htm. Accessed December 2009

  • SciPy (2010) Open-source software for mathematics, science, and engineering. http://www.scipy.org/. Accessed May 2010

  • Simmhan YL, Plale B, Gannon D (2005) A survey of data provenance in e-science. SIGMOD Rec 34(3):31–36. http://doi.acm.org/10.1145/1084805.1084812

  • Source Forge (2009) Airborne Data Processing and Analysis. http://sourceforge.net/projects/adpaa/. Accessed July 2009

  • Subramanian GH, Gary K, Jiang JJ, Chien-Lung C (2009) Balancing four factors in system development projects. Commun ACM 52(10):118–121

  • Sukovich EM, Kingsmill DE, Yuter SE (2009) Variability of graupel and snow observed in tropical oceanic convection by aircraft during TRMM KWAJEX. J Appl Meteorol Climatol 48(2):185–198

Acknowledgments

Several research projects have indirectly funded the development of the ADPAA software package and several people have helped with the software development. Roelof Burger and Duncan Axisa have helped with the development of the data directory structure. Chris Kruse, David Keith, Gökhan Sever, Fred Remer, Mike Poellot and Aaron Bansemer have reviewed and commented on draft versions of the manuscript.

Author information

Corresponding author

Correspondence to David J. Delene.

Additional information

Communicated by: H.A. Babaie

Appendices

Appendix A

ADPAA uses a standard ASCII data file which contains meta-data in the header. An example file is given below. Variable labels in curly brackets are added on the left with an explanation of the labels given below the example data file.

Figure a. Example ADPAA ASCII data file, with variable labels in curly brackets on the left.

NLHEAD: Number of lines (integer) composing the file header. NLHEAD is the first recorded value on the first line of an exchange file.

FFI: ASCII file format number. For the UND Citation aircraft data this will always be 1001.

ONAME: A character string specifying the name(s) of the originator(s) of the exchange file, last name first. On one line and not exceeding 132 characters.

ORG: Character string specifying the organization or affiliation of the originator of the exchange file. Can include address, phone number, email address, etc. On one line and not exceeding 132 characters.

SNAME: A character string specifying the source of the measurements or model results which compose the primary variables, on one line and not exceeding 132 characters. Can include instrument name, measurement platform, etc.

MNAME: A character string specifying the name of the field project that the data were obtained from.

IVOL: Volume number (integer) of the total number of volumes required to store a complete dataset, assuming only one file per volume. To be used in conjunction with NVOL to allow data exchange of large data sets requiring more than one volume of the exchange medium (diskette, etc.).

NVOL: Total number of volumes (integer) required to store the complete dataset, assuming one file per volume. If NVOL>1 then each volume must contain a file header with an incremented value for IVOL, and continue the data records with monotonic independent variable marks.

DATE: UT date at which the data within the exchange file begins. For aircraft data files DATE is the UT date of takeoff. DATE is in the form YYYY MM DD (year, month, day) with each integer value separated by at least one space. For example: 1989 1 16 or 1989 01 16 for 16 January 1989.

RDATE: Date of data reduction or revision, in the same form as DATE.

DX(s): Interval (real) between values of the s-th independent variable, X(i,s), i=1,NX(s); in the same units as specified in XNAME(s). DX(s) is zero for a non-uniform interval. DX(s) is non-zero for a constant interval. If DX(s) is non-zero then it is required that NX(s)=(X(NX(s),s)-X(1,s)) / DX(s)+1. For some file formats the value of DX also depends on the unbounded independent variable and is expressed as DX(m,s).

XNAME(s): A character string giving the name and/or description of the s-th independent variable, on one line and not exceeding 132 characters. Include units of measure and order the independent variable names such that, when reading primary variables from the data records, the most rapidly varying independent variable is listed first and the most slowly varying independent variable is listed last. For all UND exchange files, this is currently Time [Seconds] from midnight on the day the aircraft flight started.

NV: Number of primary variables in the exchange file (integer). This number plus one (for the time value) gives the number of parameters in the data file.

VSCAL(n): Scale factor by which one multiplies recorded values of the n-th primary variable to convert them to the units specified in VNAME(n). Currently this is 1 for all UND Citation Aircraft recorded values.

VMISS(n): A quantity indicating missing or erroneous data values for the n-th primary variable. VMISS(n) must be larger than any “good” data value of the n-th primary variable recorded in the file. The value of VMISS(n) defined in the file header is the same value that appears in the data records for missing/bad values of V(X,n). Currently the majority of UND parameters use a VMISS value of 999999.9999.

VNAME(n): A character string giving the name and/or description of the n-th primary variable, on one line and not exceeding 132 characters. Include units of measure the data will have after multiplying by the n-th scale factor, VSCAL(n). The order in which the primary variable names are listed in the file header is the same order in which the primary variables are read from the data records, and the same order in which scale factors and missing values for the primary variables are read from the file header records.

NSCOML: Number of special comment lines (integer) within the file header. Special comments are reserved to note special problems or circumstances concerning the data within a specific exchange file so they may easily be found and flagged by those reading the file. If NSCOML=0 then there are no special comment lines.

NNCOML: Number of normal comment lines (integer) within the file header, including blank lines and data column headers, etc. Normal comments are those which apply to all of a particular kind of dataset, and can be used to more completely describe the contents of the file. If NNCOML=0 then there are no normal comment lines.

DTYPE: Version description of the data. Typically either Preliminary or Final Data.

VFREQ: Time frequency of the data.

VDESC: A character string on a single line containing a short description of each variable in the exchange file. No spaces are allowed in each short variable description.

VUNITS: A character string on a single line containing the units of each variable in the exchange file. No spaces are allowed in each unit’s description.
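The header fields defined above can be read programmatically. The following is a minimal sketch, not ADPAA's own reader. It assumes the usual record order of the FFI 1001 exchange format (NLHEAD and FFI on line 1, then ONAME, ORG, SNAME, MNAME, IVOL NVOL, DATE RDATE, DX, XNAME, NV, VSCAL, VMISS, and NV lines of VNAME), and it assumes VSCAL and VMISS each fit on a single line. The function name `read_1001` and all values in the sample file are hypothetical.

```python
import math

# Made-up exchange file: 19 header lines followed by three data records.
SAMPLE = """\
19 1001
Lastname, Firstname
Example University
Example temperature and pressure probes
ExampleProject
1 1
2008 03 08 2008 03 09
1.0
Time [Seconds] from midnight on day aircraft flight started
2
1 1
999999.9999 999999.9999
Air temperature [degC]
Static pressure [hPa]
0
3
Example normal comment line
Time Temp Pres
sec degC hPa
27953.0 12.5 850.1
27954.0 999999.9999 850.0
27955.0 12.7 849.9
"""


def read_1001(text):
    """Parse header values and data records from FFI 1001 text (assumed layout)."""
    lines = text.splitlines()
    nlhead, ffi = (int(v) for v in lines[0].split()[:2])
    if ffi != 1001:
        raise ValueError(f"unsupported file format number: {ffi}")
    dx = float(lines[7].split()[0])      # DX: interval of the independent variable
    nv = int(lines[9].split()[0])        # NV: number of primary variables
    vscal = [float(v) for v in lines[10].split()]
    vmiss = [float(v) for v in lines[11].split()]
    vname = lines[12:12 + nv]

    times, columns = [], [[] for _ in range(nv)]
    for record in lines[nlhead:]:        # data records start after NLHEAD lines
        if not record.strip():
            continue
        fields = [float(v) for v in record.split()]
        times.append(fields[0])
        for n, raw in enumerate(fields[1:nv + 1]):
            # A recorded value equal to VMISS(n) is missing; otherwise apply VSCAL(n).
            columns[n].append(math.nan if raw == vmiss[n] else raw * vscal[n])

    # For non-zero DX, check NX = (X(NX) - X(1)) / DX + 1.
    nx_ok = dx == 0 or not times or len(times) == round((times[-1] - times[0]) / dx) + 1
    return {"dx": dx, "vname": vname, "time": times, "data": columns, "nx_consistent": nx_ok}


result = read_1001(SAMPLE)
print(result["vname"])
print(result["data"][0])
```

Converting the missing value code to NaN at read time keeps it out of later averages and plots; a masked array, as provided by NumPy, would serve the same purpose.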

Appendix B

The directory structure for ADPAA data sets is described below, starting from the top of the directory tree and working downward. Each level is defined by a name, notes, and examples. The name is one or two words that identify the level. The notes give a short description of the level. The examples list directories related to the Saudi Arabia 2007/2008 Winter field project. Items in the listings below are indented to indicate the directory level that contains them. The general structure of the directory tree used by ADPAA is as follows.

  • Project Name/

    • General Time Period/

      • General Data Type/

        • General Instrument Type/

          • Measurement Purpose/

            • Particular Time/

              • Particular Data Type/
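The seven levels can be treated as an ordered list of path components. The sketch below is an illustration only and is not part of ADPAA; the helper names `build_path` and `describe` are hypothetical, and the directory names are taken from the Saudi Arabia Winter 2007/2008 example in this appendix.

```python
from pathlib import PurePosixPath

# Level names of the general directory structure, from top to bottom.
LEVELS = (
    "Project Name",
    "General Time Period",
    "General Data Type",
    "General Instrument Type",
    "Measurement Purpose",
    "Particular Time",
    "Particular Data Type",
)


def build_path(*names):
    """Join one directory name per level, top level first."""
    if not 1 <= len(names) <= len(LEVELS):
        raise ValueError(f"expected 1 to {len(LEVELS)} directory names")
    return PurePosixPath(*names)


def describe(path):
    """Map each component of a relative path to the level it occupies."""
    return dict(zip(LEVELS, PurePosixPath(path).parts))


path = build_path(
    "SaudiArabia", "Winter0708", "Aircraft", "KingAir_N825ST",
    "Flight", "20080308_074553", "PostProcessing",
)
print(path)
for level, name in describe(path).items():
    print(f"{level}: {name}")
```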

A specific example of the directory tree used by ADPAA is given below. Each NAME entry identifies the level of the general directory structure being described.

NAME:

Project Name

NOTES:

This is the top of the directory tree. It groups projects by geographical regions.

EXAMPLES:

  • SaudiArabia/

  • Mali/

NAME:

General Time Period

NOTES:

Groups time periods together that span a single deployment or similar atmospheric conditions. All sub-directories will follow a similar structure.

EXAMPLES:

  • Mali/

  • SaudiArabia/

    • Spring07/

    • Winter0708/

    • Summer08/

NAME:

General Data Type

NOTES:

Groups different types of data together based on where they are obtained.

EXAMPLES:

  • Mali/

  • SaudiArabia/

    • Spring07/

    • Winter0708/

      • Aircraft/

        • Directory that contains all aircraft data from the winter project, located in the Winter0708 folder.

      • Documents/

        • Directory that contains documents created for or related to the winter project.

      • Forecast/

        • Directory that contains the forecast data for the winter project. Forecast data is grouped into year, month, day sub-directories.

    • Summer08/

NAME:

General Instrument Type

NOTES:

Groups data from different platforms and instruments together.

EXAMPLES:

  • Mali/

  • SaudiArabia/

    • Spring07/

    • Winter0708/

      • Aircraft/

        • KingAir_N825ST/

      • Documents/

      • Forecast/

    • Summer08/

NAME:

Measurement Purpose

NOTES:

Groups data that have a similar purpose together.

EXAMPLES:

  • Mali/

  • SaudiArabia/

    • Spring07/

    • Winter0708/

      • Aircraft/

        • KingAir_N825ST/

          • DMTCCNCTest/

            • Contains all test data for the DMT and CCNC instruments.

          • Documents/

            • Directory for aircraft specific documentation.

          • Flight/

            • Directory for the aircraft flight data.

          • GroundChecks/

            • Directory for the data related to all calibration checks and ground tests.

      • Documents/

      • Forecast/

    • Summer08/

NAME:

Particular Time

NOTES:

Groups flights that have similar start times together. The directory name is based on the start time of the data. If a flight has more than one file, the directory should be named YYYYMMDD_?, where ? is the number of the flight. Directory names below this level should then follow the standard structure.

EXAMPLES:

  • Mali/

  • SaudiArabia/

    • Spring07/

    • Winter0708/

      • Aircraft/

        • KingAir_N825ST/

          • DMTCCNCTest/

          • Documents/

          • Flight/

            • 20080308_074553/

              • Directory should have a name of YYYYMMDD_HHMMSS. The name is unique. Additional information is given in the upper-level directory or can be put in a readme file in the directory itself.

          • GroundChecks/

      • Documents/

      • Forecast/

    • Summer08/
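Because the Particular Time directory name encodes the start time as YYYYMMDD_HHMMSS, it can be decoded with the standard library. This is a small sketch under that naming assumption only; the function name `parse_particular_time` is hypothetical, and the YYYYMMDD_? form is not handled here.

```python
from datetime import datetime


def parse_particular_time(name):
    """Decode a YYYYMMDD_HHMMSS directory name into a start time."""
    start = datetime.strptime(name, "%Y%m%d_%H%M%S")  # raises ValueError if malformed
    # Seconds from midnight on the start day, the same unit used for the
    # time variable in the Appendix A data files.
    seconds = start.hour * 3600 + start.minute * 60 + start.second
    return {"start": start, "seconds_from_midnight": seconds}


info = parse_particular_time("20080308_074553")
print(info["start"].isoformat())
print(info["seconds_from_midnight"])
```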

NAME:

Particular Data Type

NOTES:

Groups data together based on particular data type.

EXAMPLES:

  • Mali/

  • SaudiArabia/

    • Spring07/

    • Winter0708/

      • Aircraft/

        • KingAir_N825ST/

          • DMTCCNCTest/

          • Documents/

          • Flight/

            • 20080308_074553/

              • Combined/

                • Directory that contains data from multiple data streams.

              • Notes/

                • Directory that contains flight notes and reports.

              • M300_Tables/

                • Contains the M300 tables used during the flight.

              • Photos/

                • Directory for digital images from the flight.

              • PostProcessing/

                • Directory for the post-processing data stream which is based on the *.sea file.

              • QuickChecks/

                • Directory for plots of the data.

              • RealTime/

                • Directory for the real-time data stream which is based on the M300 formula tables.

                • Directory contains the *.txt, *.csv and *.raw files.

              • Tamu/

                • Directory that contains the DMA and DMT CCNC data streams.

              • Video/

                • Directory to store any video from the flight.

          • GroundChecks/

      • Documents/

      • Forecast/

    • Summer08/
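A flight directory with the Particular Data Type sub-directories listed above could be laid out before data are copied in. The following is a hedged sketch using only the Python standard library; it is not an ADPAA tool, the function name `make_flight_tree` is hypothetical, and the example writes to a temporary directory rather than a real project tree.

```python
import tempfile
from pathlib import Path

# Sub-directory names taken from the Particular Data Type example above.
FLIGHT_SUBDIRS = (
    "Combined", "Notes", "M300_Tables", "Photos", "PostProcessing",
    "QuickChecks", "RealTime", "Tamu", "Video",
)


def make_flight_tree(root, flight_dir, subdirs=FLIGHT_SUBDIRS):
    """Create root/flight_dir with one sub-directory per data type."""
    base = Path(root) / flight_dir
    for name in subdirs:
        # exist_ok makes the call safe to repeat on an existing tree.
        (base / name).mkdir(parents=True, exist_ok=True)
    return sorted(p.name for p in base.iterdir() if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    created = make_flight_tree(tmp, "20080308_074553")
    print(created)
```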

Cite this article

Delene, D.J. Airborne data processing and analysis software package. Earth Sci Inform 4, 29–44 (2011). https://doi.org/10.1007/s12145-010-0061-4
