
Geoprocessing Methods

This page describes the geoprocessing methods used to produce global variables for entry into GLOBE by converting global datasets from their native formats into the GLOBE Land Unit (GLU) data structure.

This page is part of the Overall Procedure Creating GLOBE’s Global Variables, which must be fully completed to submit a new global variable.

  • These methods are based on the use of ArcGIS (10.0 or later).
  • Geoprocessing methods are identified by code in the tables below; this code must be included in the documentation for each global variable, as noted in the GLOBE Global Variable Metadata Template.
  • The most fundamental method used here is Zonal Statistics, which computes a summary value (e.g. MEAN, MAJORITY) within each GLU (“Zone”) from a global dataset. Using Zonal Statistics, a database (table) can be produced with the values of a global dataset specified for each GLU.
  • The native form of the GLUs is Hexagonal Polygons (VECTOR).
  • However, it is generally more efficient for geoprocessing to represent the GLUs using a 30 arc second raster dataset.
  • Other geoprocessing methods can also work; to discuss alternatives, try the global variable forum.
  • This documentation remains incomplete: please contact globe@umbc.edu if you plan to process a global variable for upload into GLOBE.
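
As a rough illustration of what Zonal Statistics computes, the sketch below uses plain Python rather than ArcGIS. It assumes the GLU raster and the value raster are aligned grids of equal shape (small nested lists here), with None standing in for NoData. The function name zonal_statistics and the sample grids are made up for illustration and are not part of GLOBE or ArcGIS.

```python
from collections import Counter, defaultdict


def zonal_statistics(zone_grid, value_grid, statistic="MEAN"):
    """Summarize value_grid within each zone of zone_grid.

    Both grids are assumed to be aligned lists of rows with equal shape.
    None marks NoData in either grid and is skipped.
    """
    cells = defaultdict(list)
    for zone_row, value_row in zip(zone_grid, value_grid):
        for zone, value in zip(zone_row, value_row):
            if zone is not None and value is not None:
                cells[zone].append(value)

    table = {}
    for zone, values in cells.items():
        if statistic == "MEAN":
            table[zone] = sum(values) / len(values)
        elif statistic == "MAJORITY":
            # Ties are broken by first occurrence in this sketch.
            table[zone] = Counter(values).most_common(1)[0][0]
        else:
            raise ValueError("unsupported statistic: " + statistic)
    return table


# Hypothetical 3 x 4 example with two GLUs (IDs 101 and 102).
glu_ids = [
    [101, 101, 102, 102],
    [101, 101, 102, 102],
    [101, None, 102, 102],
]
values = [
    [2.0, 4.0, 1.0, 1.0],
    [6.0, None, 3.0, 1.0],
    [8.0, 5.0, 1.0, 5.0],
]
mean_table = zonal_statistics(glu_ids, values, "MEAN")
majority_table = zonal_statistics(glu_ids, values, "MAJORITY")
```

Each key of the returned dictionary plays the role of GLU_ID, and each value is the summary for that GLU, which mirrors the one-row-per-zone table described above.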

The optimal geoprocessing method depends on the native structure of the global dataset, specifically:

Decision Tree for Selecting Optimal Geoprocessing Method

I. Native Global Dataset is RASTER
     I.A. Resolution = 30 arc second: 1
     I.B. Resolution >> 30 arc second: 2

II. Native Global Dataset is VECTOR (Polygons)
     II.A. Polygons significantly larger than GLUs: 5
     II.B. Polygons significantly smaller than GLUs: <TBD>
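
The decision tree can be written down as a small lookup function. This is only a sketch: the function and parameter names are hypothetical, and it reads "Resolution >> 30 arc second" as a cell size larger (coarser) than 30 arc seconds, which is an assumption. The <TBD> branch is returned as None.

```python
def select_method(form, cell_size_arcsec=None, polygons_larger_than_glus=None):
    """Return the geoprocessing method code from the decision tree.

    Returns None for the branch that is still <TBD>.
    Assumption: ">> 30 arc second" means a cell size greater than 30.
    """
    form = form.upper()
    if form == "RASTER":
        if cell_size_arcsec is None:
            raise ValueError("cell_size_arcsec is required for raster datasets")
        if cell_size_arcsec == 30:
            return 1  # branch I.A
        if cell_size_arcsec > 30:
            return 2  # branch I.B
        raise ValueError("cell sizes below 30 arc seconds are not in the tree")
    if form == "VECTOR":
        if polygons_larger_than_glus is None:
            raise ValueError("polygons_larger_than_glus is required for vector datasets")
        # Larger polygons use method 5; smaller polygons are still <TBD>.
        return 5 if polygons_larger_than_glus else None
    raise ValueError("form must be RASTER or VECTOR")


code_for_30_arcsec = select_method("raster", cell_size_arcsec=30)
code_for_half_degree = select_method("raster", cell_size_arcsec=1800)
code_for_large_polygons = select_method("vector", polygons_larger_than_glus=True)
```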

Final database structure for GLOBE:

  • Database in .dbf file format with two data fields (columns):
    • GLU_ID: the unique integer code for each GLU.
    • A field holding the global variable value for each GLU. The field name should be the short, 10-character name of the global variable.
    • The database may contain more than these two fields, but additional fields will not be used and will increase file size.
  • More information on global variables is requested in the GLOBE Global Variable Metadata Template (an Excel 2007 “macro-enabled” workbook). It is best opened with Microsoft Excel 2007 or later, though earlier versions of Excel may be updated to open it. Be sure to “Enable Content” if requested: the file contains a single macro for visualizing map legends and does not contain harmful code.
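
Before exporting to .dbf, the table layout described above can be checked with a few lines of code. The sketch below is not an official GLOBE validator: the function name, the row format (a list of dictionaries) and the example field name POP_DENS are all hypothetical, and "10 character" is read as "at most 10 characters", which matches the field name limit of the .dbf format.

```python
def check_globe_table(value_field, rows):
    """Return a list of problems found in a table destined for GLOBE.

    rows is assumed to be a list of dicts, one per GLU, each holding
    "GLU_ID" and the value field. An empty list means no problems found.
    """
    problems = []
    if not 1 <= len(value_field) <= 10:
        problems.append("field name %r is not 1 to 10 characters long" % value_field)

    seen_ids = set()
    for number, row in enumerate(rows, start=1):
        glu_id = row.get("GLU_ID")
        if isinstance(glu_id, bool) or not isinstance(glu_id, int):
            problems.append("row %d: GLU_ID is not an integer" % number)
        elif glu_id in seen_ids:
            problems.append("row %d: duplicate GLU_ID %d" % (number, glu_id))
        else:
            seen_ids.add(glu_id)
        if value_field not in row:
            problems.append("row %d: missing field %s" % (number, value_field))
    return problems


# Hypothetical rows; POP_DENS is an invented field name.
good_rows = [{"GLU_ID": 101, "POP_DENS": 12.5}, {"GLU_ID": 102, "POP_DENS": 0.0}]
bad_rows = [{"GLU_ID": 101, "POP_DENS": 12.5}, {"GLU_ID": 101}]
good_report = check_globe_table("POP_DENS", good_rows)
bad_report = check_globe_table("POP_DENS", bad_rows)
```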

 

Table 1. Main Geoprocessing Methods

Code | Native Data Description | Procedure Summary
1 | 30 arc second grid, no extrapolation needed | Process using the ‘Zonal Statistics as Table’ tool with the 30 arc second GLU raster “GLU_ID” specified as zones.
2 | Greater than 30 arc second resolution, no extrapolation needed | Convert to 30 arc second and process using the ‘Zonal Statistics as Table’ tool with the 30 arc second GLU raster “GLU_ID” specified as zones.
3 | Greater than 30 arc second resolution, extrapolation needed++ | Convert to 30 arc second, extrapolate to cover the extent of LandScan 2007 using the ‘Focal Statistics’ tool, and process using the ‘Zonal Statistics as Table’ tool with the 30 arc second GLU raster “GLU_ID” specified as zones. ++See instructions below for extrapolating.
4 | Greater than 30 arc second resolution, no extrapolation, ordinal variable | Convert to 30 arc second, reclassify into numerical classes, and process using the ‘Zonal Statistics as Table’ tool with the 30 arc second GLU raster “GLU_ID” specified as zones.
5 | Polygon | Convert features to a 30 arc second raster and process using the ‘Zonal Statistics as Table’ tool with the 30 arc second GLU raster “GLU_ID” specified as zones.

++Instructions for Extrapolating Rasters: (contact: globe@umbc.edu)
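
To make the "convert to 30 arc second" and "reclassify into numerical classes" steps of methods 2 and 4 concrete, here is a minimal pure-Python sketch. It assumes nearest-neighbour refinement by an integer factor (for example, a 60 arc second grid refined by a factor of 2); the resampling technique actually used in ArcGIS is not stated on this page, and the function names and class labels are hypothetical.

```python
def refine_grid(grid, factor):
    """Nearest-neighbour refinement: each cell becomes factor x factor cells."""
    refined = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row, copying it so the output rows are independent.
        refined.extend(list(fine_row) for _ in range(factor))
    return refined


def reclassify(grid, class_codes):
    """Map ordinal labels to numeric classes; None (NoData) stays None."""
    return [
        [None if value is None else class_codes[value] for value in row]
        for row in grid
    ]


# Hypothetical ordinal dataset at twice the target cell size.
coarse = [
    ["low", "high"],
    ["medium", None],
]
class_codes = {"low": 1, "medium": 2, "high": 3}
fine = reclassify(refine_grid(coarse, 2), class_codes)
```

The refined, reclassified grid can then be summarized per GLU, typically with MAJORITY for a classified variable.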

 

Table 2. Less Common Geoprocessing Methods

Code | Native Data Description | Procedure Summary
3 | Greater than 30 arcsec resolution, extrapolation needed** | Convert to 30 arcsec, extrapolate to cover the extent of LandScan 2007 using the ‘Focal Statistics’ tool, and process using the ‘Zonal Statistics as Table’ tool with the 30 arcsec GLU grid specified as zones. See instructions below for extrapolating and generating a coverage report.
6 | Sinusoidal | Convert the GLU grid to the sinusoidal spatial reference and projection at 15 arcsec resolution, process with the sinusoidal GLU grid, then convert back to 30 arcsec resolution. Process using the ‘Zonal Statistics as Table’ tool with the 30 arcsec GLU grid specified as zones.
7 | Distance | Merge polygon panels, convert to a 30 arcsec grid, and use the cost distance tool. Process using the ‘Zonal Statistics as Table’ tool with the 30 arcsec GLU grid specified as zones.
8 | Country Names | For GLU cells containing multiple countries, alphabetize the country names and combine them into a single name, separating countries with dashes. Done in MATLAB.
9 | Area calculations | Calculate true area in ArcMap using the Eckert IV coordinate system.
10 | DGG | Creation and calculation of statistics of GLUs from DGG grid cells.
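
The country-name rule in method 8 is simple enough to sketch in a few lines. The original work was done in MATLAB; the Python below only illustrates the same rule. The function name is hypothetical, and removing duplicate names before joining is an assumption.

```python
def combine_country_names(names):
    """Alphabetize country names and join them with dashes.

    Assumption: blank entries are dropped and duplicates are merged.
    """
    unique = {name.strip() for name in names if name and name.strip()}
    return "-".join(sorted(unique, key=str.casefold))


combined = combine_country_names(["Mexico", "Belize", "Guatemala", "Belize"])
```

Names that already contain a dash would make the combined string ambiguous, so a different separator may be worth considering for such cases.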

** Variables were extrapolated with a three step process:

  1. Reclassify all native data values as ‘1’ and all ‘No Data’ values as ‘2’. The result is referred to as the ‘position grid’.
  2. Use the ‘Focal Statistics’ tool to extrapolate the native data to cover the GLU extent. For continuous variables, the default extrapolation option is ‘mean’; for categorical variables, use ‘majority’.
  3. Specify the ‘Pick’ function in ‘Raster Calculator’ and use the ‘position grid’ to choose between the native and extrapolated datasets: cells with position 1 take the native value and cells with position 2 take the extrapolated value. Syntax is as follows: pick(‘position grid’,[‘native data’,‘extrapolated data’]).
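
The three steps above can be sketched in plain Python on a tiny grid. This is an illustration of the logic only, not the ArcGIS tools themselves: None stands in for ‘No Data’, the focal statistic is a mean over a square window that ignores ‘No Data’ cells, and the function names and the 3 x 3 window are assumptions.

```python
def position_grid(native):
    """Step 1: 1 where native data exists, 2 where it is NoData (None)."""
    return [[1 if value is not None else 2 for value in row] for row in native]


def focal_mean(grid, radius=1):
    """Step 2: mean of the non-NoData cells in a square window around each cell."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1))
                if grid[rr][cc] is not None
            ]
            out_row.append(sum(window) / len(window) if window else None)
        out.append(out_row)
    return out


def pick(position, rasters):
    """Step 3: position value p selects the cell from rasters[p - 1]."""
    return [
        [rasters[p - 1][r][c] for c, p in enumerate(row)]
        for r, row in enumerate(position)
    ]


# Hypothetical native grid with NoData in the two right-hand columns.
native = [
    [1.0, 3.0, None, None],
    [5.0, 7.0, None, None],
]
extrapolated = focal_mean(native)
result = pick(position_grid(native), [native, extrapolated])
```

In this example the focal mean also smooths cells that already hold native data, which is why the pick step is needed to restore the original values there. Cells farther from native data than the window radius remain ‘No Data’.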

Optional: generate a coverage report by:

  1. Reclassify all values of the extrapolated dataset as ‘0’ and ‘No Data’ as ‘1’. The result is referred to as the ‘extrapolated extent grid’.
  2. Using ‘Raster Calculator’, add the ‘position grid’ (from above) to the ‘extrapolated extent grid’. This produces a grid with 1s over the extent of the native data, 2s where the data were extrapolated, and 3s where neither native nor extrapolated data exist.
  3. Using a statistical software package (e.g., SPSS), calculate the percent of the dataset that is native data, extrapolated data, or neither.
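
The coverage report arithmetic can also be sketched in plain Python, as a stand-in for the Raster Calculator and statistical package steps. The function name and sample grids are hypothetical, None stands in for ‘No Data’, and the sketch assumes the extrapolated grid has data everywhere the native grid does.

```python
from collections import Counter


def coverage_report(native, extrapolated):
    """Percent of cells that are native, extrapolated, or neither.

    Assumes the extrapolated grid covers every cell that has native data.
    """
    labels = {1: "native", 2: "extrapolated", 3: "neither"}
    counts = Counter()
    for native_row, extrap_row in zip(native, extrapolated):
        for native_value, extrap_value in zip(native_row, extrap_row):
            position = 1 if native_value is not None else 2  # position grid
            extent = 0 if extrap_value is not None else 1    # extrapolated extent grid
            counts[labels[position + extent]] += 1
    total = sum(counts.values())
    return {label: 100.0 * counts[label] / total for label in labels.values()}


# Hypothetical grids: native data on the left, one extrapolated column.
native = [
    [1.0, 3.0, None, None],
    [5.0, 7.0, None, None],
]
extrapolated = [
    [1.0, 3.0, 5.0, None],
    [5.0, 7.0, 5.0, None],
]
report = coverage_report(native, extrapolated)
```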

Notes from initial GLOBE dataset preparation:

  • Each global variable was converted to 30 arcsec raster with an extent equal to LandScan 2007.
  • Global variables that had smaller extents than LandScan 2007 were first extrapolated in their native resolutions using the ‘Focal Statistics’ function before being converted to 30 arcsec resolution.

Computing the values of global variables within GLUs requires different geoprocessing methods depending on the native structure of the global dataset. Specifically, the optimal geoprocessing method depends on the form (vector or raster), type (continuous, nominal, ordinal), native spatial resolution (finer or coarser than 30 arc seconds, smaller or larger than 96 km² hexagons) and coordinate system of the global dataset to be processed.

© Copyright 2012 GLOBE. All Rights Reserved.