Skip to content

Latest commit

 

History

History

dods

OPeNDAP/GDAL Driver HOWTO
James Gallagher
12 December 2003

This document describes how to configure the OPeNDAP/GDAL driver. It
documents version 1.0 of the driver and uses version 3.4.1 of DODS/OPeNDAP.

0. Introduction

The OPeNDAP/GDAL driver enables GDAL to read from data sources served using
the OPeNDAP (aka DODS) software. In the rest of this HOWTO, I'll use the
names DODS and OPeNDAP interchangeably, however, there is a difference. DODS
(and its sister project NVODS) are oceanographic data systems; OPeNDAP is a
not-for-profit corporation formed to develop the discipline-neutral software
component used by those (and other) projects. 

You can find out about the DODS project at its web site:
http://www.unidata.ucar.edu/packages/dods/. And you can learn about OPeNDAP
at its web site: http://www.opendap.org/. Currently there is considerable
overlap between the two and OPeNDAP's software is being distributed at the
DODS web site.

The web sites contain on-line and printable documentation along with software
download pages, news and a list of data sources. In addition, you can search
for DODS data sources using the NASA Global Change Master Directory or Google
(look for DODS or GDS).

* Limitations of the current version of the driver

This driver is limited to data stored in Arrays or Grids (aka Rasters). In
the future we will support reading vector data using the OGR API.

DODS is a read-only data system, thus it is not possible to use this driver
to write data to a DODS/OPeNDAP server since that really has no meaning in
the context of the Data Access Protocol (DAP). You can read from the this
driver and use another driver (say geotiff) to write data to a local file.

This code is currently at version 1.0 (version.h holds the current version
number). It needs more testing.

---------------------------------------------------------------------------

1. Building GDAL with OPeNDAP support

To build the OPeNDAP driver, you will need the DODS software distribution
version 3.4.1. Later versions may also work. First build and install DODS.
Then run GDAL's configure script and provide it with the pathname to the root
of your DODS installation. To do this use the --with-dods-root=<pathname>
option. See 'configure --help' for more information. If you plan on using
EPSG numbers for Projection and Geographic coordinate systems, you'll also
need to be sure and build the geotiff software. I am fairly certain that
these are automatically built for newer versions of GDAL/OGR.

---------------------------------------------------------------------------

2. Enabling OPeNDAP data sources

To use the driver, pass the GDALDataset::Open() method a string which names
the DODS data source and the raster. To do this you set the pszFilename field
of the GDALOpenInfo structure to the string. While most drivers accept a
simple filename for this argument, the DODS driver takes a URL to the DODS
data source. This URL has some additional information added to it so the driver
knows which raster within the data source you would like to access and how
different dimensions of that raster are to be interpreted. Note that the
DODS/GDAL driver can open only one raster per call to the GdalDataset::Open()
method. If you want to open several rasters at once you will need to call the
driver several times.

* Syntax of the request string

To open a DODS data source using the driver you must append the name of the
Array or Grid variable to be accessed to the data source's DODS URL. In
addition, you must also supply some extra information about which dimensions
of the Raster correspond to Latitude and Longitude and, optionally, which are
bands. The syntax of the string is:

    <data source url> ? <raster name> <bracket expression>

The <data source url> is the URL of the DODS data source. The <raster name>
is the name of a raster (Array or Grid variable) held in the data source and
the <bracket expression> is used to describe how different dimensions are to
be interpreted by the driver. Here's a simple example:

    http://dods.gso.uri.edu/cgi-bin/nph-dods/DODS/pathfdr/1986/12/k86336175759.hdf?dsp_band_1[lat]lon]

There are many data sources in DODS that hold variables with three or four
dimensions. In cases where the GDAL driver is used to access these, it must
know which dimensions correspond to Latitude and Longitude. It must also know
if it should always use one specific 'slice' of a dimension or if a
dimension should be interpreted as making up a set of bands. This is the
function of the <bracket expression>. The full syntax is:

  ( [ <lat|lon|1-N|N> ] )+

Where '-', 'lat' and 'lon' are literal and N is a positive integer.

Here's an example for a three-dimensional Array:

    u[1-16][lat][lon]

(This variable can be found in th test data source at
http://dodsdev.gso.uri.edu/dods-test/nph-dods/data/nc/fnoc1.nc). The bracket
expression says that the first dimension holds sixteen bands (numbers 1 to
16, note that ones-based indexing is used to match GDAL's band numbering
scheme. The DAP uses zero-based indexing, so you'll need to take this into
account), the second dimension corresponds to Latitude and the third to
Longitude. If the variable 'u' had four dimensions, the expression would need
to specify which element of the fourth dimension should be used.

If a data source array has four or more dimensions, two *must* be labeled
'lat' and 'lon', one must be used to specify a group of bands and the
remaining dimensions must use the 'slice' notation to tell the driver which
single element of that dimension is to be used. For example, a hypothetical
four dimensional array called 'four' would need a bracket expression like:

    four[1][1-12][lat][lon]

or

    four_prime[3][3-9][lat][lon]

In the first example the driver is told to use only first element of
dimension one, that the second dimension is to be considered twelve bands and
the last tow dimensions are to be taken as latitude and longitude. In the
second example the driver is to use the third element of the first dimension
and the bands 3 through 9 of the second dimension.

When the driver opens a data source with a raster like 'four_prime' int he
above example, it will map the raster's bands 3 through 9 to GDAL bands 1
through 6.

It bears repeating: The band numbers and 'slice' in the 'bracket expression'
use ones-based indexing.

* Summary

To use the OPeNDAP/GDAL driver to read a variable from a data source, pass
the driver a URL with the name of the variable and the bracket expression.
Use a question mark to separate the variable name from the data source URL.
Here's an example using the sample data set at dodsdev:

     http://dodsdev.gso.uri.edu/dods-test/nph-dods/data/nc/fnoc1.nc?u[1-16][lat][lon]

* Adding geo-referencing information 

Most data sources available from DODS do not include the sort of
geo-referencing information that a GIS application expects or needs. However,
you can easily add this information to a data source using the AIS (Ancillary
Information Service) of DODS. The AIS is a new feature of DODS introduced in
beta form with the 3.4.0 version of the software. For this reason it has not
been fully documented in the DODS User Guide or other documentation. Here's a
short description of the system and how to configure it.

In the subdirectory example_conf.d there are DAS files and an AIS database
which configure the driver for the URI/GSO Pathfinder and GSFC MODIS
Aqua-Atmosphere MYD08_E3 (Level 3) data sources.

* How to configure the AIS for a data source

The AIS provides a way to augment the metadata for a data source. The service
uses text files which contain the extra metadata along with a (simple)
database that matches data sources to the files. The database is encoded in
XML and the files are DODS Dataset Attribute Structure (DAS) files. In the
following I'll explain how to write one of the DAS files and then how to
write the AIS database.

* Writing a DAS file for a particular data source or group of data sources

A DAS file is a simple text file. It contains a description of the attributes
for a data source. For general information about DAS objects and files, see
the DODS User's Guide available on-line at the DODS project web site. To add
geo-referencing information for a particular raster from a given data source,
add attributes which describe the Latitude and Longitude extent and
geographic projection for the raster. Here's the DAS file for the Pathfinder
data source held at the University of Rhode Island (URI):

    Attributes {
	dsp_band_1 {
	    Float64 Northernmost_Latitude 71.1722;
	    Float64 Southernmost_Latitude 4.8278;
	    Float64 Easternmost_Longitude -27.8897;
	    Float64 Westernmost_Longitude -112.11;
	    String ProjectionCS Equirectangular;
	    String GeographicCS WGS84;
	    Norm_Proj_Param {
		Float64 latitude_of_origin 0.0;
		Float64 central_meridian 0.0;
		Float64 false_easting 0.0;
		Float64 false_northing 0.0;
	    }
	}
    }

In this example the raster is named 'dsp_band_1' (the data source may be
found at: http://dods.gso.uri.edu/cgi-bin/nph-dods/DODS/pathfdr/
1986/12/k86336175759.hdf).

To describe the latitude and longitude bounding box, use the four attributes
"Northernmost_Latitude", "Southernmost_Latitude", "Easternmost_Longitude",
and "Westernmost_Longitude". 

To specify the projection information, use the attributes "ProjectionCS" and
"GeographicCS". The GeographicCS attribute is used to name the geographic
coordinate system. The value may be any string which can be passed to the
OGRSpatialReference::SetWellKnownGeogCS method. These include WGS84, WGS73,
NAD27, NAD83 and various EPSG numbers using the syntax 'EPSG:n' where 'n' is
the EPSG number. The ProjectionCS attribute is used to provide the projection
coordinate system. You can find a list of the projection coordinate systems
that GDAL supports at: http://remotesensing.org/geotiff/proj_list/. Finally,
the attribute container Norm_Proj_Param is used to provide the normalized
values of the parameters for the projection named by ProjectionCS. These
parameter names can be found at http://remotesensing.org/geotiff/proj_list/.
The parameter names should be the OGC WKT names.

For a nice tutorial on projections and information about GDAL's support for
specific ones, see:
http://gdal.velocet.ca/projects/opengis/ogrhtml/osr_tutorial.html.

Note that you must be careful about enclosing attribute names in double
quotes. In general, the quotes will be included in the value of a String
attribute. That means that a value of WSG84 is different from a value of
"WGS84" and the latter won't be recognized by GDAL.

It's often possible to figure out the bounding box and projection information
by examining the data source. You can do this using the OPeNDAP server's HTML
interface. This will provide you with a HTML form which lists all the
variables and their attributes for a given data source. You can query the
data source and get back ASCII data dumps, too. To access the HTML interface
fora given data source, append the suffix '.html' to the data source's URL.

* Adding a DAS to the AIS database

The AIS uses a simple flat-file database to match specific DAS files with
data sources. The file is easy to create with a text editor and it's easy to
specify that one DAS file should be used with a group of data sources. For
example, one DAS file is used with the thousands of images in the URI
Pathfinder data set.

Here's the text of the AIS data base:

    <?xml version="1.0" encoding="US-ASCII" standalone="no"?>
    <!DOCTYPE ais SYSTEM "http://www.opendap.org/ais/ais_database.dtd">

    <ais xmlns="http://xml.opendap.org/ais">
    <entry>
    <primary regexp="http://dods.gso.uri.edu/cgi-bin/nph-dods/DODS/pathfdr/.*\.hdf"/>
    <ancillary url="example_conf.d/pathfinder.hdf.das"/>
    </entry>

    </ais>

Each entry in the database uses one <entry> element. Inside each <entry>
there is a a <primary> element which holds information about a data source or
group of data sources. Use the "url" attribute to name a specific data source
or the "regex" attribute to name a group using a regular expression. Also in
an <entry> is a <ancillary> element which names the DAS file. Note that the
file name is given as the value of the <ancillary> element's "url" attribute.
It may be a local file name (as in this example) or it maybe a remote file
accessible using http.

Important: When you start adding you own DAS files, you should make the
filenames absolute paths, not relative. Because the pathnames are relative
they work only when running programs in the 'gdal/frmts/dods' directory (I
used relative paths because I cannot control where the code will be unpacked).

* Using a particular AIS database

To get the driver to use a particular database, add the pathname to the data
base in the .dodsrc file as the value of the parameter "AIS_DATABASE". In
the example_conf.d directory there is a sample .dodsrc file named "dodsrc"
which is confgiured to used the ais database in that directory (both the
dodsrc and ais_database.xml files are used by the test software). The .dodsrc
file can be kept the a users home directory or in a location named by the
DODS_CONF environment variable.

The dodsrc file canbe used to configure a number of other things about the
GDAL driver. The USE_CACHE parameter determines whether the driver will cache
responses. Turning caching on may improve performance dramatically. Setting
teh DEFLATE parameter to "1" will cause the driver to request that servers
compress responses before sending them, also improving performance.

* Data sources with many variables

Some data sources (e.g., some MODIS level 3) have many variables. It would be
tedious to write a DAS file for each variable in a data source when all of
the variables use the same projection. In this case the projection information
maybe included in a global attribute container named "opendap_org_gdal".
Here's an example for MODIS Level 3 data hosted at Goddard Space Flight
Center:

    Attributes {
	opendap_org_gdal {
	    Float64 Northernmost_Latitude 89.5;
	    Float64 Southernmost_Latitude -89.5;
	    Float64 Easternmost_Longitude 179.5;
	    Float64 Westernmost_Longitude -179.5;
	    String ProjectionCS Equirectangular;
	    String GeographicCS WGS84;
	    Norm_Proj_Param {
		Float64 latitude_of_origin 0.0;
		Float64 central_meridian 0.0;
		Float64 false_easting 0.0;
		Float64 false_northing 0.0;
	    }
	}
    }

Note that you can include both the global container and a variable-specific
one in the same DAS file. In that case the driver will use the variable
specific attributes preferentially over the global ones. This enables one DAS
file to include a default set of projection values and then override those
for a handful of variables.

Also note that since the DAS file can be accessed using HTTP, you can share
the files among several sites.

---------------------------------------------------------------------------

3. Testing the driver

Unit tests

The make target dodsdataset_test builds the unit tests which you can run as
"./dodsdataset_test". You'll need both DODS and CPPUNIT installed on your
computer for the unit tests to build and run. Be aware that some of the unit
tests take a while and may fail if some of the remote servers they use are
off line. 

Test application

The make target "using_dods" builds a simple application that uses the driver
to read and print from a DODS data source. You'll need DODS installed to
build this target (but you don't need CPPUNIT). To run the program pass it a
DODS data source URL with the variable name and bracket expression appended
as described above in Section 2. Here's an example using one of the DODS test
data sets:

     [jimg@comet dods]$ ./using_dods http://dodsdev.gso.uri.edu/dods-test/
     nph-dods/data/nc/fnoc1.nc?u[1-4][lat][lon]

This should print the following:

    Opening the dataset.
    Band Number: 1
    Block = 21x17
    Type = Int16
    Min = -2352, Max = 3694
    -146 -422 -1033 -1668 -2000 -2307 -2216 -1757 -1231 -294 861 1807 2561

followed by the remaining output (which is fairly long since it'll write out
data for the four bands)

Note that if you want to use the DODS/OPeNDAP HTML form interface of the data
source to check on these values, remember that the DAP uses zero-based
indexing. Be sure to adjust your band numbers accordingly!

You can also try using gdalinfo to open one of the rasters. Here's an example
using the pathfinder data (if you run this make sure to set the environment
variable DODS_CONF so that it points to the file 'dodsrc' in the subdirectory
'example_conf.d'. 



----------------------------------------------------------------------------

 $Id$