Documentation

§ Overview

Harmony provides access to services that can transform data from NASA's Earth Observing Systems Data and Information System (EOSDIS) Distributed Active Archive Centers (DAAC). Transformations can be requested using one of two Open Geospatial Consortium (OGC) inspired APIs.

Data processed by Harmony is staged in AWS S3 buckets owned by NASA or optionally in user owned S3 buckets. Harmony provides signed URLs or temporary access credentials to users for data staged in NASA buckets.

Data transformation requests are executed as jobs in Harmony. Harmony provides the ability for users to monitor and interact with long-running jobs, both programmatically through an API and via a web-based user interface.

This documentation covers the following:

§ Getting Started

All users will need an Earthdata Login (EDL) account in order to access NASA data and services. Once a user has an EDL username and password they will need to use these when accessing Harmony. They can be used directly in a browser request (the browser will prompt for them), in another client like curl, or in code.

With curl or in code, the easiest approach is to place your EDL credentials in a .netrc file.

A sample .netrc file looks like this:


machine urs.earthdata.nasa.gov login my-edl-user-name password my-edl-password


Example 2 - Sample .netrc file

Make sure that this file is only readable by the current user or you will receive an error stating "netrc access too permissive."

$ chmod 0600 ~/.netrc

Example 3 - Setting permissions on the .netrc file (Unix/macOS)
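
The same permission requirement can be checked from Python before a request is made. This is a minimal sketch, not part of Harmony: the helper name netrc_is_private is hypothetical, and it assumes a POSIX file system where group and other permission bits are meaningful.

```python
import os
import stat


def netrc_is_private(path):
    """Return True if the file grants no permissions to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks the group and other permission bits.
    return mode & 0o077 == 0
```

A call such as netrc_is_private(os.path.expanduser("~/.netrc")) returns False when the file would be rejected as too permissive.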

Alternatively, users can generate an EDL bearer token directly and pass it to Harmony using an Authorization: Bearer header.

§ Passing credentials with curl

Use the -n flag to use your .netrc file with curl. You will also need to pass the -L flag (to handle the redirect from Harmony to EDL and back) and the -b and -j flags to properly handle cookies used during the authentication.


curl -Lnbj https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset


Example 4 - Curl flags to handle EDL authentication when using a .netrc file

To work directly with a bearer token from EDL you can use an Authorization: Bearer my-bearer-token header as follows:


curl -H "Authorization: Bearer <my-bearer-token>" https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset


Example 5 - Using a bearer token with curl
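
The same header can be set from Python using only the standard library. The sketch below is illustrative: build_bearer_request is a hypothetical helper name, and the token is a placeholder that must be replaced with a real EDL token.

```python
from urllib import request


def build_bearer_request(url, token):
    """Create a urllib Request that carries an EDL bearer token."""
    req = request.Request(url)
    req.add_header("Authorization", f"Bearer {token}")
    return req


req = build_bearer_request(
    "https://harmony.earthdata.nasa.gov/jobs",
    "my-bearer-token",  # placeholder, not a real token
)
# request.urlopen(req) would send the request; it is not called here.
```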

§ Passing credentials in code

The following Python example uses the netrc, request, and cookiejar libraries to set up authentication with EDL. No error handling is included in this example.


import netrc
from urllib import request
from http.cookiejar import CookieJar

def setup_earthdata_login_auth(endpoint):
    """
    Set up the request library so that it authenticates against the given Earthdata Login
    endpoint and is able to track cookies between requests.  This uses the .netrc file.
    """
    username, _, password = netrc.netrc().authenticators(endpoint)
    manager = request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, endpoint, username, password)
    auth = request.HTTPBasicAuthHandler(manager)

    jar = CookieJar()
    processor = request.HTTPCookieProcessor(jar)
    opener = request.build_opener(auth, processor)
    request.install_opener(opener)

setup_earthdata_login_auth('urs.earthdata.nasa.gov')


Example 6 - Authenticating in Python

The username and password can also be set directly instead of using a .netrc file.
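
One possible way to do this is sketched below. It assumes the credentials are supplied through the environment variables EDL_USERNAME and EDL_PASSWORD; those variable names and the helper name build_edl_opener are illustrative choices, not a Harmony convention. As above, no error handling is included.

```python
import os
from http.cookiejar import CookieJar
from urllib import request


def build_edl_opener(username, password, endpoint="urs.earthdata.nasa.gov"):
    """Build an opener that sends basic auth to EDL and tracks cookies."""
    manager = request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, endpoint, username, password)
    auth = request.HTTPBasicAuthHandler(manager)
    processor = request.HTTPCookieProcessor(CookieJar())
    return request.build_opener(auth, processor)


# The fallback values are placeholders; real credentials come from the environment.
opener = build_edl_opener(
    os.environ.get("EDL_USERNAME", "my-edl-user-name"),
    os.environ.get("EDL_PASSWORD", "my-edl-password"),
)
request.install_opener(opener)
```

Returning the opener rather than installing it inside the function keeps the helper free of global side effects, so a caller can choose between opener.open(url) and a global install.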

There is significant boilerplate code involved in connecting to Harmony that can be avoided by using the harmony-py library. The equivalent code using harmony-py (when using a .netrc file) can be as simple as:


from harmony import Client

harmony_client = Client() # defaults to Harmony production endpoint


Example 7 - Using harmony-py to create a client with .netrc EDL authentication

harmony-py provides many other conveniences when using Harmony services. For these reasons harmony-py is the suggested way to access Harmony in code. For complete details see the documentation.




§ Summary of Available Endpoints

All of the public endpoints for Harmony users other than the OGC Coverages and WMS APIs are listed in the following table. The Coverages and WMS APIs are described in the next section.

| route | description |
|---|---|
| / | The Harmony landing page |
| /capabilities | Returns JSON detailing the Harmony capabilities for the provided collection |
| /cloud-access | Generates JSON with temporary credentials for accessing processed data in S3 |
| /cloud-access.sh | Generates shell scripts that can be run to access processed data in S3 |
| /docs | These documentation pages |
| /docs/api | The Swagger documentation for the OGC Coverages API |
| /jobs | The jobs API for getting job status, pausing/continuing/canceling jobs |
| /stac | The API for retrieving STAC catalogs and catalog items for processed data |
| /staging-bucket-policy | The policy generator for external (user) bucket storage |
| /versions | Returns JSON indicating what version (image tag) each deployed service is running |
| /workflow-ui | The Workflow UI for monitoring and interacting with running jobs |

Table 1 - Harmony routes other than OGC Coverages and WMS

The remaining routes are for launching services for collections using either OGC Coverages or WMS and are discussed in the next section.



§ Using the Service APIs

This section provides an introduction to the Harmony service APIs. For more details on the OGC Coverages API see the API Documentation.

Each API requires a CMR collection concept ID or short name, and transformations can be performed using one of the following endpoints ({collectionId} and {variable} are placeholders):


https://harmony.earthdata.nasa.gov/{collectionId}/ogc-api-coverages/1.0.0/collections/{variable}/coverage/rangeset


Example 8 - OGC Coverages endpoint


https://harmony.earthdata.nasa.gov/{collectionId}/wms


Example 9 - WMS endpoint

§ OGC Coverages Request Parameters

The primary Harmony services REST API conforms to the OGC Coverages API version 1.0.0. As such it accepts parameters in the URL path as well as query parameters.

§ URL Path Parameters
| parameter | description |
|---|---|
| collection | (required) This is the NASA EOSDIS collection or data product. There are two options for inputting a collection of interest: 1. Provide a concept ID, which is an ID provided in the Common Metadata Repository (CMR) metadata. 2. Use the data product short name, e.g. SENTINEL-1_INTERFEROGRAMS. Must be URL encoded. |
| variable | (required) Names of the UMM-Var variables to be retrieved, or "all" to retrieve all variables. Multiple variables may be retrieved by separating them with a comma. |

Table 2 - Harmony OGC Coverages API URL path (required) parameters
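
As an illustration of how the two path parameters fit together, the following sketch assembles a Coverages URL in Python. It follows the path layout of the sample curl requests in this document; the helper name coverages_url and the constant HARMONY_ROOT are hypothetical and not part of any Harmony library.

```python
from urllib.parse import quote

HARMONY_ROOT = "https://harmony.earthdata.nasa.gov"


def coverages_url(collection, variables):
    """Build an OGC Coverages rangeset URL for a collection and a list of variables."""
    # safe="" percent-encodes every reserved character, including "/".
    collection_part = quote(collection, safe="")
    # Multiple variables are joined with a literal comma.
    variable_part = ",".join(quote(v, safe="") for v in variables)
    return (
        f"{HARMONY_ROOT}/{collection_part}/ogc-api-coverages/1.0.0"
        f"/collections/{variable_part}/coverage/rangeset"
    )


url = coverages_url("C1234208438-POCLOUD", ["bathymetry"])
```

Passing ["all"] as the variable list requests every variable, as described in the table above.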



§ Query Parameters
| parameter | description |
|---|---|
| subset | get a subset of the coverage by slicing or trimming along one axis. Harmony supports the axes "lat" and "lon" for spatial subsetting, and "time" for temporal, regardless of the names of those axes in the data files. Harmony also supports arbitrary dimension names for subsetting on numeric ranges for that dimension. |
| outputCrs | reproject the output coverage to the given CRS. Recognizes CRS types that can be |
| interpolation | specify the interpolation method used during reprojection and scaling |
| scaleExtent | scale the resulting coverage along one axis to a given extent |
| scaleSize | scale the resulting coverage along one axis to a given size |
| concatenate | requests results to be concatenated into a single result |
| granuleId | the CMR Granule ID for the granule which should be retrieved |
| granuleName | passed to the CMR search as the readable_granule_name parameter. Supports * and ? wildcards for multiple and single character matches. Wildcards can be used any place in the name, but leading wildcards are discouraged as they require a lot of resources for the underlying search |
| grid | the name of the output grid to use for regridding requests. The name must match the UMM |
| point | only collections that have a geometry that contains a spatial point are selected. The spatial point is provided as two numbers: longitude (coordinate axis 1) and latitude (coordinate axis 2). The coordinate reference system of the values is WGS84 longitude/latitude. |
| width | number of columns to return in the output coverage |
| height | number of rows to return in the output coverage |
| forceAsync | if "true", override the default API behavior and always treat the request as asynchronous |
| format | the mime-type of the output format to return |
| maxResults | limits the number of input files processed in the request |
| skipPreview | if "true", override the default API behavior and never auto-pause jobs |
| ignoreErrors | if "true", continue processing a request to completion even if some items fail |
| destinationUrl | destination url specified by the client; currently only s3 link urls are supported (e.g. s3://my-bucket-name/mypath) and will result in the job being run asynchronously |

Table 3 - Harmony OGC Coverages API query parameters

For POST requests the body should be multipart/form-data and may also contain

A sample OGC Coverages request is as follows:


curl -Lnbj https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset?maxResults=1


Example 10 - Curl command for an OGC Coverages request
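
Query strings with repeated parameters, such as two subset values, can be assembled with the Python standard library. This is a sketch under the assumption that subset values take the form axis(min:max), as in the sample requests in this document; coverages_query is a hypothetical helper name.

```python
from urllib.parse import urlencode


def coverages_query(params):
    """Encode (name, value) pairs; repeated names such as subset keep their order."""
    # safe="()" leaves the parentheses of subset expressions unescaped.
    return urlencode(params, safe="()")


query = coverages_query([
    ("subset", "lon(-160:160)"),
    ("subset", "lat(-80:80)"),
    ("maxResults", 1),
])
# query == "subset=lon(-160%3A160)&subset=lat(-80%3A80)&maxResults=1"
```

A list of pairs is used instead of a dict because a dict cannot hold the same key twice.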

§ WMS Requests

Harmony provides an implementation of the OGC Web Map Service (WMS) API version 1.3.0. Harmony only supports the GetCapabilities and GetMap requests.

The API uses both URL path and query parameters.

§ URL Path Parameters
| parameter | required | description |
|---|---|---|
| collection | Y | this parameter is the same as the collection parameter described in the OGC Coverages API above. |

Table 4 - Harmony WMS API URL path (required) parameters

§ Common Query Parameters
| parameter | required | description |
|---|---|---|
| service | Y | the service for the request. Must be equal to 'WMS' |
| version | Y | the WMS version to use. Must be equal to '1.3.0' |
| request | Y | the action being requested. Valid values are GetCapabilities and GetMap |

Table 5 - Required query parameters for both GetCapabilities and GetMap

§ Query Parameters for GetMap - Standard WMS
| parameter | required | description |
|---|---|---|
| layers | Y | comma-separated list of layer names to display on map |
| bbox | Y | the bounding box for the map as comma separated values in WSEN order |
| crs | Y | Spatial Reference System for map output. Value is in form EPSG:nnn |
| format | Y | output format mime-type |
| styles | Y | Styles in which layers are to be rendered. Value is a comma-separated list of style names, or empty if default styling is required. Style names may be empty in the list, to use default layer styling. |
| width | Y | width in pixels of the output |
| height | Y | height in pixels of the output |
| bgcolor | N | Background color for the map image. Value is in the form RRGGBB. Default is FFFFFF (white). |
| exceptions | N | Format in which to report exceptions. Default value is application/vnd.ogc.se_xml |
| transparent | N | whether the output background should be transparent (true or false). Default is false |

Table 6 - Standard WMS query parameters for GetMap

§ Additional Harmony parameters for WMS requests
| parameter | required | description |
|---|---|---|
| dpi | N | the dots-per-inch (DPI) resolution for image output |
| map_resolution | N | the DPI resolution for image output |
| granuleId | N | the CMR Granule ID for the granule of interest |
| granuleName | N | passed to the CMR search as the readable_granule_name parameter. Supports * and ? wildcards for multiple and single character matches. Wildcards can be used any place in the name, but leading wildcards are discouraged as they require a lot of resources for the underlying search |

Table 7 - Additional (non-OGC) query parameters for Harmony WMS queries

GetCapabilities requests return an XML document, while GetMap requests return an image.
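
The required GetMap parameters can be combined into a request URL as sketched below. The helper name wms_getmap_url is hypothetical, and the default values for crs, image_format and styles are assumptions chosen for illustration; valid layer names depend on the collection and are not specified here.

```python
from urllib.parse import quote, urlencode

HARMONY_ROOT = "https://harmony.earthdata.nasa.gov"


def wms_getmap_url(collection, layers, bbox, width, height,
                   crs="EPSG:4326", image_format="image/png", styles=""):
    """Build a WMS GetMap URL from the required query parameters."""
    west, south, east, north = bbox  # WSEN order
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": ",".join(layers),
        "bbox": f"{west},{south},{east},{north}",
        "crs": crs,
        "format": image_format,
        "styles": styles,  # empty string requests default styling
        "width": width,
        "height": height,
    }
    return f"{HARMONY_ROOT}/{quote(collection, safe='')}/wms?{urlencode(params)}"
```

For example, wms_getmap_url("C1234208438-POCLOUD", ["example-layer"], (-180, -90, 180, 90), 512, 256) yields a URL for a 512 by 256 pixel global map, where "example-layer" is a placeholder layer name.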



§ Available Services

Harmony requests are declarative rather than imperative: a request specifies the data of interest, time range, spatial bounds, desired output format, etc. Harmony then matches this declaration against the available services and invokes the matching services on behalf of the user. In other words, users do not request specific services directly. Even so, it can be useful to know which services are available, what their capabilities are, and which services can work together.

§ Service Versions

Harmony services run in containers in pods in a Kubernetes cluster. It is not possible for users to interact directly with these pods, but it may be useful to know some of the details about the running versions. The specific docker image and tag for each service can be retrieved from the versions route.

§ Service Capabilities

The following tables provide an overview of the deployed services with a description of each and what capabilities they provide.

Name: gesdisc/giovanni
Description:

A service that composes the Giovanni URL and invokes the Giovanni service to produce an output file, so that users can visualize, analyze, and access vast amounts of Earth science remote sensing data without having to download the data.

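Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| Y | N | Y | N | DEFAULT | N | text/csv |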

Name: podaac/l2-subsetter
Description:

Implementation of the L2 swath subsetter based on python, xarray and netcdf4.

  • Works with Trajectory (1D) and Along track/across track data.
  • Works with NetCDF and HDF5 input files
  • Variable subsetting supported
  • Works with hierarchical groups

Outputs netcdf4.

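Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| Y | Y | Y | N | N | N | application/netcdf, application/x-netcdf4 |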

Name: podaac/concise
Description:

Service capable of "concatenating" multiple netCDF files into a single netCDF file. The resulting file has an extra dimension with size equal to the number of input files, where each slice in that dimension corresponds to the data from one of the inputs.

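Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| N | N | N | N | Y | N | application/netcdf, application/x-netcdf4 |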

Name: podaac/l2-subsetter-concise
Description:

Chained service of the PODAAC L2 swath subsetter and PODAAC concise services.

PODAAC L2 swath subsetter
  • Works with Trajectory (1D) and Along track/across track data.
  • Works with NetCDF and HDF5 input files
  • Variable subsetting supported
  • Works with hierarchical groups

Outputs netcdf4.

PODAAC concise service

Service capable of "concatenating" multiple netCDF files into a single netCDF file. The resulting file has an extra dimension with size equal to the number of input files, where each slice in that dimension corresponds to the data from one of the inputs.

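Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| Y | N | Y | N | Y | N | application/netcdf, application/x-netcdf4 |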

Name: sds/trajectory-subsetter
Description:

A service that supports L2 segmented trajectory (not swath) data. Currently uses the same C++ binary as is on-premises in SDPS, to offer variable, temporal, bounding box spatial and polygon spatial subsetting. This subsetter also ensures valid segment indices and sizes following transformation.

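Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| Y | Y | Y | N | N | N | application/x-hdf |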

Name: sds/HOSS-geographic
Description:

A service that currently supports L3/L4 geographically gridded collections, offering variable, temporal, named dimension, bounding box spatial and shape file spatial subsetting.

Accesses NetCDF-4 files hosted in Hyrax cloud service (OPeNDAP), and retrieves the requested list of variables, plus all those that are required to support meaningful downstream operations upon those data (e.g. associated coordinate variables).

This service is currently operated from the same Docker image as the sds/variable-subsetter service.

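Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| Y | Y | Y | N | N | N | application/netcdf, application/x-netcdf4 |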

Name: sds/HOSS-projection-gridded
Description:

A service that currently supports L3/L4 projection-gridded collections, offering variable, temporal, named dimension, bounding box spatial and shape file spatial subsetting.

Accesses NetCDF-4 files hosted in Hyrax cloud service (OPeNDAP), and retrieves the requested list of variables, plus all those that are required to support meaningful downstream operations upon those data (e.g. associated coordinate variables).

This service is currently operated from the same Docker image as the sds/variable-subsetter service.

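Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| Y | Y | Y | N | N | N | application/netcdf, application/x-netcdf4 |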

Name: nasa/harmony-gdal-adapter
Description:

Service translating Harmony operations to GDAL commands. Supports spatial bounding box, temporal, variable, and shapefile subsetting, reprojection, and output to NetCDF4, COG, PNG, and GIF. Operates on input file types supported by GDAL (most EOSDIS data). Most operations assume L3 data, though it is likely that some work on L2.

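Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| Y | Y | Y | Y | N | N | application/x-netcdf4, image/tiff, image/png, image/gif |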

Name: harmony/netcdf-to-zarr
Description:

Converts NetCDF4 files to the Zarr format as faithfully as possible, preserving metadata. The service attempts to optimize chunking in both time and space via a heuristic algorithm in the output Zarr store.

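Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| N | N | N | N | Y | N | application/x-zarr |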

Name: harmony/service-example
Description:

Reference service that can be used to perform most operations supported by Harmony. Useful for testing new API features end-to-end on example data but not meant for production use.

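Capabilities
| subsetting: bounding box | subsetting: shape | subsetting: variable | subsetting: multiple variable | concatenation | reprojection | output formats |
|---|---|---|---|---|---|---|
| Y | N | Y | Y | N | N | image/tiff, image/png, image/gif |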


§ Monitoring Jobs with the Jobs API and the Workflow-UI

Jobs can be monitored using the jobs API as well as with the Workflow-UI web application.

§ Getting the list of jobs for a user

curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs


Example 11 - Getting the user's list of jobs using the jobs API

§ Getting job status

curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>


Example 12 - Getting job status

§ Pausing a job

curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>/pause


Example 13 - Pausing a running job

§ Resuming a paused job

curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>/resume


Example 14 - Resuming a paused job

§ Canceling a job

curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>/cancel


Example 15 - Canceling a running job

Jobs involving many granules will by default pause automatically after the first few granules are processed. This allows the user to examine the output to make sure things look right before waiting for the entire job to complete. If things look good, the user can resume the paused job; if not, they can cancel it.

If a user wishes to skip this step they can pass the skipPreview flag mentioned in the Services API section, or they can tell an already running job to skip the preview using the following:

§ Skipping preview

curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>/skip-preview


Example 16 - Skipping the preview on a many-granule job
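
The job routes shown in this section follow a single pattern, which the sketch below captures as a small URL builder. The names job_url, JOBS_ROOT and JOB_ACTIONS are hypothetical; the helper only builds URLs and leaves authentication and the request itself to the caller.

```python
JOBS_ROOT = "https://harmony.earthdata.nasa.gov/jobs"
JOB_ACTIONS = {"pause", "resume", "cancel", "skip-preview"}


def job_url(job_id=None, action=None):
    """Return the URL for the job list, one job's status, or an action on a job."""
    if job_id is None:
        if action is not None:
            raise ValueError("an action requires a job ID")
        return JOBS_ROOT
    if action is None:
        return f"{JOBS_ROOT}/{job_id}"
    if action not in JOB_ACTIONS:
        raise ValueError(f"unknown job action: {action}")
    return f"{JOBS_ROOT}/{job_id}/{action}"
```

Restricting action to the four routes listed above catches typos before a request is sent.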



§ User Owned Buckets for Harmony Output

Users may store Harmony output directly in their own S3 buckets by specifying the bucket/path in their requests with the destinationUrl parameter. For example


https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset?concatenate=true&subset=lon(-160%3A160)&subset=lat(-80%3A80)&skipPreview=true&maxResults=1&destinationUrl=s3%3A%2F%2Fmy-example-bucket


Example 17 - Request to store output in user owned S3 bucket

would place the output in s3://my-example-bucket. Note that the value of destinationUrl must be a full S3 path and must be URL encoded.
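
The URL encoding of destinationUrl can be done with the Python standard library, as in this minimal sketch. The helper name encode_destination_url is hypothetical.

```python
from urllib.parse import quote


def encode_destination_url(s3_url):
    """Percent-encode a full S3 URL for use as the destinationUrl query value."""
    if not s3_url.startswith("s3://"):
        raise ValueError("destinationUrl must be a full S3 URL such as s3://bucket/path")
    # safe="" also encodes ":" and "/", which quote() would otherwise leave partly intact.
    return quote(s3_url, safe="")


encoded = encode_destination_url("s3://my-example-bucket")
# encoded == "s3%3A%2F%2Fmy-example-bucket"
```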

Four things are required to enable Harmony to write to your bucket.

  1. The bucket must be in the same AWS region as Harmony, i.e., us-west-2.
  2. The bucket must have ACLs disabled. This is the default for S3 buckets.
  3. Harmony must have permission to write to the bucket.
  4. Harmony must have permission to check the bucket's location.

Items three and four on the list can be accomplished by setting an appropriate bucket policy. You can obtain a bucket policy for your bucket using the policy generator at https://harmony.earthdata.nasa.gov/staging-bucket-policy and passing in the bucketPath query parameter. For example:

https://harmony.earthdata.nasa.gov/staging-bucket-policy?bucketPath=my-example-bucket

The bucketPath parameter can be one of the following

  1. A bucket name, e.g., my-example-bucket
  2. A bucket name + path, e.g., my-example-bucket/my/path
  3. A full S3 url with or without a path, e.g., s3://my-example-bucket/my/path

The third option is compatible with the destinationUrl parameter for requests.

A sample policy generated by the endpoint is shown below:


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "write permission",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bucket/*"
    },
    {
      "Sid": "get bucket location permission",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "s3:GetBucketLocation",
      "Resource": "arn:aws:s3:::my-bucket"
    }
  ]
}


Example 18 - Sample bucket policy to enable writing Harmony output
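
Before applying a policy, it can be useful to confirm that it grants the two permissions Harmony needs. The following is a rough sketch, not a full policy evaluator: the helper name missing_actions is hypothetical, the input must be valid JSON (double-quoted keys and strings), and the check ignores principals, resources and wildcard actions such as s3:*.

```python
import json

REQUIRED_ACTIONS = {"s3:PutObject", "s3:GetBucketLocation"}


def missing_actions(policy_text):
    """Return the required actions that no Allow statement in the policy grants."""
    policy = json.loads(policy_text)
    granted = set()
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        # Action may be a single string or a list of strings.
        if isinstance(actions, str):
            actions = [actions]
        granted.update(actions)
    return REQUIRED_ACTIONS - granted
```

An empty result means both the write permission and the bucket location permission appear in the policy.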