Harmony provides access to services that can transform data from NASA's Earth Observing System Data and Information System (EOSDIS) Distributed Active Archive Centers (DAACs). Transformations can be requested using one of two Open Geospatial Consortium (OGC)-inspired APIs.
Data processed by Harmony is staged in AWS S3 buckets owned by NASA or optionally in user owned S3 buckets. Harmony provides signed URLs or temporary access credentials to users for data staged in NASA buckets.
Data transformation requests are executed as jobs in Harmony. Harmony provides the ability for users to monitor and interact with long-running jobs, both programmatically through an API and via a web-based user interface.
This documentation covers the following:
All users will need an Earthdata Login (EDL) account in order to access NASA data and services. Once a user has an EDL username and password they will need to use these when accessing Harmony. They can be used directly in a browser request (the browser will prompt for them), in another client like curl, or in code.
For curl or in code, the easiest approach is to place your EDL credentials in a .netrc file.
A sample .netrc file looks like this:
machine urs.earthdata.nasa.gov login my-edl-user-name password my-edl-password
Example 2 - Sample .netrc file
Make sure that this file is only readable by the current user or you will receive an error stating "netrc access too permissive."
$ chmod 0600 ~/.netrc
Example 3 - Setting permissions on the .netrc file (Unix/macOS)
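For code that reads the file itself, the permission requirement and the lookup can be checked up front. The following is a minimal sketch, not part of Harmony: the helper name read_edl_credentials is hypothetical, and it runs against a throwaway file containing the sample entry rather than your real ~/.netrc.

```python
import netrc
import os
import stat
import tempfile

# Placeholder credentials, matching the sample .netrc entry shown above.
SAMPLE = "machine urs.earthdata.nasa.gov login my-edl-user-name password my-edl-password\n"


def read_edl_credentials(path, host="urs.earthdata.nasa.gov"):
    """Return (login, password) for host from the .netrc-style file at path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Refuse files that group or other users can access, mirroring chmod 0600.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} is too permissive: {oct(mode)}")
    entry = netrc.netrc(path).authenticators(host)
    if entry is None:
        raise KeyError(f"no entry for {host} in {path}")
    login, _account, password = entry
    return login, password


# Demonstrate on a temporary file so the real ~/.netrc is never touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc")
    with open(demo_path, "w") as handle:
        handle.write(SAMPLE)
    os.chmod(demo_path, 0o600)
    demo_login, demo_password = read_edl_credentials(demo_path)
```

In real use the path would be os.path.expanduser("~/.netrc").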
Alternatively, users can generate an EDL bearer token directly and pass it to Harmony using an Authorization: Bearer header.
Use the -n flag to use your .netrc file with curl. You will also need to pass the -L flag (to handle the redirect from Harmony to EDL and back) and the -b and -j flags to properly handle cookies used during the authentication.
curl -Lnbj https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset
Example 4 - Curl flags to handle EDL authentication when using a .netrc file
To work directly with a bearer token from EDL you can use an Authorization: Bearer my-bearer-token header as follows:
curl -H "Authorization: Bearer <my-bearer-token>" https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset
Example 5 - Using a bearer token with curl
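The same header can be attached from Python using only the standard library. This is a small sketch rather than a recommended client: build_bearer_request is a hypothetical helper name, the token is a placeholder, and the request is only constructed, not sent.

```python
from urllib import request

# The request URL used in the curl example above.
COVERAGE_URL = (
    "https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/"
    "ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset"
)


def build_bearer_request(url, token):
    """Return a urllib Request carrying the EDL token in an Authorization header."""
    return request.Request(url, headers={"Authorization": f"Bearer {token}"})


# "my-bearer-token" is a placeholder; substitute a token generated in EDL.
bearer_request = build_bearer_request(COVERAGE_URL, "my-bearer-token")
# To send it: request.urlopen(bearer_request)  (omitted so nothing is fetched here)
```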
The following Python example uses the netrc, request, and cookiejar libraries to set up authentication with EDL.
No error handling is included in this example.
import netrc
from urllib import request, parse
from http.cookiejar import CookieJar

def setup_earthdata_login_auth(endpoint):
    """
    Set up the request library so that it authenticates against the given Earthdata Login
    endpoint and is able to track cookies between requests. This uses the .netrc file.
    """
    # Look up the credentials for this host in the user's .netrc file
    username, _, password = netrc.netrc().authenticators(endpoint)
    manager = request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, endpoint, username, password)
    auth = request.HTTPBasicAuthHandler(manager)

    # Keep cookies between requests so the EDL redirect flow can complete
    jar = CookieJar()
    processor = request.HTTPCookieProcessor(jar)
    opener = request.build_opener(auth, processor)
    request.install_opener(opener)

setup_earthdata_login_auth('urs.earthdata.nasa.gov')
Example 6 - Authenticating in Python
The username and password can also be set directly instead of using a .netrc file.
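As a rough illustration of setting the credentials directly, the sketch below passes them in as arguments instead of reading .netrc. The function name build_edl_opener is hypothetical and the credentials are placeholders; in practice they might come from environment variables or a secrets store rather than source code.

```python
from http.cookiejar import CookieJar
from urllib import request


def build_edl_opener(username, password, endpoint="urs.earthdata.nasa.gov"):
    """Return an opener that sends basic auth to EDL and keeps cookies."""
    manager = request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, endpoint, username, password)
    auth = request.HTTPBasicAuthHandler(manager)
    # A cookie jar is needed so the session survives the EDL redirects.
    processor = request.HTTPCookieProcessor(CookieJar())
    return request.build_opener(auth, processor)


# Placeholder credentials; nothing is sent over the network at this point.
edl_opener = build_edl_opener("my-edl-user-name", "my-edl-password")
# To make it the default for urllib: request.install_opener(edl_opener)
```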
There is significant boilerplate code involved in connecting to Harmony that can be avoided by using the harmony-py library. The equivalent code using harmony-py (when using a .netrc file) can be as simple as:
from harmony import Client
harmony_client = Client() # defaults to Harmony production endpoint
Example 7 - Using harmony-py to create a client with .netrc EDL authentication
harmony-py provides many other conveniences when using Harmony services. For these reasons harmony-py is the suggested way to access Harmony in code. For complete details see the documentation.
All of the public endpoints for Harmony users other than the OGC Coverages and WMS APIs are listed in the following table. The Coverages and WMS APIs are described in the next section.
route | description |
---|---|
/ | The Harmony landing page |
/capabilities | Returns JSON detailing the Harmony capabilities for the provided collection |
/cloud-access | Generates JSON with temporary credentials for accessing processed data in S3 |
/cloud-access.sh | Generates shell scripts that can be run to access processed data in S3 |
/docs | These documentation pages |
/docs/api | The Swagger documentation for the OGC Coverages API |
/jobs | The jobs API for getting job status, pausing/continuing/canceling jobs |
/stac | The API for retrieving STAC catalogs and catalog items for processed data |
/staging-bucket-policy | The policy generator for external (user) bucket storage |
/versions | Returns JSON indicating what version (image tag) each deployed service is running |
/workflow-ui | The Workflow UI for monitoring and interacting with running jobs |
Table 1 - Harmony routes other than OGC Coverages and WMS
The remaining routes are for launching services for collections using either OGC Coverages or WMS and are discussed in the next section.
This section provides an introduction to the Harmony service APIs. For more details on the OGC Coverages API see the API Documentation.
Each API requires a CMR collection concept ID or short name, and transformations can be performed using one of the following endpoints ({collectionId} and {variable} are placeholders):
https://harmony.earthdata.nasa.gov/{collectionId}/ogc-api-coverages/1.0.0/collections/{variable}/coverage/rangeset
Example 8 - OGC Coverages endpoint
https://harmony.earthdata.nasa.gov/{collectionId}/wms
Example 9 - WMS endpoint
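A small sketch of how these endpoint URLs could be assembled in Python. The helper names coverages_url and wms_url are hypothetical; the path layout, including the collections segment, follows the curl examples in this documentation, and each piece is percent-encoded because short names must be URL encoded.

```python
from urllib.parse import quote

HARMONY_ROOT = "https://harmony.earthdata.nasa.gov"


def coverages_url(collection, variables, root=HARMONY_ROOT):
    """Build an OGC Coverages rangeset URL for a collection and variable list."""
    # Percent-encode each piece; multiple variables are joined with a comma.
    encoded_collection = quote(collection, safe="")
    encoded_variables = ",".join(quote(name, safe="") for name in variables)
    return (
        f"{root}/{encoded_collection}/ogc-api-coverages/1.0.0/"
        f"collections/{encoded_variables}/coverage/rangeset"
    )


def wms_url(collection, root=HARMONY_ROOT):
    """Build the WMS endpoint URL for a collection."""
    encoded_collection = quote(collection, safe="")
    return f"{root}/{encoded_collection}/wms"


example_coverages_url = coverages_url("C1234208438-POCLOUD", ["bathymetry"])
example_wms_url = wms_url("C1234208438-POCLOUD")
```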
The primary Harmony services REST API conforms to the OGC Coverages API version 1.0.0. As such it accepts parameters in the URL path as well as query parameters.
parameter | description |
---|---|
collection | (required) This is the NASA EOSDIS collection or data product. There are two options for inputting a collection of interest: 1. Provide a concept ID, which is an ID provided in the Common Metadata Repository (CMR) metadata 2. Use the data product short name, e.g. SENTINEL-1_INTERFEROGRAMS. Must be URL encoded. |
variable | (required) Names of the UMM-Var variables to be retrieved, or "all" to retrieve all variables. Multiple variables may be retrieved by separating them with a comma. |
Table 2 - Harmony OGC Coverages API URL path (required) parameters
parameter | description |
---|---|
subset | get a subset of the coverage by slicing or trimming along one axis. Harmony supports the axes "lat" and "lon" for spatial subsetting, and "time" for temporal, regardless of the names of those axes in the data files. Harmony also supports arbitrary dimension names for subsetting on numeric ranges for that dimension. |
outputCrs | reproject the output coverage to the given CRS. Recognizes CRS types that can be |
interpolation | specify the interpolation method used during reprojection and scaling |
scaleExtent | scale the resulting coverage along one axis to a given extent |
scaleSize | scale the resulting coverage along one axis to a given size |
concatenate | requests results to be concatenated into a single result |
granuleId | the CMR Granule ID for the granule which should be retrieved |
granuleName | passed to the CMR search as the readable_granule_name parameter. Supports * and ? wildcards for multiple and single character matches. Wildcards can be used any place in the name, but leading wildcards are discouraged as they require a lot of resources for the underlying search |
grid | the name of the output grid to use for regridding requests. The name must match the UMM |
point | only collections that have a geometry that contains a spatial point are selected. The spatial point is provided as two numbers: longitude (coordinate axis 1) and latitude (coordinate axis 2). The coordinate reference system of the values is WGS84 longitude/latitude. |
width | number of columns to return in the output coverage |
height | number of rows to return in the output coverage |
forceAsync | if "true", override the default API behavior and always treat the request as asynchronous |
format | the mime-type of the output format to return |
maxResults | limits the number of input files processed in the request |
skipPreview | if "true", override the default API behavior and never auto-pause jobs |
ignoreErrors | if "true", continue processing a request to completion even if some items fail |
destinationUrl | destination url specified by the client; currently only s3 link urls are supported (e.g. s3://my-bucket-name/mypath) and will result in the job being run asynchronously |
Table 3 - Harmony OGC Coverages API query parameters
For POST requests the body should be multipart/form-data and may also contain
shape: perform a shapefile subsetting request on a supported collection by passing the path to a GeoJSON file (*.json or .geojson), an ESRI Shapefile (.zip or .shz), or a kml file (.kml) as the "shape" parameter
A sample OGC Coverages request is as follows:
curl -Lnbj https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset?maxResults=1
Example 10 - Curl command for an OGC Coverages request
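Query parameters from Table 3 can be appended in code as well. The sketch below uses only the standard library; with_query is a hypothetical helper name, and the parameter values are illustrative. Passing a list of pairs allows subset to appear once per axis, and urlencode handles the percent-encoding.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = (
    "https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/"
    "ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset"
)


def with_query(base_url, params):
    """Append percent-encoded query parameters to a Harmony request URL."""
    # A list of (name, value) pairs lets a parameter such as "subset" repeat.
    return f"{base_url}?{urlencode(params)}"


request_url = with_query(
    BASE_URL,
    [
        ("subset", "lat(-80:80)"),
        ("subset", "lon(-160:160)"),
        ("maxResults", 1),
        ("skipPreview", "true"),
    ],
)

# Decode the query again to confirm the values round-trip unchanged.
decoded_query = parse_qs(urlsplit(request_url).query)
```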
Harmony provides an implementation of the OGC Web Map Service (WMS) API version 1.3.0. Harmony only supports the GetCapabilities and GetMap requests.
The API uses both URL path and query parameters.
parameter | required | description |
---|---|---|
collection | Y | this parameter is the same as the collection parameter described in the OGC Coverages API above. |
Table 4 - Harmony WMS API URL path (required) parameters
parameter | required | description |
---|---|---|
service | Y | the service for the request. Must be equal to 'WMS' |
version | Y | the WMS version to use. Must be equal to '1.3.0' |
request | Y | the action being requested. Valid values are GetCapabilities and GetMap |
Table 5 - Required query parameters for both GetCapabilities and GetMap
parameter | required | description |
---|---|---|
layers | Y | comma-separated list of layer names to display on map |
bbox | Y | the bounding box for the map as comma separated values in WSEN order |
crs | Y | Spatial Reference System for map output. Value is in form EPSG:nnn |
format | Y | output format mime-type |
styles | Y | Styles in which layers are to be rendered. Value is a comma-separated list of style names, or empty if default styling is required. Style names may be empty in the list, to use default layer styling. |
width | Y | width in pixels of the output |
height | Y | height in pixels of the output |
bgcolor | N | Background color for the map image. Value is in the form RRGGBB. Default is FFFFFF (white). |
exceptions | N | Format in which to report exceptions. Default value is application/vnd.ogc.se_xml |
transparent | N | whether the output background should be transparent (true or false). Default is false |
Table 6 - Standard WMS query parameters for GetMap
parameter | required | description |
---|---|---|
dpi | N | the dots-per-inch (DPI) resolution for image output |
map_resolution | N | the DPI resolution for image output |
granuleId | N | the CMR Granule ID for the granule of interest |
granuleName | N | passed to the CMR search as the readable_granule_name parameter. Supports * and ? wildcards for multiple and single character matches. Wildcards can be used any place in the name, but leading wildcards are discouraged as they require a lot of resources for the underlying search |
Table 7 - Additional (non-OGC) query parameters for Harmony WMS queries
GetCapabilities requests return an XML document, while GetMap requests return an image.
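To illustrate how the required parameters in Tables 5 and 6 combine, here is a sketch that assembles a GetMap URL. The helper name wms_getmap_url is hypothetical, and the layer name, CRS, bounding box, and image size in the usage lines are made-up illustrative values, not ones taken from a real collection.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def wms_getmap_url(collection, layers, bbox, crs, width, height,
                   image_format="image/png", styles=""):
    """Build a WMS 1.3.0 GetMap URL with the required query parameters."""
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": ",".join(layers),
        # Bounding box as comma-separated values in WSEN order.
        "bbox": ",".join(str(value) for value in bbox),
        "crs": crs,
        "format": image_format,
        # An empty styles value asks for default styling.
        "styles": styles,
        "width": width,
        "height": height,
    }
    return f"https://harmony.earthdata.nasa.gov/{collection}/wms?{urlencode(params)}"


# "my-layer" is a hypothetical layer name used only for illustration.
getmap_url = wms_getmap_url(
    "C1234208438-POCLOUD", ["my-layer"], (-180, -90, 180, 90), "EPSG:4326", 512, 256
)
getmap_query = parse_qs(urlsplit(getmap_url).query, keep_blank_values=True)
```

A GetCapabilities URL needs only the service, version, and request parameters.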
Harmony requests are declarative rather than imperative, so a request specifies the particular data of interest, time range of interest, spatial bounds of interest, desired output format, etc. Harmony then matches this declaration against available services and invokes the matching services on behalf of the user. In other words, the user does not request specific services directly. Despite this, it can be useful for a user to know what services are available, what their capabilities are, and which services can work together.
Harmony services run in containers in pods in a Kubernetes cluster. It is not possible for users to interact directly with these pods, but it may be useful to know some of the details about the running versions. The specific docker image and tag for each service can be retrieved from the versions route.
The following tables provide an overview of the deployed services with a description of each and what capabilities they provide.
Name: gesdisc/giovanni
Description:
A service to compose the Giovanni URL and invoke Giovanni service to produce output file to visualize, analyze, and access vast amounts of Earth science remote sensing data without having to download the data.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
Y | N | Y | N | DEFAULT | N | text/csv |
Name: podaac/l2-subsetter
Description:
Implementation of the L2 swath subsetter based on python, xarray and netcdf4.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
Y | Y | Y | N | N | N | application/netcdf,application/x-netcdf4 |
Name: podaac/concise
Description:
Service capable of "concatenating" multiple netCDF files into a single netCDF file. The resulting file has an extra dimension with size equal to the number of input files where each slice in that dimension corresponds to the data from one of the inputs.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
N | N | N | N | Y | N | application/netcdf,application/x-netcdf4 |
Name: podaac/l2-subsetter-concise
Description:
Service capable of "concatenating" multiple netCDF files into a single netCDF file. The resulting file has an extra dimension with size equal to the number of input files where each slice in that dimension corresponds to the data from one of the inputs.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
Y | N | Y | N | Y | N | application/netcdf,application/x-netcdf4 |
Name: sds/trajectory-subsetter
Description:
A service that supports L2 segmented trajectory (not swath) data. Currently uses the same C++ binary as is on-premises in SDPS, to offer variable, temporal, bounding box spatial and polygon spatial subsetting. This subsetter also ensures valid segment indices and sizes following transformation.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
Y | Y | Y | N | N | N | application/x-hdf |
Name: sds/HOSS-geographic
Description:
A service that currently supports L3/L4 geographically gridded collections, offering variable, temporal, named dimension, bounding box spatial and shape file spatial subsetting.
Accesses NetCDF-4 files hosted in Hyrax cloud service (OPeNDAP), and retrieves the requested list of variables, plus all those that are required to support meaningful downstream operations upon those data (e.g. associated coordinate variables).
This service is currently operated from the same Docker image as the sds/variable-subsetter service.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
Y | Y | Y | N | N | N | application/netcdf,application/x-netcdf4 |
Name: sds/HOSS-projection-gridded
Description:
A service that currently supports L3/L4 projection-gridded collections, offering variable, temporal, named dimension, bounding box spatial and shape file spatial subsetting.
Accesses NetCDF-4 files hosted in Hyrax cloud service (OPeNDAP), and retrieves the requested list of variables, plus all those that are required to support meaningful downstream operations upon those data (e.g. associated coordinate variables).
This service is currently operated from the same Docker image as the sds/variable-subsetter service.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
Y | Y | Y | N | N | N | application/netcdf,application/x-netcdf4 |
Name: nasa/harmony-gdal-adapter
Description:
Service translating Harmony operations to GDAL commands. Supports spatial bounding box, temporal, variable, and shapefile subsetting, reprojection, and output to NetCDF4, COG, PNG, and GIF. Operates on input file types supported by GDAL (most EOSDIS data). Most operations assume L3 data, though it is likely that some work on L2.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
Y | Y | Y | Y | N | N | application/x-netcdf4,image/tiff,image/png,image/gif |
Name: harmony/netcdf-to-zarr
Description:
Converts NetCDF4 files to the Zarr format as faithfully as possible, preserving metadata. The service attempts to optimize chunking in both time and space via a heuristic algorithm in the output Zarr store.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
N | N | N | N | Y | N | application/x-zarr |
Name: harmony/service-example
Description:
Reference service that can be used to perform most operations supported by Harmony. Useful for testing new API features end-to-end on example data but not meant for production use.
Capabilities
bounding box subsetting | shape subsetting | variable subsetting | multiple variable subsetting | concatenation | reprojection | output formats |
---|---|---|---|---|---|---|
Y | N | Y | Y | N | N | image/tiff,image/png,image/gif |
Jobs can be monitored using the jobs API as well as with the Workflow-UI web application.
curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs
Example 11 - Getting the user's list of jobs using the jobs API
curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>
Example 12 - Getting job status
curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>/pause
Example 13 - Pausing a running job
curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>/resume
Example 14 - Resuming a paused job
curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>/cancel
Example 15 - Canceling a running job
Jobs involving many granules will by default pause automatically after the first few granules are processed. This allows the user to examine the output to make sure things look right before waiting for the entire job to complete. If things look good, the user can resume the paused job; if not, they can cancel it.
If a user wishes to skip this step they can pass the skipPreview flag mentioned in the Services API section, or they can tell an already running job to skip the preview using the following:
curl -Ln -bj https://harmony.earthdata.nasa.gov/jobs/<job-id>/skip-preview
Example 16 - Skipping the preview on a many-granule job
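The job routes used in the curl examples share one pattern, which a client might capture in a small helper. This is only a sketch: job_url and JOB_ACTIONS are hypothetical names, "my-job-id" is a placeholder, and the helper builds URLs without contacting Harmony.

```python
JOBS_ROOT = "https://harmony.earthdata.nasa.gov/jobs"

# The job actions shown in the curl examples above.
JOB_ACTIONS = ("pause", "resume", "cancel", "skip-preview")


def job_url(job_id=None, action=None):
    """Return the jobs listing URL, a job status URL, or a job action URL."""
    if job_id is None:
        if action is not None:
            raise ValueError("an action requires a job ID")
        return JOBS_ROOT
    if action is None:
        return f"{JOBS_ROOT}/{job_id}"
    if action not in JOB_ACTIONS:
        raise ValueError(f"unknown job action: {action}")
    return f"{JOBS_ROOT}/{job_id}/{action}"


# "my-job-id" stands in for the ID of a real job.
listing_url = job_url()
status_url = job_url("my-job-id")
pause_url = job_url("my-job-id", "pause")
skip_preview_url = job_url("my-job-id", "skip-preview")
```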
Users may store Harmony output directly in their own S3 buckets by specifying the bucket/path in their requests with the destinationUrl parameter. For example
https://harmony.earthdata.nasa.gov/C1234208438-POCLOUD/ogc-api-coverages/1.0.0/collections/bathymetry/coverage/rangeset?concatenate=true&subset=lon(-160%3A160)&subset=lat(-80%3A80)&skipPreview=true&maxResults=1&destinationUrl=s3%3A%2F%2Fmy-staging-bucket
Example 17 - Request to store output in user owned S3 bucket
would place the output in s3://my-staging-bucket. Note that the value of destinationUrl must be a full S3 path and must be URL encoded.
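As a quick illustration of the encoding requirement, the sketch below percent-encodes a full S3 path with the standard library. The helper name encode_destination is hypothetical and the bucket name is a placeholder.

```python
from urllib.parse import quote, unquote


def encode_destination(s3_url):
    """Percent-encode a full s3:// path for use as the destinationUrl value."""
    if not s3_url.startswith("s3://"):
        raise ValueError("destinationUrl must be a full S3 path starting with s3://")
    # safe="" makes quote() encode ":" and "/" as well.
    return quote(s3_url, safe="")


encoded_destination = encode_destination("s3://my-staging-bucket")
print(encoded_destination)  # prints s3%3A%2F%2Fmy-staging-bucket
```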
Four things are required to enable Harmony to write to your bucket.
us-west-2
Numbers three and four on the list can be accomplished by setting an appropriate bucket policy.
You can obtain a bucket policy for your bucket using the policy generator at https://harmony.earthdata.nasa.gov/staging-bucket-policy and passing in the bucketPath query parameter. For example
https://harmony.earthdata.nasa.gov/staging-bucket-policy?bucketPath=my-example-bucket
The bucketPath parameter can be one of the following:
my-example-bucket
my-example-bucket/my/path
s3://my-example-bucket/my/path
The third option is compatible with the destinationUrl parameter for requests.
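A sketch of building the policy generator URL for each accepted bucketPath form. The helper name bucket_policy_url is hypothetical; percent-encoding the value is an assumption made here so that the s3:// form survives as a query parameter, and the bucket names are the placeholders used above.

```python
from urllib.parse import urlencode

POLICY_ROOT = "https://harmony.earthdata.nasa.gov/staging-bucket-policy"


def bucket_policy_url(bucket_path):
    """Build the policy generator URL for a bucket name, bucket/path, or s3:// path."""
    # urlencode percent-encodes ":" and "/" in the bucketPath value.
    return f"{POLICY_ROOT}?{urlencode({'bucketPath': bucket_path})}"


policy_urls = [
    bucket_policy_url(path)
    for path in (
        "my-example-bucket",
        "my-example-bucket/my/path",
        "s3://my-example-bucket/my/path",
    )
]
```

Fetching one of these URLs requires the same EDL authentication as other Harmony routes.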
A sample policy generated by the endpoint is shown below:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "write permission",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bucket/*"
    },
    {
      "Sid": "get bucket location permission",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "s3:GetBucketLocation",
      "Resource": "arn:aws:s3:::my-bucket"
    }
  ]
}
Example 18 - Sample bucket policy to enable writing Harmony output