Asynchronous Data Validation and Transformation Web Service

Latest revision as of 02:26, 27 March 2021


Overview

The Asynchronous Data Load web service consumes a submitted file for Validation. On receipt of the file, the service returns a UID (token), which can be used to track the validation process, and perform further actions such as transformation to another format.

Asynchronous Data Load

Entry Point /ws/public/data/load
Access Public (default). Configurable to Private
Http Method POST
Accepts CSV, XLSX, SDMX-ML, SDMX-EDI (any format for which there is a Data Reader)
Compression Zip files are supported; if loading from a URL, gzip responses are supported
Content-Type

1. multipart/form-data (if attaching a file) – the attached file must be supplied in a form field named uploadFile

2. application/text or application/xml (if submitting data in the body of the POST)

Response Format JSON
Response Statuses

200 - Data file received

400 - Transformation could not be performed (either an unreadable dataset, or an unresolvable reference to a required structure)

401 - Unauthorized (if access has been restricted)

500 - Server Error

Data Load Response

The response to a data load is a token, which can be used in subsequent calls to track the data load and validation process and, once validation is complete, the token can be used to perform actions such as a publish, obtain validation report, export in a different format, or export with mapping.

{
  "Success" : true,
   "uid"    : "unique token"
}
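As a minimal sketch, the token can be pulled out of the load response before it is used in later calls. The helper name is hypothetical; the field names follow the JSON shape shown above:

```python
import json

def parse_load_response(body: str) -> str:
    """Extract the tracking token (uid) from a data-load response.

    Hypothetical helper: assumes the {"Success": ..., "uid": ...}
    shape documented above.
    """
    payload = json.loads(body)
    if not payload.get("Success"):
        raise RuntimeError("data load was not accepted")
    return payload["uid"]

# Example using the response shape shown above
token = parse_load_response('{"Success": true, "uid": "unique token"}')
print(token)
```

The returned token is then passed as the uid query parameter in the tracking and export calls below.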

Request Load Status

Entry Point /ws/public/data/loadStatus
Access Public (default). Configurable to Private
Http Method GET
Response Format JSON

Query Parameters

uid Unique identifier for the loaded dataset, returned from the data load operation


Load Status Response

The response is a validation report, to either indicate validation success or validation with errors.

The report Status indicates how far into the validation process the server has reached. The following table shows the various stages:

Status Description
Initialising Initial status
Analysing The dataset is being analysed for series and observation counts, and which DSDs it references
Validating The dataset is being validated
Complete The dataset validation process has finished; there may or may not be errors
Consolidating The dataset is being consolidated; duplicate series and observations are being merged into one final dataset
IncorrectDSD The dataset references a DSD that can not be used to validate the data
InvalidRef The dataset references a DSD/Dataflow/Provision that does not exist in the Registry
MissingDSD The dataset does not reference any structure, so the system can not read the dataset
Error The dataset can not be read at all
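Since the load is asynchronous, a client typically polls the Load Status service until one of the terminal statuses in the table above is reached. A sketch of that loop follows; the fetch_status callable is injected (in practice it would GET /ws/public/data/loadStatus?uid=...) so the loop itself can be tested offline, and the helper name is hypothetical:

```python
import time

# Statuses after which polling can stop (taken from the table above)
TERMINAL_STATUSES = {"Complete", "IncorrectDSD", "InvalidRef",
                     "MissingDSD", "Error"}

def poll_load_status(fetch_status, uid, interval_seconds=2, max_polls=30):
    """Poll until the validation report reaches a terminal status.

    fetch_status(uid) should return the parsed JSON validation report;
    the report's "Status" field is compared against the table above.
    Returns the final report, or raises TimeoutError if the dataset
    never reaches a terminal status within max_polls attempts.
    """
    for _ in range(max_polls):
        report = fetch_status(uid)
        if report.get("Status") in TERMINAL_STATUSES:
            return report
        time.sleep(interval_seconds)
    raise TimeoutError("validation did not reach a terminal status in time")
```

Note that the 15-minute inactivity window mentioned in the Overview applies here: each poll counts as activity on the file, keeping it in the server's cache.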

Data export/transform

Entry Point /ws/public/data/download
Access Public (default). Configurable to Private
Http Method GET
Response Format Determined by Accept Header

Query Parameters

Query Parameter Format Description
uid string Unique identifier for the loaded dataset, returned from the data load operation
datasetIndex integer If multiple datasets are in the data file, identifies which one to export (zero-indexed)
map string URN of the Dataflow or Data Structure to map to; there must be a Structure Map which describes the mapping from the source to the target
zip boolean (default is false) Zips the response
includeMetrics boolean (default is false) Include Metrics in the response see Data Transformation
unmapped boolean (default is false) If the map parameter is supplied, and some series or observation can not be mapped, the unmapped data will be included in either the zip file (if zip is true) or in a multipart-form boundary (if zip is false)
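A download URL can be assembled from the parameters above; the sketch below only includes non-default parameters, and the host name is an assumption for illustration:

```python
from urllib.parse import urlencode

BASE = "https://registry.example.org"  # hypothetical host

def build_download_url(uid, dataset_index=None, map_urn=None,
                       zip_output=False, include_metrics=False,
                       unmapped=False):
    """Assemble a /ws/public/data/download URL from the query
    parameters documented above. Defaults are omitted from the URL."""
    params = {"uid": uid}
    if dataset_index is not None:
        params["datasetIndex"] = dataset_index
    if map_urn is not None:
        params["map"] = map_urn
    if zip_output:
        params["zip"] = "true"
    if include_metrics:
        params["includeMetrics"] = "true"
    if unmapped:
        params["unmapped"] = "true"
    return f"{BASE}/ws/public/data/download?{urlencode(params)}"
```

The response format of the actual GET request is then driven by the Accept header, as noted in the table above.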

Revalidate

A Re-validation service is provided if there is more information that can be provided to the dataset(s) loaded against the token. The reason to revalidate is if the dataset is to be attached to a different Dataflow or Provision Agreement. When this link changes, it may impact the validation results due to the application of different Constraints or Validation Schemes. The underlying dataset will be updated to refer to this different Dataflow/Provision Agreement, which will be reflected in the exported dataset if the data format supports this information (for example SDMX-ML contains the linked structure in the Header section of the dataset).

Revalidation will be faster than the initial load and validate process because there is not as much work to do. The dataset is already consolidated, and has already had parts of it validated that will not change even if linked to a different Dataflow. The revalidation service only revalidates parts of the dataset that will change due to a different structure link, for example it would revalidate against constraints if these are different, and validation schemes (mathematical validation rules) if there are different rules based on changing the linked structure for the dataset.

Revalidation is an asynchronous action, whose progress can be tracked using the same token and tracking web services provided.

Entry Point /ws/public/data/revalidate
Access Public (default). Configurable to Private
Http Method POST
Response Format JSON
Accepts JSON
Content-Type application/json

Response Statuses

200 - request received and being processed

400 - request could not be performed (possibly due to bad syntax of JSON POST)

401 - Unauthorized (if access has been restricted)

500 - Server Error

Post Body

The POST request must contain the token of the dataset to revalidate; this is the same token that was provided by the server on data load. The SRef is an array of URNs, one for each Dataset that was present in the data file loaded to the server. Each URN refers to which structure to use to validate the dataset against. The URN can be to either a Data Structure Definition, Dataflow, or Provision Agreement.

{
  "UID"  : "datasetdetailsuid",
  "SRef" : ["urn1", "urn2", "urn3"]
}
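A sketch of building that body follows (the helper name is hypothetical; the UID and SRef field names are taken from the JSON shape shown above):

```python
import json

def build_revalidate_body(uid, structure_urns):
    """Build the POST body for /ws/public/data/revalidate.

    structure_urns holds one URN per dataset in the originally loaded
    file, each pointing at the DSD, Dataflow, or Provision Agreement
    to revalidate that dataset against.
    """
    return json.dumps({"UID": uid, "SRef": list(structure_urns)})

body = build_revalidate_body("datasetdetailsuid", ["urn1", "urn2", "urn3"])
```

The body would be POSTed with Content-Type application/json, per the table above.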

Revalidate Response

The response is a JSON message indicating success. The validation progress should be tracked using the same token that was used in the re-validation request, against the Load Status web service.

{
  "Success" : true
}

Data Publish

Entry Point /ws/public/data/publish
Access Public (default). Configurable to Private
Http Method POST
Accepts JSON
Response Format JSON
Response Statuses

200 - Publish request accepted

400 - Bad request

401 - Unauthorized (if access has been restricted)

500 - Server Error


Publishes the dataset loaded against the uid. The expected JSON is as follows:

{
  "UID"          : "datasetdetailsuid",
  "Action"       : "Append|Replace|Delete",
  "DeleteAction" : "DEFAULT|OBSERVATIONS|SERIES",
  "Dataset"      : int
}

If the Action is Delete, there are three options for DeleteAction. DEFAULT uses the SDMX delete rules; OBSERVATIONS will delete all the observations in the loaded dataset from the database; SERIES will delete all the series in the dataset from the database (including all the observations that belong to those series).
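The publish body and its Action/DeleteAction rules can be sketched as follows; the helper is hypothetical, and the allowed values are those documented above:

```python
import json

VALID_ACTIONS = {"Append", "Replace", "Delete"}
VALID_DELETE_ACTIONS = {"DEFAULT", "OBSERVATIONS", "SERIES"}

def build_publish_body(uid, action, delete_action="DEFAULT", dataset_index=0):
    """Build the POST body for /ws/public/data/publish.

    Validates Action (and, for deletes, DeleteAction) against the
    values documented above. DeleteAction is only meaningful when the
    Action is Delete, so it is only included in that case.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown Action: {action}")
    body = {"UID": uid, "Action": action, "Dataset": dataset_index}
    if action == "Delete":
        if delete_action not in VALID_DELETE_ACTIONS:
            raise ValueError(f"unknown DeleteAction: {delete_action}")
        body["DeleteAction"] = delete_action
    return json.dumps(body)
```

For example, a series-level delete of the first dataset loaded against the token would use build_publish_body(uid, "Delete", "SERIES", 0).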