Difference between revisions of "Asynchronous Data Validation and Transformation Web Service"

From FMR Knowledge Base
Revision as of 01:18, 29 April 2020


Overview

The Asynchronous Data Load web service accepts a submitted file for validation. On receipt of the file, the service returns a UID (token) that can be used to track the validation process and, once validation has completed, to perform further actions such as transforming the data to another format or publishing it to a database.

The Fusion Registry stores the data on the instance that received it, so in a load-balanced system all subsequent requests must be sent to the same server. If there is no activity on the file for 15 minutes, the Registry automatically removes the file from its cache.

Asynchronous Data Load

Entry Point /ws/public/data/load
Access Public (default). Configurable to Private
Http Method POST
Accepts CSV, XLSX, SDMX-ML, SDMX-EDI (any format for which there is a Data Reader)
Compression Zip files supported; gzip responses are supported when loading from a URL
Content-Type

1. multipart/form-data (if attaching a file): the attached file must be in the field named uploadFile

2. application/text or application/xml (if submitting data in the body of the POST)

Response Format JSON
Response Statuses

200 - Data file received

400 - Transformation could not be performed (either an unreadable dataset, or an unresolvable reference to a required structure)

401 - Unauthorized (if access has been restricted)

500 - Server Error

Data Load Response

The response to a data load is a token, which can be used in subsequent calls to track the data load and validation process. Once validation is complete, the token can be used to publish the data, obtain the validation report, export in a different format, or export with mapping.

{
  "Success" : true,
  "uid"     : "unique token"
}
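
As an illustration of the load call, the following is a minimal Python sketch that submits data in the body of the POST and reads the token from the response. It uses only the standard library. The base URL is a placeholder, and the helper names build_load_request and extract_uid are hypothetical, not part of the Registry. It also assumes that a rejected load is signalled by a false Success value, which the response shown above does not confirm.

```python
import json
import urllib.request

LOAD_PATH = "/ws/public/data/load"  # entry point from the table above


def build_load_request(base_url: str, payload: bytes,
                       content_type: str = "application/xml") -> urllib.request.Request:
    """Build (but do not send) a POST that submits data in the request body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + LOAD_PATH,
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )


def extract_uid(response_body: str) -> str:
    """Return the token from a data load response, or raise if the load was refused."""
    doc = json.loads(response_body)
    # Assumption: a falsy "Success" value means the file was not accepted.
    if not doc.get("Success"):
        raise ValueError("data load was not accepted: " + response_body)
    return doc["uid"]
```

The request can be sent with urllib.request.urlopen(request) and the decoded response body passed to extract_uid. In a load-balanced deployment the later calls that use the token would need to reach the same server.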

Request Load Status

Entry Point /ws/public/data/loadStatus
Access Public (default). Configurable to Private
Http Method GET
Response Format JSON

Query Parameters

uid Unique identifier for the loaded dataset, returned from the data load operation


Load Status Response

The response is a validation report, which indicates either that validation succeeded or that validation completed with errors.

The report Status indicates how far the server has progressed through the validation process. The following table shows the possible statuses:

Status Description
Initialising Initial status
Analysing The dataset is being analysed for series and observation counts, and for which DSDs it references
Validating The dataset is being validated
Complete The dataset validation process has finished; there may or may not be errors
Consolidating The dataset is being consolidated; duplicate series and observations are being merged into one final dataset
IncorrectDSD The dataset references a DSD that cannot be used to validate the data
InvalidRef The dataset references a DSD/Dataflow/Provision that does not exist in the Registry
MissingDSD The dataset does not reference any structure, so the system cannot read the dataset
Error The dataset cannot be read at all
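
The status values above lend themselves to a polling loop. The sketch below is one way to write it in Python, not the Registry's own client. The function wait_for_validation and its fetch_status argument are hypothetical. fetch_status stands for any callable that performs the GET on /ws/public/data/loadStatus with the uid and returns the Status string. It is left to the caller because the exact layout of the JSON report is not shown here. Treating Complete and the four error statuses as the points at which polling should stop is an assumption.

```python
import time
from typing import Callable

# Statuses taken from the table above. Treating these as the points where
# no further progress will be made is an assumption of this sketch.
FINISHED = {"Complete"}
FAILED = {"IncorrectDSD", "InvalidRef", "MissingDSD", "Error"}


def wait_for_validation(fetch_status: Callable[[str], str], uid: str,
                        interval_s: float = 2.0, timeout_s: float = 600.0,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the load reaches a finished or failed status and return it."""
    waited = 0.0
    while True:
        status = fetch_status(uid)
        if status in FINISHED or status in FAILED:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"load {uid} still at '{status}' after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real delays. A caller would typically continue to export or publish only when the returned status is Complete.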

Data export/Transform

Entry Point /ws/public/data/download
Access Public (default). Configurable to Private
Http Method GET
Response Format Determined by Accept Header

Query Parameters

Query Parameter Format Description
uid string Unique identifier for the loaded dataset, returned from the data load operation
datasetIndex integer If the data file contains multiple datasets, identifies which one to export (zero-indexed)
map string URN of the Dataflow or DataStructure to map to; there must be a Structure Map which describes the mapping from the source to the target
zip boolean (default is false) Zips the response
includeMetrics boolean (default is false) Includes metrics in the response, see Data Transformation
unmapped boolean (default is false) If the map parameter is supplied and some series or observations cannot be mapped, the unmapped data will be included in either the zip file (if zip is true) or in a multipart form boundary (if zip is false)
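
The query parameters above can be assembled as in the following Python sketch. The helper build_download_request is hypothetical, and the base URL and Accept value in the usage note are placeholders, since the media types the Registry accepts are not listed here. Encoding the boolean flags as the literal string "true" is also an assumption.

```python
import urllib.parse
import urllib.request
from typing import Optional

DOWNLOAD_PATH = "/ws/public/data/download"  # entry point from the table above


def build_download_request(base_url: str, uid: str, accept: str,
                           dataset_index: Optional[int] = None,
                           map_urn: Optional[str] = None,
                           zip_response: bool = False,
                           include_metrics: bool = False,
                           unmapped: bool = False) -> urllib.request.Request:
    """Build (but do not send) a GET for the export/transform entry point."""
    params = {"uid": uid}
    if dataset_index is not None:
        params["datasetIndex"] = str(dataset_index)
    if map_urn is not None:
        params["map"] = map_urn
    # The flags default to false, so only the ones switched on are sent.
    # Assumption: the server reads the literal "true" as a boolean.
    for name, flag in (("zip", zip_response),
                       ("includeMetrics", include_metrics),
                       ("unmapped", unmapped)):
        if flag:
            params[name] = "true"
    url = base_url.rstrip("/") + DOWNLOAD_PATH + "?" + urllib.parse.urlencode(params)
    # The response format is selected by the Accept header.
    return urllib.request.Request(url, headers={"Accept": accept}, method="GET")
```

For example, build_download_request("https://registry.example.org", uid, "text/csv", dataset_index=0) would request the first dataset in the file, with "text/csv" standing in for whichever media type the target format requires.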

Data Publish

Entry Point /ws/public/data/publish
Access Public (default). Configurable to Private
Http Method POST
Accepts JSON
Response Format JSON
Response Statuses

200 - Publish request accepted

400 - Bad request

401 - Unauthorized (if access has been restricted)

500 - Server Error


Publishes the dataset loaded against the uid. The expected JSON is as follows:

{
  "UID" : "datasetdetailsuid",
  "Action" : "Append|Replace|Delete",
  "DeleteAction" : "DEFAULT|OBSERVATIONS|SERIES",
  "Dataset" : int
}

If the Action is Delete, there are three options for DeleteAction. DEFAULT applies the SDMX delete rules, OBSERVATIONS deletes all the observations in the loaded dataset from the database, and SERIES deletes all the series in the dataset from the database (including all the observations that belong to those series).
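
A publish body can be assembled and checked before it is sent, as in the Python sketch below. The helper build_publish_body is hypothetical. The sketch makes three assumptions that are not stated above: that Dataset is the zero-based index of the dataset within the loaded file, that DeleteAction may be omitted when the Action is not Delete, and that DEFAULT is a suitable value when a Delete is requested without an explicit DeleteAction.

```python
import json
from typing import Optional

ACTIONS = ("Append", "Replace", "Delete")
DELETE_ACTIONS = ("DEFAULT", "OBSERVATIONS", "SERIES")


def build_publish_body(uid: str, action: str, dataset_index: int = 0,
                       delete_action: Optional[str] = None) -> str:
    """Return the JSON text to POST to /ws/public/data/publish."""
    if action not in ACTIONS:
        raise ValueError(f"Action must be one of {ACTIONS}, got {action!r}")
    # Assumption: Dataset is the zero-based index of the dataset in the file.
    body = {"UID": uid, "Action": action, "Dataset": dataset_index}
    if action == "Delete":
        # Assumption: fall back to DEFAULT (the SDMX delete rules) if unset.
        chosen = delete_action or "DEFAULT"
        if chosen not in DELETE_ACTIONS:
            raise ValueError(f"DeleteAction must be one of {DELETE_ACTIONS}, got {chosen!r}")
        body["DeleteAction"] = chosen
    elif delete_action is not None:
        raise ValueError("DeleteAction only applies when Action is Delete")
    return json.dumps(body)
```

Rejecting an unknown Action or DeleteAction on the client side avoids a round trip that would otherwise end in a 400 response.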