DRAFT Data Process Service

From FMR Knowledge Base
Revision as of 03:03, 2 February 2021

Overview

The Data Process Service is a web service hosted by Fusion Registry. It accepts a dataset (multiple formats are supported) and a process instruction that tells Fusion Registry how to process the data, for example to validate, map, transform, or export it.

[Figure: FR Process Example.jpg]

Workflow

The Data Process Service has the following workflow:

  1. Load Data - Data is submitted using HTTP POST. The server returns a token that is used to perform further actions on the data
  2. Run Process - The process to execute is sent to the server, along with the token of the dataset to apply the process to. The server returns a token to track the process
  3. Track Progress - A GET request tracks the progress of a running process, using the process's unique token as supplied by the server
  4. Export Data - If the final stage of a Process is to store the data in a temporary store, then it can be exported in any data format supported by Fusion Registry

Loaded data is stored temporarily: if there is no activity on the data within 2 minutes of it being loaded, it is evicted from the system.
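Step 3 of the workflow implies a polling loop on the client side. The following is a minimal sketch of one, not part of the service itself. It assumes the caller supplies a hypothetical `fetch_status` callable that performs the GET request and returns the decoded JSON report, and it relies only on the `Status` codes documented in the Track Process response (0=running, 1=success, 2=error). The polling interval and the limit on the number of polls are arbitrary illustrative values.

```python
import time


def wait_for_process(fetch_status, poll_seconds=1.0, max_polls=60, sleep=time.sleep):
    """Poll until the process is no longer running, then return the last report.

    fetch_status is a caller-supplied (hypothetical) function that returns the
    decoded JSON status report as a dict. Status 0 means "running".
    """
    for _ in range(max_polls):
        report = fetch_status()
        if report["Status"] != 0:
            # 1=success or 2=error: either way the process has stopped.
            return report
        sleep(poll_seconds)
    raise TimeoutError("process still running after max_polls checks")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays, for example by supplying a function that does nothing.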

Load Data

Entry Point /ws/secure/dataprocess/load
Access Restricted
Http Method POST
Accepts Any Supported Data Format
Compression Zip files supported
Content-Type

1. multipart/form-data (if attaching a file) – the attached file must be in the field named uploadFile

2. application/text or application/xml (if submitting data in the body of the POST)

Response Format JSON
Response Statuses

200 - Data file received

400 - If no dataset is provided

401 - Unauthorized (if access has been restricted)

500 - Server Error

Response

 {
   "token" : "uid123"
 }
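As an illustration of the Load Data call, the sketch below builds (but does not send) a POST request that submits data in the request body, and reads the token from a response of the shape shown above. The function names and the base URL are hypothetical. Authentication is omitted because this page does not describe how restricted access is granted.

```python
import json
import urllib.request

LOAD_PATH = "/ws/secure/dataprocess/load"


def build_load_request(base_url, data, content_type="application/xml"):
    """Build a POST request carrying the dataset bytes in the body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + LOAD_PATH,
        data=data,
        method="POST",
        headers={"Content-Type": content_type},
    )


def parse_token(response_body):
    """Extract the token from a JSON response such as {"token": "uid123"}."""
    return json.loads(response_body)["token"]
```

The request could then be sent with `urllib.request.urlopen(request)` once the credentials required by the server have been added.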


Run Process

Entry Point /ws/secure/dataprocess/run/{token}
Access Restricted
Http Method POST
Accepts Process Instruction (JSON format)
Content-Type

1. multipart/form-data (if attaching a file) – the attached file must be in the field named uploadFile

2. application/json (if submitting data in the body of the POST)

Response Format JSON
Response Statuses

200 - Data file received

400 - If the token does not match a known dataset

401 - Unauthorized (if access has been restricted)

500 - Server Error

Response

 {
   "token" : "uid123"
 }
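A Run Process request could be assembled along the following lines. This is a sketch under assumptions: the function name and base URL are hypothetical, the process instruction is sent as application/json in the body, and authentication is left out because it is not described here.

```python
import json
import urllib.parse
import urllib.request


def build_run_request(base_url, token, instruction):
    """Build a POST request that applies a process instruction to a loaded dataset.

    token is the value returned by the Load Data call; instruction is a dict
    that will be serialised to JSON.
    """
    safe_token = urllib.parse.quote(token, safe="")
    url = f"{base_url.rstrip('/')}/ws/secure/dataprocess/run/{safe_token}"
    return urllib.request.Request(
        url=url,
        data=json.dumps(instruction).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```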

Process Instruction

The process instruction chains together one or more Process Steps. Each Process Step has an identifier (StepId) and links to a task (for example VALIDATE). The supported Properties of a Process Step are specific to the Process being run. A typical example, for processes which output a dataset, is the Output property, which defines which Process Step to pass the output to.

{
  "Structure" : "urn",    //Provision, Dataflow, or DSD URN used to read the data, only required if the dataset does not contain this, or if overriding the Dataset
  "ProcessSteps" : [
    {
      "ProcessId"  : "VALIDATE", //Id of a registered process
      "StepId"     : "STEP1",    //Unique Id for the step
      "Metrics"    : true,       //true to capture metrics for the process which are output in the track progress report
      "Properties" : {}          //optional map of properties specific to the process being run
    }
  ]
}
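The comments in the sample above are annotations and would need to be removed before the instruction is sent as JSON. One way to avoid hand-writing the document is to build it in code. The helper names below are hypothetical, and the uniqueness check simply mirrors the note that StepId is a unique Id for the step. It is not a statement about how the server validates instructions.

```python
import json


def process_step(process_id, step_id, metrics=False, properties=None):
    """Return one Process Step using the field names from the sample above."""
    return {
        "ProcessId": process_id,
        "StepId": step_id,
        "Metrics": metrics,
        "Properties": properties or {},
    }


def process_instruction(steps, structure=None):
    """Combine steps into a process instruction, rejecting duplicate StepIds."""
    step_ids = [step["StepId"] for step in steps]
    if len(step_ids) != len(set(step_ids)):
        raise ValueError("StepId values must be unique")
    instruction = {"ProcessSteps": list(steps)}
    if structure is not None:
        # Only needed when the dataset lacks a structure reference or it is overridden.
        instruction["Structure"] = structure
    return instruction


if __name__ == "__main__":
    example = process_instruction([process_step("VALIDATE", "STEP1", metrics=True)])
    print(json.dumps(example, indent=2))
```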

Track Process

Entry Point /ws/secure/dataprocess/status/{token}
Access Restricted
Http Method GET
Response Format JSON
Response Statuses

200 - Data file received

404 - If the token does not match a known process

401 - Unauthorized (if access has been restricted)

500 - Server Error

Response Example

 {
   "ProcessToken" : "abcd",
   "StartTime"    : 123456,   //unix time milliseconds since 1970
   "EndTime"      : null,     //unix time milliseconds since 1970
   "Status"       : 1,        //0=running,1=success,2=error
   "Steps"        :
     {
        "Step1" :
          {
            "StartTime"   : 123456,  //unix time milliseconds since 1970
            "EndTime"     : null,    //unix time milliseconds since 1970
            "Series"      : 12,
            "Rows"        : 132
          }
     }
 }
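A client will usually want to turn this report into something easier to read. The sketch below assumes the report has already been decoded from JSON into a dict with the field names shown in the example. The function name and the shape of the returned summary are illustrative choices, not part of the service.

```python
STATUS_LABELS = {0: "running", 1: "success", 2: "error"}


def summarise_status(report):
    """Summarise a decoded status report: overall state plus per-step figures."""
    steps = {}
    for step_id, info in report.get("Steps", {}).items():
        start, end = info.get("StartTime"), info.get("EndTime")
        steps[step_id] = {
            # Elapsed time is only known once the step has both timestamps.
            "elapsed_ms": None if start is None or end is None else end - start,
            "series": info.get("Series"),
            "rows": info.get("Rows"),
        }
    return {
        "token": report["ProcessToken"],
        "status": STATUS_LABELS.get(report["Status"], "unknown"),
        "finished": report.get("EndTime") is not None,
        "steps": steps,
    }
```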

Export Data

Entry Point /ws/secure/dataprocess/download/{token}/{stepId}
Access Restricted
Http Method GET
Response Format JSON
Response Statuses

200 - Data file received

404 - If the token does not match a known process

401 - Unauthorized (if access has been restricted)

500 - Server Error


Request Parameters
Parameter Required Description
saveAs False If provided, the response will include a Content-Disposition HTTP header with the value attachment; filename={param value}
format False Defines the export data format, as an alternative to using the HTTP Accept header. See the formats reference for the valid values for each format.
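The export URL, including the optional request parameters, could be composed as in the following sketch. The function name and base URL are hypothetical. No value for `format` is shown because the valid values are listed in the formats reference rather than on this page.

```python
from urllib.parse import quote, urlencode


def build_download_url(base_url, token, step_id, save_as=None, data_format=None):
    """Compose the Export Data URL for a given process token and step Id."""
    url = (
        f"{base_url.rstrip('/')}/ws/secure/dataprocess/download/"
        f"{quote(token, safe='')}/{quote(step_id, safe='')}"
    )
    params = {}
    if save_as is not None:
        params["saveAs"] = save_as      # requests an attachment with this file name
    if data_format is not None:
        params["format"] = data_format  # alternative to the HTTP Accept header
    return url + ("?" + urlencode(params) if params else "")
```

For example, `build_download_url("http://localhost:8080", "abcd", "STEP1", save_as="out.csv")` yields a URL ending in `/download/abcd/STEP1?saveAs=out.csv`.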