Data Formats

From FMR Knowledge Base
Latest revision as of 06:31, 28 March 2024

Overview

Fusion Metadata Registry accepts and outputs datasets in a number of formats. When consuming data, the Registry will analyse the dataset to try to determine what data format it has received, so that it is able to direct it to the right reader. All datasets are read by the Registry in exactly the same way, so any data processing performed on a dataset is the same, regardless of the input Data Format.

When querying for data from the Registry web service, or performing a Data Transformation via the web service, the required output data format is specified using the HTTP Accept header.
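As a minimal sketch of the mechanism described above, the following Python snippet builds (but does not send) a request whose Accept header names the required output format. The URL is a hypothetical placeholder and not a documented Registry endpoint; `build_data_request` is an illustrative helper name.

```python
from urllib.request import Request


def build_data_request(url: str, accept: str) -> Request:
    """Return an unsent GET request that asks for the given output format."""
    return Request(url, headers={"Accept": accept}, method="GET")


# Hypothetical URL: substitute the data query URL of your own Registry.
request = build_data_request(
    "https://registry.example.org/hypothetical/data/query",
    "application/vnd.sdmx.data+json;version=1.0.0",
)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is only a placeholder.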

HTTP Accept Headers

SDMX Formats

SDMX formats are supported as described by the SDMX REST cheat sheet: https://github.com/sdmx-twg/sdmx-rest/blob/master/doc/rest_cheat_sheet.pdf

Accept Headers

{| class="wikitable"
|-
! Accept Header !! Format
|-
| application/vnd.sdmx.structurespecificdata+xml;version=3.0 || Structure Specific (3.0)
|-
| application/vnd.sdmx.structurespecificdata+xml;version=2.1 || Structure Specific (2.1)
|-
| application/vnd.sdmx.structurespecificdata+xml;version=2.0 || Compact (2.0/1.0)
|-
| application/vnd.sdmx.genericdata+xml;version=2.1 || Generic (2.1/2.0/1.0)
|-
| application/vnd.sdmx.data+json;version=2.0.0 || SDMX JSON (2.0)
|-
| application/vnd.sdmx.data+json;version=1.0.0 || SDMX JSON (1.0)
|-
| application/vnd.sdmx.data+csv;version=2.0.0 || SDMX CSV (2.0)
|-
| application/vnd.sdmx.data+csv;version=1.0.0 || SDMX CSV (1.0)
|-
| application/vnd.sdmx.data+edi || SDMX EDI
|}
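For client code, the table above can be kept as a simple lookup. The sketch below is only an illustration: the dictionary keys reuse the format labels from the table, and `sdmx_accept_header` is a hypothetical helper name, not part of any Registry API.

```python
# Format label (as used in the table above) -> Accept header value.
SDMX_ACCEPT_HEADERS = {
    "Structure Specific (3.0)": "application/vnd.sdmx.structurespecificdata+xml;version=3.0",
    "Structure Specific (2.1)": "application/vnd.sdmx.structurespecificdata+xml;version=2.1",
    "Compact (2.0/1.0)": "application/vnd.sdmx.structurespecificdata+xml;version=2.0",
    "Generic (2.1/2.0/1.0)": "application/vnd.sdmx.genericdata+xml;version=2.1",
    "SDMX JSON (2.0)": "application/vnd.sdmx.data+json;version=2.0.0",
    "SDMX JSON (1.0)": "application/vnd.sdmx.data+json;version=1.0.0",
    "SDMX CSV (2.0)": "application/vnd.sdmx.data+csv;version=2.0.0",
    "SDMX CSV (1.0)": "application/vnd.sdmx.data+csv;version=1.0.0",
    "SDMX EDI": "application/vnd.sdmx.data+edi",
}


def sdmx_accept_header(format_label: str) -> str:
    """Return the Accept header for a format label, or raise ValueError."""
    try:
        return SDMX_ACCEPT_HEADERS[format_label]
    except KeyError:
        supported = ", ".join(sorted(SDMX_ACCEPT_HEADERS))
        raise ValueError(
            f"Unknown format {format_label!r}; expected one of: {supported}"
        ) from None


print(sdmx_accept_header("SDMX JSON (2.0)"))
```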

CSV Formats

There are a number of CSV 'flavours' supported by Fusion Software, including the SDMX-CSV format which is an official SDMX data format.

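The following formats and corresponding VND headers are supported:

* SDMX-CSV: application/vnd.sdmx.data+csv
* Fusion-CSV: application/vnd.csv
* Fusion-CSV-TS: application/vnd.csv-ts

Each VND header can take the additional arguments listed below. Note that SDMX-CSV supports only a subset of these arguments and values.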

{| class="wikitable"
|-
! Argument !! Description !! Supported Values !! Default !! Supported Formats
|-
| version || The version of the format || 1.0.0, 2.0.0 (SDMX-CSV only) || 1.0.0 || SDMX-CSV, Fusion-CSV, Fusion-CSV-TS
|-
| timeFormat || When normalized, time values are converted to the most granular ISO 8601 representation,<br/>taking into account the highest frequency of the data in the message || original/normalized || original || SDMX-CSV
|-
| labels || Whether to output the code/concept ids, the respective labels<br/>in the specified language, or both || both/id/name || id || SDMX-CSV (with the exception of labels=name),<br/>Fusion-CSV, Fusion-CSV-TS
|-
| delimiter || The delimiter to use || comma/tab/semicolon/space || comma || Fusion-CSV, Fusion-CSV-TS
|-
| serieskey || Include the series key as a column.<br/>A series key is the concatenation of the dimension values,<br/>for example A:UK:EMPLOYMENT || include/exclude || exclude || Fusion-CSV, Fusion-CSV-TS
|-
| bom (since 10.3.1) || Include or exclude the Byte Order Mark (BOM).<br/>The BOM helps Excel interpret non-Latin characters when opening a CSV file || include/exclude || exclude || SDMX-CSV, Fusion-CSV, Fusion-CSV-TS
|}


Note: The labels argument can be used in conjunction with the HTTP Accept-Language header to indicate which language to resolve the labels in. If the labels are not available in the requested language, another language will be selected, defaulting to English.
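The pairing of the two headers can be sketched as follows. `labelled_csv_headers` is a hypothetical helper, and the language tag `fr` is only an example value; the sketch simply builds the header dictionary a client would send.

```python
def labelled_csv_headers(language: str, labels: str = "both") -> dict:
    """Build request headers asking for Fusion-CSV with labels in one language."""
    return {
        # labels=both asks for ids and labels in the output
        "Accept": f"application/vnd.csv;labels={labels}",
        # language in which the Registry should resolve the labels
        "Accept-Language": language,
    }


headers = labelled_csv_headers("fr")
for name, value in headers.items():
    print(f"{name}: {value}")
```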

Examples

application/vnd.sdmx.data+csv
application/vnd.sdmx.data+csv;version=1.0.0;
application/vnd.sdmx.data+csv;version=1.0.0;timeFormat=normalized
application/vnd.sdmx.data+csv;version=1.0.0;timeFormat=normalized;labels=both

application/vnd.csv
application/vnd.csv;delimiter=tab
application/vnd.csv;timeFormat=normalized;serieskey=include
application/vnd.csv;version=1.0.0;labels=both

application/vnd.csv-ts
application/vnd.csv-ts;version=1.0.0;
application/vnd.csv-ts;version=1.0.0;labels=name
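Header values like the examples above can also be assembled programmatically. The following sketch assumes a hypothetical helper, `csv_accept_header`, that checks argument names and values against the argument table; it deliberately does not check which format supports which argument, so a header it accepts may still be rejected by the Registry.

```python
CSV_MEDIA_TYPES = {
    "SDMX-CSV": "application/vnd.sdmx.data+csv",
    "Fusion-CSV": "application/vnd.csv",
    "Fusion-CSV-TS": "application/vnd.csv-ts",
}

# Argument name -> values listed in the argument table.
ALLOWED_VALUES = {
    "version": {"1.0.0", "2.0.0"},
    "timeFormat": {"original", "normalized"},
    "labels": {"both", "id", "name"},
    "delimiter": {"comma", "tab", "semicolon", "space"},
    "serieskey": {"include", "exclude"},
    "bom": {"include", "exclude"},
}


def csv_accept_header(fmt: str, **arguments: str) -> str:
    """Join a CSV media type and its arguments with ';' separators."""
    if fmt not in CSV_MEDIA_TYPES:
        raise ValueError(f"Unknown CSV format: {fmt!r}")
    parts = [CSV_MEDIA_TYPES[fmt]]
    for name, value in arguments.items():
        if name not in ALLOWED_VALUES:
            raise ValueError(f"Unknown argument: {name!r}")
        if value not in ALLOWED_VALUES[name]:
            raise ValueError(f"Unsupported value for {name}: {value!r}")
        parts.append(f"{name}={value}")
    return ";".join(parts)


print(csv_accept_header("SDMX-CSV", version="1.0.0", timeFormat="normalized", labels="both"))
print(csv_accept_header("Fusion-CSV", delimiter="tab"))
```

Keyword arguments keep their call order, so the arguments appear in the header in the order they were passed.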

Data Reporting Template

There are two types of output when converting data to a Data Reporting Template format. The first is where the Data Reporting Template is constructed in the usual way, with the Universe of Data derived from the Dataflow and related Content Constraints. The second is where the Universe of Data is derived from the dataset being written into the Excel workbook. By default, the Report Template Universe is based on the constraints. To change this behaviour, use the +partial indicator in the VND header.

{| class="wikitable"
|-
! Accept Header !! Description
|-
| application/vnd.reporttemplate || Excel Report Template pre-populated with the data from a dataset.<br/>The dataset should contain the Provision Agreement reference, to enable the Registry to determine the Data Provider.<br/>The Excel file will be the same as a Report Template generated via the Reporting Template Web Service, but it will be pre-populated with observation and attribute values from the dataset.
|-
| application/vnd.reporttemplate.ACY:BANKING(1.0) || This is an extension of application/vnd.reporttemplate that tells the Registry which Reporting Template to use. It is only required if there is more than one Reporting Template for the Dataflow(s) being written.
|-
| application/vnd.reporttemplate;DATA_PROVIDER=ONS || This is an extension of application/vnd.reporttemplate that tells the Registry who the Data Provider is.<br/>The Data Provider's Agency defaults to SDMX. If this is not the case, use the syntax DATA_PROVIDER=ACY_ID.ONS.<br/>Can be used in conjunction with other VND arguments such as the Report Template identifier.
|-
| application/vnd.reporttemplate+partial || This outputs the dataset conforming to the layout of the Report Template, but includes only the worksheets and observation cells for which there is data in the dataset. There is no main worksheet.<br/>A Data Provider reference is not necessary; however, information about which Report Template to use can be provided using the .ACY:TEMPLATE_ID(1.0) syntax.
|}
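The header variants above can be sketched as a small builder. `report_template_accept_header` is a hypothetical helper, not part of the Registry. Placing the template identifier before `;DATA_PROVIDER=` when both are given is an assumption, as the page only states that the two can be combined. Because the page does not show where a template identifier sits relative to `+partial`, the sketch does not attempt that combination.

```python
from typing import Optional

BASE = "application/vnd.reporttemplate"


def report_template_accept_header(
    template: Optional[str] = None,
    data_provider: Optional[str] = None,
    partial: bool = False,
) -> str:
    """Build a Report Template Accept header from the documented variants."""
    if partial:
        if template or data_provider:
            # Combined forms with +partial are not covered by this sketch.
            raise ValueError("partial cannot be combined with other options here")
        return BASE + "+partial"
    header = BASE
    if template:
        # e.g. "ACY:BANKING(1.0)" selects a specific Reporting Template
        header += "." + template
    if data_provider:
        # Assumed ordering: template identifier first, then the argument
        header += ";DATA_PROVIDER=" + data_provider
    return header


print(report_template_accept_header())
print(report_template_accept_header(template="ACY:BANKING(1.0)"))
print(report_template_accept_header(data_provider="ONS"))
print(report_template_accept_header(partial=True))
```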