Reverse Engineer DSD from CSV Dataset

From FMR Knowledge Base

Latest revision as of 08:12, 5 September 2022

Overview

The Registry can create a Data Structure Definition (DSD) from a CSV file; only the column headings are needed to build the DSD. Once the reverse-engineered Data Structure has been created, you can validate and transform data against it.

Preparation

Source data

For this process to work, the first step is to obtain the dataset that you wish to use.

Next, ensure that all unnecessary data is removed and that the individual concepts all appear in the top row with each concept in its own cell.

When done, save the file as a CSV file (for example, from Excel), ready for use in the Reverse Engineer (RE) process.


Example CSV file
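
Before uploading, the top row can be checked with a short script. The following is a minimal sketch, not part of the Registry: the function name check_header and the sample headings (FREQ, REF_AREA and so on) are assumptions made for illustration. It only confirms that every column has a heading and that no heading is repeated.

```python
import csv
import io


def check_header(csv_text):
    """Return a list of problems found in the first row of a CSV file."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]

    header = [cell.strip() for cell in rows[0]]
    problems = []

    # Every column needs its own heading (one concept per cell).
    for position, name in enumerate(header, start=1):
        if not name:
            problems.append(f"column {position} has no heading")

    # A heading that appears twice would be ambiguous as a concept.
    seen = set()
    for name in header:
        if name and name in seen:
            problems.append(f"heading {name!r} appears more than once")
        seen.add(name)

    return problems


# Hypothetical sample data; replace with the contents of your own file.
sample = "FREQ,REF_AREA,TIME_PERIOD,OBS_VALUE\nA,UK,2020,1.5\n"
print(check_header(sample))  # prints []
```

An empty list means the header row passed these two checks; it says nothing about whether the Registry will accept the file.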

Concept Schemes

The RE process includes a step where you can link the concepts to an existing Concept Scheme owned by the same Agency that is used in the RE process. Alternatively, you can ignore this feature, in which case the Concept Scheme will be created for you.

Codelists

If you decide to use the RE process to create a new Concept Scheme, Codelists will also be created for any concepts that you mark as enumerated in Step 2 of the Wizard (see below).

Note that each Codelist is created empty.

Using an existing Concept Scheme

If you plan to use an existing Concept Scheme, you need to be aware of how that Concept Scheme was created.


Checking the Concept Scheme

Process

Reverse Engineer Wizard - Step 1 - High Level Details

The RE option is available from the Data options, Data Structure, Dataflow and Provision Agreement.

Reverse Engineer option

In Step 1 of the Wizard, enter the required ID and Agency together with a suitable name.

If you are using an existing Concept Scheme, it MUST belong to the same Agency and use the same ID. If the Registry does not find an exact match, it will create a new Concept Scheme using the Agency and ID.

If the Concept scheme ID is left blank, the Registry will create a new Concept Scheme.

Next, select your CSV data file or drag it onto the panel. If the upload is successful, a tick will appear as shown in the image below.


DSD - Step 1

Click Next to continue.

Reverse Engineer Wizard - Step 2 - Column Assignment

This step allows you to define how each of the columns is to be treated. If you click the down chevron you will see the options available:


Reverse Engineer Wizard - Step 2


Make the appropriate selections for EVERY column as shown in the example below then click Next to continue.

Note that if you select an Enumerated type (for example, ID), a Codelist will be created.


Completed Step 2
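
When deciding which columns to treat as enumerated, it can help to see the distinct values each column holds. The sketch below is one way to list them outside the Registry; the function name distinct_values and the sample data are assumptions for illustration only. A column with a small, repeating set of values is often a reasonable candidate for an enumerated type, but that judgement remains yours.

```python
import csv
import io


def distinct_values(csv_text):
    """Map each column heading to the sorted distinct values found beneath it."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = {name: set() for name in (reader.fieldnames or [])}

    for row in reader:
        for name in values:
            # Missing cells come back as None; treat them like empty cells.
            cell = (row.get(name) or "").strip()
            if cell:
                values[name].add(cell)

    return {name: sorted(found) for name, found in values.items()}


# Hypothetical sample data; replace with the contents of your own file.
sample = "FREQ,REF_AREA,OBS_VALUE\nA,UK,1.5\nA,FR,2.0\nM,UK,0.7\n"
for heading, found in distinct_values(sample).items():
    print(heading, len(found), found)
```

Because the Codelists are created empty, a list like this may also serve as a starting point for the codes to add afterwards.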

Reverse Engineer Wizard - Step 3 - Enumerated Columns

Step 3 displays the Concepts for which you chose an enumerated option in Step 2 (in this example, "Use Code ID"). If you are using an existing Concept Scheme, this step is for information only. If you are creating a Concept Scheme, the settings here determine which Codelists will be created along with it.


Reverse Engineer Wizard - Step 3

Reverse Engineer Wizard - Step 4 - Concept Roles

This step allows you to define the role for each concept (Dimension, Attribute or Measure) as shown in the example below.


Reverse Engineer Wizard - Step 4

Click Next to continue.

You will be asked to confirm and then you will be taken to the Data Structure page with the newly created DSD highlighted.
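
It can be useful to write down the intended role for each column before working through Step 4. The following is a small planning sketch, not something the Registry requires: the headings, the planned roles and the function name unassigned_columns are all hypothetical. It simply reports any heading that has not been given one of the three roles named in Step 4.

```python
# The three roles offered in Step 4 of the Wizard.
ALLOWED_ROLES = {"Dimension", "Attribute", "Measure"}


def unassigned_columns(header, planned_roles):
    """Return headings with no planned role, or with a role outside ALLOWED_ROLES."""
    return [name for name in header if planned_roles.get(name) not in ALLOWED_ROLES]


# Hypothetical headings and roles; replace with your own plan.
header = ["FREQ", "REF_AREA", "TIME_PERIOD", "OBS_VALUE", "OBS_STATUS"]
planned_roles = {
    "FREQ": "Dimension",
    "REF_AREA": "Dimension",
    "TIME_PERIOD": "Dimension",
    "OBS_VALUE": "Measure",
    "OBS_STATUS": "Attribute",
}

print(unassigned_columns(header, planned_roles))  # prints []
```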

Data

To load, validate or convert a data file against this new Data Structure, add the Data Structure ID and any other required details to the file, in accordance with the rules for its file format. You can learn more about file formats here: https://fmrwiki.sdmxcloud.org/Category:FMR_Formats_Reference
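
As an illustration of adding a structure reference to a CSV file, the sketch below prepends two columns to every row. It assumes an SDMX-CSV 2.0 style layout in which the first two columns are STRUCTURE and STRUCTURE_ID; whether your file needs exactly these columns depends on the format you use, so check the formats reference before relying on it. The function name add_structure_reference and the identifier EXAMPLE:DSD_DEMO(1.0) are hypothetical.

```python
import csv
import io


def add_structure_reference(csv_text, structure_type, structure_id):
    """Prepend STRUCTURE and STRUCTURE_ID columns to a headed CSV (assumed layout)."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")

    header_written = False
    for row in reader:
        if not row:
            continue  # skip blank lines
        if not header_written:
            # The first non-blank row is the header row.
            writer.writerow(["STRUCTURE", "STRUCTURE_ID"] + row)
            header_written = True
        else:
            writer.writerow([structure_type, structure_id] + row)

    return out.getvalue()


# Hypothetical data and identifier; substitute your own Agency, ID and version.
sample = "FREQ,OBS_VALUE\nA,1.5\n"
print(add_structure_reference(sample, "datastructure", "EXAMPLE:DSD_DEMO(1.0)"))
```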

If you intend to validate the data against a Dataflow or a Provision Agreement, you can learn more about those structures here: https://fmrwiki.sdmxcloud.org/Category:Structural_Metadata