Data Warehouse Helper

Upsert to and read data from Data Sources

Overview

The Data Warehouse Helper allows you to store data with automatic flattening of nested JSON structures. It also enables you to retrieve data and apply filters using PostgreSQL WHERE statements.

Actions

1. Insert or update data

The Data Warehouse Helper allows you to store various types of data. Even complex structures, such as nested JSON, are stored flat.

Possible input values:

  • Value reference - the data object that you want to store in the Data Warehouse.

  • Primary key column - the primary key on which you want to deduplicate, usually ID.

  • Data source ID - the ID of a data source that you created previously and against which the data will be stored.
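The role of the primary key column can be illustrated with a minimal in-memory sketch. This is not the helper's implementation: upsert_records is a hypothetical name, and the sketch assumes that an incoming record fully replaces a stored record with the same key.

```python
from typing import Any, Dict, List


def upsert_records(
    existing: List[Dict[str, Any]],
    incoming: List[Dict[str, Any]],
    primary_key: str,
) -> List[Dict[str, Any]]:
    """Insert new records and replace stored ones that share a primary key.

    Hypothetical helper: assumes an incoming record replaces the stored one.
    """
    # Index the stored records by their primary key value.
    by_key = {row[primary_key]: row for row in existing}
    for row in incoming:
        # Same key: the stored record is replaced. New key: the record is added.
        by_key[row[primary_key]] = row
    return list(by_key.values())


stored = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
new_rows = [{"id": 2, "name": "B"}, {"id": 3, "name": "c"}]
result = upsert_records(stored, new_rows, "id")
```

Here the record with id 2 is updated rather than duplicated, and the record with id 3 is inserted.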

Nested JSON

Nested JSON structures are flattened automatically, so you no longer have to do this during the transformation step.

For example, this is the nested input list of dictionaries:

[
  {
    "field_a": "value 123",
    "my_nested_field": {
      "another_field": 456,
      "even_more_nested": {
        "field_a": "example",
        "field_b": true
      }
    }
  },
  {
    ...
  }
]

During insertion, nested JSON is flattened automatically. Each flattened field is named after its JSON path, with the path segments joined by an underscore (_). The original nested structure is preserved alongside the flattened fields:

[
  {
    "field_a": "value 123",
    "my_nested_field": {
      "another_field": 456,
      "even_more_nested": {
        "field_a": "example",
        "field_b": true
      }
    },
    "my_nested_field_another_field": 456,
    "my_nested_field_even_more_nested_field_a": "example",
    "my_nested_field_even_more_nested_field_b": true
  },
  {
    ...
  }
]
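The flattening described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the helper's actual code: flatten_record is a hypothetical name, and the sketch only descends into dictionaries (how lists are handled is not covered here).

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], sep: str = "_") -> Dict[str, Any]:
    """Add one top-level key per nested leaf, named after its JSON path.

    Hypothetical helper: the original nested structure is kept as well.
    """
    flat: Dict[str, Any] = dict(record)  # shallow copy keeps the nested fields

    def walk(value: Dict[str, Any], prefix: str) -> None:
        for key, child in value.items():
            path = f"{prefix}{sep}{key}"
            if isinstance(child, dict):
                walk(child, path)  # descend one level deeper
            else:
                flat[path] = child  # leaf value: store under the joined path

    for key, value in record.items():
        if isinstance(value, dict):
            walk(value, key)
    return flat


record = {
    "field_a": "value 123",
    "my_nested_field": {
        "another_field": 456,
        "even_more_nested": {"field_a": "example", "field_b": True},
    },
}
flattened = flatten_record(record)
```

For the sample record, flattened contains the original keys plus my_nested_field_another_field, my_nested_field_even_more_nested_field_a, and my_nested_field_even_more_nested_field_b.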

If a generated column name exceeds 63 characters, it is truncated from the beginning, so that the end of the path is kept (in order to keep the column name unique). For example, the name

my_very_long_field_name_that_is_also_nested_inside_another_dict_field_abc

is stored as:

ng_field_name_that_is_also_nested_inside_another_dict_field_abc
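The truncation rule amounts to keeping the last 63 characters of the name. A minimal sketch, assuming plain ASCII column names; truncate_column_name is a hypothetical helper, not part of the product:

```python
MAX_COLUMN_NAME_LENGTH = 63


def truncate_column_name(name: str, max_length: int = MAX_COLUMN_NAME_LENGTH) -> str:
    """Keep only the trailing max_length characters of a column name."""
    # Slicing from the end leaves names within the limit unchanged.
    return name[-max_length:]


long_name = "my_very_long_field_name_that_is_also_nested_inside_another_dict_field_abc"
stored_name = truncate_column_name(long_name)
```

With the 73-character example name, the first 10 characters (my_very_lo) are dropped and stored_name is the 63-character name shown above.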

2. Retrieve data

Using the Retrieve from action, you can retrieve data from Data Sources, Transforms, and Insights directly from a flow, and filter it using PostgreSQL WHERE statements.

Possible input values:

  • Source type - the source, e.g. data_source

  • Source id - the ID of the source, e.g. the ID of the data source

  • WHERE Statement - a PostgreSQL WHERE condition that will be applied to the retrieval. The WHERE keyword itself does not need to be written - optional

  • Start Date - The start date in case the variable $start_date is used in the Insight that you're retrieving data from - optional (Insights only)

  • End Date - The end date in case the variable $end_date is used in the Insight that you're retrieving data from - optional (Insights only)

In PostgreSQL, unquoted names are case-insensitive. This means that mycolumn = 'abc' and myColumn = 'abc' are equivalent: both column names are interpreted as mycolumn.

However, quoted names are case-sensitive. If your column names contain uppercase characters, wrap them in double quotes, e.g. "myColumn" = 'abc'.
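If you build the WHERE statement in code before passing it to the action, a small helper can apply the quoting consistently. This is a sketch: quote_identifier is a hypothetical name, and it follows the standard PostgreSQL rule of wrapping the name in double quotes and doubling any embedded double quote.

```python
def quote_identifier(name: str) -> str:
    """Quote a column name so PostgreSQL treats it as case-sensitive."""
    # Embedded double quotes are escaped by doubling them.
    return '"' + name.replace('"', '""') + '"'


# The WHERE keyword itself is left out, as described above.
where_statement = f"{quote_identifier('myColumn')} = 'abc'"
```

Here where_statement holds "myColumn" = 'abc', which matches the column myColumn but not mycolumn.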
