Pagination Automation
Automatic pagination runs the entire pagination logic and outputs a flat list of records.
Feature Introduction
To implement pagination manually, users have to combine
a Looper,
a special loop condition,
sometimes a Dict Helper to reference the actual list of entities in the response, and
a Dict Helper to flatten the loop's output
in order to get the results from all pages of a Connector.
With Automatic Pagination the process is much simpler: users just toggle the Retrieve all data switch.
This runs the entire pagination logic automatically and outputs a flat list of records, ready to be used further inside the flow, replacing the manual four-step process with a single click.
Configuration on connector
Here we have to fill in the Pagination Configuration field in JSON format, the same one that is also used by the Remote Search Configuration.
In most cases the configuration will look something like this:
You can use Jinja in all fields in order to have conditional statements or to do some simple calculations (e.g. page + 1, as above).
The pagination parameters are not used in the first request; they are only applied once pagination is needed. The exception is response_jsonpath, which is also used to point to the list of results of the first request.
query
Either query, body, or header needs to be specified.
This follows mostly the same rules and logic as in the Search configuration:
The query is a string that contains a dictionary of query parameters that will be appended to the action call in order to do the pagination.
In case the same parameter is given by the action itself, it will be overwritten by the parameter defined here.
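The overwrite rule above can be pictured with a small Python sketch. It is only an illustration: the function name `merge_query` and the example parameters are hypothetical and not part of the product.

```python
def merge_query(action_params, pagination_params):
    """Combine the action's query parameters with the pagination ones.

    On a key collision the pagination value wins, mirroring the rule above.
    """
    merged = dict(action_params)      # copy, so the caller's dict stays untouched
    merged.update(pagination_params)  # pagination parameters overwrite duplicates
    return merged


# Hypothetical parameters, for illustration only.
action_params = {"per_page": 25, "page": 1}
pagination_params = {"page": 2}
print(merge_query(action_params, pagination_params))  # {'per_page': 25, 'page': 2}
```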
The page parameter starts with 1 and is increased by 1 for every subsequent call.
Since most Connectors paginate starting with 1, we need to add + 1 to it in most cases, so that the first pagination request starts with page 2 (as pointed out above, the first request is done independently of it, so the counting only starts with the second request).
The maximum page size is currently 100, so that we do not accidentally run into endless loops or similar problems. If we see that this needs to be adjusted, we can do so easily.
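As a worked example of the counting described above, the following Python sketch lists the page number each pagination call would request when the configuration uses `page + 1`. The helper name `requested_pages` is hypothetical; only the arithmetic is taken from the text.

```python
from typing import List


def requested_pages(pagination_calls: int) -> List[int]:
    """Return the page number sent by each pagination call.

    `page` is 1 for the first pagination call (the second request overall)
    and grows by 1 per call; the configuration then adds 1 on top of it.
    """
    pages = []
    for page in range(1, pagination_calls + 1):
        pages.append(page + 1)  # same arithmetic as the Jinja expression `page + 1`
    return pages


print(requested_pages(3))  # [2, 3, 4]
```

The very first request (page 1 of the API) is not part of this list, because it is made without the pagination parameters.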
By using previous_request. we can reference the previous request's response, which is needed e.g. when dealing with next page tokens.
If the query parameter is given, a GET call is made.
If a POST call should be made instead, also specify body with {} as its value.
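To illustrate why a reference to the previous response is needed for next page tokens, here is a minimal Python sketch of a token-driven loop. Everything in it is an assumption made for illustration: the `fetch` callable, the keys `items` and `next_token`, the in-memory `FAKE_PAGES`, and the `max_requests` safety cap are hypothetical and do not describe the actual implementation.

```python
def collect_all(fetch, records_key="items", token_key="next_token", max_requests=10):
    """Follow next page tokens until none is returned, then return all records."""
    response = fetch(None)                 # first request: no pagination parameters
    records = list(response[records_key])
    requests_made = 1
    while response.get(token_key) and requests_made < max_requests:
        previous_request = response        # plays the role of `previous_request.`
        response = fetch(previous_request[token_key])
        records.extend(response[records_key])
        requests_made += 1
    return records


# Hypothetical API that serves three pages, keyed by the token that requests them.
FAKE_PAGES = {
    None: {"items": [1, 2], "next_token": "a"},
    "a": {"items": [3, 4], "next_token": "b"},
    "b": {"items": [5], "next_token": ""},
}

print(collect_all(FAKE_PAGES.get))  # [1, 2, 3, 4, 5]
```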
response_jsonpath
Optional
This points to the list of results that should be used, which is relevant if the response is nested, e.g. in this example:
we expect the final output of the action to be a list of dictionaries containing only the entries that are inside the deals list.
We have to point to it using JSON path (more examples below).
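The effect of a path such as `$.deals` can be sketched in Python as follows. This is a deliberately tiny resolver that only understands plain `$.key.key` paths, not full JSON path syntax, and the `response` dictionary is a hypothetical stand-in rather than a real API response.

```python
def resolve_simple_jsonpath(document, path):
    """Resolve a path of the form '$.a.b' against nested dictionaries."""
    if not path.startswith("$"):
        raise ValueError("a JSON path has to start with '$'")
    current = document
    for key in path[1:].split("."):
        if key:  # skip the empty piece right after '$'
            current = current[key]
    return current


# Hypothetical nested response: the records live under the 'deals' key.
response = {"deals": [{"id": 1}, {"id": 2}], "meta": {"total": 2}}
print(resolve_simple_jsonpath(response, "$.deals"))  # [{'id': 1}, {'id': 2}]
```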
endpoint
Optional
This can be used to adjust the default endpoint for all pages except the first. In some cases (e.g. Dropbox) the endpoint needs to be extended in order to get the results of the next page with a next page token.
You can also reference endpoint in any of the parameters, which is often useful e.g. for the response_jsonpath (see examples below).
body
Either query, body, or header needs to be specified.
The body parameter is similar to the query parameter. It also contains a string which is a dictionary. However, instead of query parameters, it contains the request body that will be used in the calls.
If the body parameter is given, a POST call is made automatically.
replace_body
Optional
This defines whether the regular body that is sent by the action should be replaced, so that the request only contains the pagination body as defined in the configuration.
So far we have only needed this for one connector (Dropbox), so it can be left empty by default ("replace_body": false will be added automatically when the configuration is saved).
It can be either false or true.
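A rough Python sketch of the two modes follows. Note the assumption: the text only states what happens when replace_body is true, so the merge behaviour shown for false (action body kept, pagination keys added on top) is a guess made for illustration. The function name `build_body` and the example bodies are hypothetical.

```python
def build_body(action_body, pagination_body, replace_body=False):
    """Return the body for a pagination request."""
    if replace_body:
        # Only the pagination body is sent; the action's body is dropped.
        return dict(pagination_body)
    # Assumption: otherwise the action body is kept and extended.
    combined = dict(action_body)
    combined.update(pagination_body)
    return combined


# Hypothetical bodies, for illustration only.
action_body = {"path": "/reports", "recursive": False}
pagination_body = {"cursor": "abc"}
print(build_body(action_body, pagination_body, replace_body=True))  # {'cursor': 'abc'}
```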
header
Either query, body, or header needs to be specified.
The header parameter is similar to the body parameter. It also contains a string which is a dictionary. However, instead of the request body, it contains the request headers that will be used in the calls.
If the header parameter is given, a GET call is made automatically.
If a POST call should be made instead, also specify body with {} as its value.
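The rules for query, body and header stated so far can be summarised in a short Python sketch: at least one of the three has to be present, and the presence of body switches the call to POST. The function name `pagination_method` and the example configurations are hypothetical; this is not the actual implementation.

```python
def pagination_method(config):
    """Return the HTTP method implied by a pagination configuration."""
    if not any(key in config for key in ("query", "body", "header")):
        raise ValueError("either query, body, or header needs to be specified")
    # A body (even the empty "{}") results in a POST call, otherwise GET is used.
    return "POST" if "body" in config else "GET"


# Hypothetical configurations; the values are strings containing dictionaries.
print(pagination_method({"query": "{'page': '{{ page + 1 }}'}"}))                # GET
print(pagination_method({"query": "{'page': '{{ page + 1 }}'}", "body": "{}"}))  # POST
```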
when
Optional
The result of this parameter (rendered first, if it contains Jinja) is compared to True, and if it matches, that pagination configuration is used.
This needs to be used in combination with multiple configurations.
Multiple configurations
In order to add multiple pagination configurations for one Connector, the configuration needs to be a list of configurations, and the when parameter needs to be set on each of them, except for the last one, which can be a 'catch all' configuration.
The when parameters are checked from top to bottom and the first configuration that matches is used. In case none match and a pagination configuration does not have a when parameter, that configuration is used (i.e. similar to multiple elif statements followed by an else statement).
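The top-to-bottom selection can be sketched in Python as below. It assumes that rendering a when value yields something whose string form is `True` when it matches; the `render` callable stands in for the Jinja rendering step, and all names and example configurations are hypothetical.

```python
def select_configuration(configurations, render=lambda template: template):
    """Pick the first configuration whose rendered `when` equals True.

    A configuration without `when` acts as the catch-all (the `else` branch).
    """
    fallback = None
    for config in configurations:
        if "when" not in config:
            if fallback is None:
                fallback = config  # remember the first catch-all
            continue
        if str(render(config["when"])) == "True":
            return config          # first match wins
    return fallback


# Hypothetical, already rendered configurations.
configurations = [
    {"when": "False", "query": "{'cursor': 'a'}"},
    {"when": "True", "query": "{'cursor': 'b'}"},
    {"query": "{'page': 2}"},
]
print(select_configuration(configurations)["query"])  # {'cursor': 'b'}
```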
Configuration on action
On the action, just toggle the switch next to Supports automatic pagination in order to show the Retrieve all data (supports_automatic_pagination) toggle on the action:
Examples
Freshsales - page / per_page with nested response list
Here the interesting piece is the response_jsonpath:
As mentioned above, the Freshsales response structure looks like this:
where the top-level key is one of deals/contacts/sales_accounts, depending on the endpoint (e.g. deals/view/{id}).
With {{ endpoint.split('/', 1)[0] }} the endpoint is split at the first / and then the first element of the resulting list ([0]) is accessed, which for this example is deals.
So the response_jsonpath is $.deals, which points directly to the list of records.
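Since Jinja exposes Python's string methods, the expression can be tried out directly in Python. The sketch below uses the example endpoint from the text; the helper name `jsonpath_from_endpoint` is hypothetical.

```python
def jsonpath_from_endpoint(endpoint):
    """Build the JSON path from the first path segment of the endpoint."""
    first_segment = endpoint.split("/", 1)[0]  # maxsplit=1: split at the first '/' only
    return "$." + first_segment


print("deals/view/{id}".split("/", 1))            # ['deals', 'view/{id}']
print(jsonpath_from_endpoint("deals/view/{id}"))  # $.deals
```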
Freshdesk - page / per_page with raw list in response
Freshdesk returns the list of records without any nesting around it, thus we only need to specify the query parameter here.
Zoom - cursor based token
The Zoom API works (in some parts, e.g. for meeting participants) with a next page token.
The response from Zoom looks like this:
Thus, the query needs to reference the previous request's response using previous_request. followed by the key of the next page token, which in Zoom's case is on the top level and is called next_page_token.
Here rsplit with a maxsplit of 1 (second parameter) is used to split the endpoint into two elements, starting from the right side of the string (this is the difference between split and rsplit).
As the endpoint looks like metrics/meetings/{meetingId}/participants, the result of {{ endpoint.rsplit('/', 1) }} is ['metrics/meetings/{meetingId}', 'participants'] and the second element ([1]) is thus participants, which points to the list of records.
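The difference between rsplit and split with a maxsplit of 1 is easy to check in plain Python, using the endpoint from the text. The variable names are only illustrative.

```python
endpoint = "metrics/meetings/{meetingId}/participants"

from_right = endpoint.rsplit("/", 1)  # splits once, starting at the right end
from_left = endpoint.split("/", 1)    # splits once, starting at the left end

print(from_right)     # ['metrics/meetings/{meetingId}', 'participants']
print(from_left)      # ['metrics', 'meetings/{meetingId}/participants']
print(from_right[1])  # participants
```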
Dropbox - strange cursor based with special pagination endpoints
Dropbox, e.g. with the list_folders endpoint, is quite a special case and uses a different endpoint (and HTTP method) for getting the next pages.
In order to allow for this, the endpoint and body parameters have to be used. Furthermore, replace_body has to be set to true.
The response_jsonpath is quite straightforward, as the list of records is nested in entries for every pagination-enabled endpoint, so $.entries is used.
Braze - limit and offset
Braze and Xandr use limit and offset based pagination (e.g. for this endpoint).
Alternative keywords: limit / start, e.g. in Pipedrive.
This works very similarly to page number pagination: the offset parameter needs to be increased with each call, while the limit parameter needs to be a static value.
Here we can set the limit query parameter to the default of 100, and then we have to multiply page + 1 with the limit value and add 1 to the result, as we do not want to begin with the 100th element (which we already got in the first call), but rather with the 101st element.
This might work differently for other limit and offset based paginations, as offset sometimes refers to the number of results that should be skipped. However, Braze defines offset as: “Optional beginning point in the list to retrieve from”
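The two offset conventions mentioned above can be compared with a small Python sketch. It is expressed in terms of the number of pages already fetched rather than the Jinja page counter, because how that counter maps onto it depends on the configuration; the function names are hypothetical and the limit of 100 is taken from the text.

```python
def start_offset(pages_fetched, limit=100):
    """Offset as a 1-based starting position: the first record not yet fetched."""
    return pages_fetched * limit + 1


def skip_offset(pages_fetched, limit=100):
    """Offset as the number of records to skip."""
    return pages_fetched * limit


for pages_fetched in (1, 2, 3):
    print(pages_fetched, start_offset(pages_fetched), skip_offset(pages_fetched))
# 1 101 100
# 2 201 200
# 3 301 300
```

After the first call has returned records 1 to 100, the starting-position convention continues at 101, while the skip convention continues at 100.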
Plentific - limit and offset in header
Shopify - cursor based token in response header
Shopify's pagination has a few challenges:
The pagination tokens are links in a single header value (called Link), i.e. one field for both the previous and the next link
The Link header value is different for the first and last page, as in those cases it contains only one link
No filter query parameters are allowed to be set in pagination requests
The response JSON path cannot be reliably derived from its endpoint
These challenges can be solved as follows:
For the first two challenges, quite a bit of Jinja logic is required, as can be seen in the endpoint parameter
The query parameters can be removed from pagination requests by setting them to null (as has been done in the query parameter). This way, they will not be sent as query parameters at all
Lastly, JSON path filter expressions can be used in order to be highly flexible in terms of response bodies
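To make the first three points more concrete, here is a minimal Python sketch of the same ideas: picking the next link out of a Link header that may contain one or two links, and dropping query parameters that were set to null. It is not the Jinja logic used in the actual configuration; the function names and the example.test URLs are hypothetical.

```python
import re
from typing import Optional

# Matches entries of the form: <url>; rel="name"
LINK_PATTERN = re.compile(r'<([^>]+)>;\s*rel="([^"]+)"')


def next_link(link_header: str) -> Optional[str]:
    """Return the URL tagged rel="next", or None on the last page."""
    for url, rel in LINK_PATTERN.findall(link_header):
        if rel == "next":
            return url
    return None


def drop_null_parameters(query: dict) -> dict:
    """Remove parameters set to None (null), so they are not sent at all."""
    return {key: value for key, value in query.items() if value is not None}


# Hypothetical header values for a middle page and for the last page.
middle = ('<https://example.test/orders?page_info=p1>; rel="previous", '
          '<https://example.test/orders?page_info=n1>; rel="next"')
last = '<https://example.test/orders?page_info=p2>; rel="previous"'

print(next_link(middle))  # https://example.test/orders?page_info=n1
print(next_link(last))    # None
print(drop_null_parameters({"status": None, "limit": 50}))  # {'limit': 50}
```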
Limitations
Currently, there are a few known limitations, and probably further unknown ones, which might be temporary or permanent.
Very low rate limits
We automatically handle rate limits with the same logic that is used for retries in regular flow runs.
However, some APIs, such as Twitter, have a very low rate limit, where only a few requests per minute are allowed. In case many more pages have to be retrieved than the per-minute rate limit allows, the request will most likely result in an error, as the pagination automatically stops after having received too many errors.