fern/docs/pages/airdrop/data-extraction.mdx
Lines changed: 86 additions & 24 deletions
@@ -1,47 +1,90 @@
1
1
In the data extraction phase, the extractor is expected to call the external system's APIs
2
-
to retrieve all the items that were updated since the start of the last extraction.
3
-
If there was no previous extraction (the current run is an initial import),
4
-
then all the items should be extracted.
2
+
to retrieve all the items that should be synced with DevRev.
5
3
6
-
The extractor must store at what time it started each extraction in its state,
7
-
so that it can extract only items created or updated since this date in the next sync run.
4
+
If the current run is an initial sync, this means all the items should be extracted.
5
+
Otherwise, the extractor should retrieve all the items that were changed since the start of the last extraction.
6
+
7
+
Each snap-in invocation is a separate runtime instance with a maximum execution time of 12 minutes.
8
+
If a large amount of data needs to be extracted, it might not all be extracted within this time frame.
9
+
To handle such situations, the snap-in uses a state object.
10
+
This state object is shared across all invocations and keeps track of where the previous snap-in invocations ended in the extraction process.
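
A minimal sketch of how such a state object could track progress across invocations. The `ExtractorState` fields and the helpers `nextPending` and `recordPage` are hypothetical and chosen for illustration; they are not part of the SDK.

```typescript
// Hypothetical state shape; field names are illustrative, not defined by the SDK.
interface ItemTypeProgress {
  completed: boolean;
  nextPage: number; // page to request when extraction resumes
}

interface ExtractorState {
  lastSyncStarted?: string; // ISO timestamp of when this extraction began
  users: ItemTypeProgress;
  todos: ItemTypeProgress;
}

const initialState: ExtractorState = {
  users: { completed: false, nextPage: 1 },
  todos: { completed: false, nextPage: 1 },
};

// Returns the first item type that still has work left, or undefined when all are done.
function nextPending(state: ExtractorState): 'users' | 'todos' | undefined {
  if (!state.users.completed) return 'users';
  if (!state.todos.completed) return 'todos';
  return undefined;
}

// Records that one page was processed and marks the type complete on its last page.
function recordPage(
  state: ExtractorState,
  type: 'users' | 'todos',
  isLastPage: boolean,
): ExtractorState {
  const updated: ItemTypeProgress = {
    completed: isLastPage,
    nextPage: state[type].nextPage + 1,
  };
  return type === 'users' ? { ...state, users: updated } : { ...state, todos: updated };
}
```

An invocation that is about to time out would persist the latest state, so the next invocation can call `nextPending` and resume from the stored `nextPage` instead of starting over.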
8
11
9
12
## Triggering event
10
13
11
14
Airdrop initiates data extraction by starting the snap-in with a message with event of type
12
15
`EXTRACTION_DATA_START` when transitioning to the data extraction phase.
13
16
14
17
During the data extraction phase, the snap-in extracts data from an external system,
15
-
prepares batches of data and uploads them in the form of artifacts to DevRev.
18
+
prepares batches of data and uploads them in the form of artifacts (files) to DevRev.
16
19
17
-
The snap-in must respond to Airdrop with a message with event of type `EXTRACTION_DATA_PROGRESS`,
18
-
together with an optional progress estimate and relevant artifacts
19
-
when it extracts some data and the maximum Airdrop snap-in runtime (12 minutes) has been reached.
20
+
The snap-in must respond to Airdrop with a message with event type `EXTRACTION_DATA_PROGRESS`,
21
+
together with an optional progress estimate and the list of relevant artifacts,
22
+
when the maximum Airdrop snap-in runtime (12 minutes) has been reached.
20
23
21
24
If the extraction has been rate-limited by the external system and back-off is required, the snap-in
22
-
must respond to Airdrop with a message with event of type `EXTRACTION_DATA_DELAY` and specifying
23
-
back-off time with `delay` attribute.
25
+
must respond to Airdrop with a message with event type `EXTRACTION_DATA_DELAY`, specifying the
26
+
back-off time with the `delay` attribute (in seconds).
24
27
25
28
In both cases, Airdrop starts the snap-in with a message with event of type `EXTRACTION_DATA_CONTINUE`.
26
-
The restarting is immediate (in case of `EXTRACTION_DATA_PROGRESS`) or delayed
27
-
(in case of `EXTRACTION_DATA_DELAY`).
29
+
For `EXTRACTION_DATA_PROGRESS`, the restart is immediate,
30
+
whereas for `EXTRACTION_DATA_DELAY` the restart is delayed by the given number of seconds.
28
31
29
-
Once the data extraction is done, the snap-in must respond to Airdrop with a message with event of
30
-
type `EXTRACTION_DATA_DONE`.
32
+
Once the data extraction is done, the snap-in must respond to Airdrop with a message with event type `EXTRACTION_DATA_DONE`.
31
33
32
34
If data extraction fails at any point, the snap-in must respond to Airdrop with a
33
-
message with event of type `EXTRACTION_DATA_ERROR`.
35
+
message with event type `EXTRACTION_DATA_ERROR`.
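
The event exchange described in this section can be summarized in a small model. The event names come from the text; the `reactionTo` helper and its types are hypothetical and only illustrate the flow. Treating `EXTRACTION_DATA_DONE` and `EXTRACTION_DATA_ERROR` as ending data extraction without a restart is an assumption of this sketch.

```typescript
// Responses a snap-in can send during data extraction (names from the text above).
type ExtractionResponse =
  | { type: 'EXTRACTION_DATA_PROGRESS' }
  | { type: 'EXTRACTION_DATA_DELAY'; delay: number } // back-off time in seconds
  | { type: 'EXTRACTION_DATA_DONE' }
  | { type: 'EXTRACTION_DATA_ERROR' };

// Hypothetical description of what Airdrop does next.
interface AirdropReaction {
  restartWith?: 'EXTRACTION_DATA_CONTINUE';
  waitSeconds: number;
}

function reactionTo(response: ExtractionResponse): AirdropReaction {
  switch (response.type) {
    case 'EXTRACTION_DATA_PROGRESS':
      // Restarted immediately.
      return { restartWith: 'EXTRACTION_DATA_CONTINUE', waitSeconds: 0 };
    case 'EXTRACTION_DATA_DELAY':
      // Restarted after the requested back-off.
      return { restartWith: 'EXTRACTION_DATA_CONTINUE', waitSeconds: response.delay };
    case 'EXTRACTION_DATA_DONE':
    case 'EXTRACTION_DATA_ERROR':
      // Assumption: data extraction is not restarted in these cases.
      return { waitSeconds: 0 };
  }
}
```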
34
36
35
37
## Implementation
36
38
37
39
Data extraction should be implemented in the [data-extraction.ts](https://github.com/devrev/airdrop-template/blob/main/code/src/functions/extraction/workers/data-extraction.ts) file.
38
40
39
-
During the data extraction phase, the snap-in uploads batches of extracted items (with tooling from `@devrev/adaas-sdk`).
41
+
### Extracting and storing the data
42
+
43
+
The SDK library includes a repository system for storing extracted items.
44
+
Each item type, such as users, tasks, or issues, has its own repository.
45
+
Each repository is declared as an entry in the `repos` array and identified by its `itemType`.
46
+
The `itemType` name should match the `record_type` specified in the provided metadata.
47
+
48
+
```typescript
49
+
const repos = [
50
+
  {
51
+
    itemType: 'todos',
52
+
  },
53
+
  {
54
+
    itemType: 'users',
55
+
  },
56
+
  {
57
+
    itemType: 'attachments',
58
+
  },
59
+
];
60
+
```
61
+
62
+
The `initializeRepos` function initializes the repositories and should be the first step when the process begins.
63
+
64
+
```typescript
65
+
processTask<ExtractorState>({
66
+
  task: async ({ adapter }) => {
67
+
    adapter.initializeRepos(repos);
68
+
    // ...
69
+
  },
70
+
  onTimeout: async ({ adapter }) => {
71
+
    // ...
72
+
  },
73
+
});
74
+
```
75
+
76
+
After initialization, items are retrieved from the external system and stored in the repository by calling the `push` function.
77
+
78
+
```typescript
79
+
await adapter.getRepo('users')?.push(items);
80
+
```
81
+
82
+
### Data normalization
40
83
41
-
Each artifact is submitted with an `item_type`, defining a separate domain object from the
42
-
external system and matching the `record_type`in the provided metadata.
84
+
Extracted data must be normalized to fit the domain metadata defined in the `external-domain-metadata.json` file.
85
+
More details on this process are provided in the [Metadata extraction](/public/snapin-development/adaas/metadata-extraction) section.
43
86
44
-
Extracted data must be normalized:
87
+
Normalization rules:
45
88
46
89
- Null values: All fields without a value should either be omitted or set to null.
47
90
For example, if an external system provides values such as "", –1 for missing values,
@@ -52,6 +95,27 @@ Extracted data must be normalized:
52
95
- Number fields must be valid JSON numbers (not strings).
53
96
- Multiselect fields must be provided as an array (not CSV).
54
97
98
+
Extracted items are automatically normalized when pushed to the `repo` if a normalization function is provided under the `normalize` key in the repo object.
99
+
100
+
```typescript
101
+
const repos = [
102
+
  {
103
+
    itemType: 'todos',
104
+
    normalize: normalizeTodo,
105
+
  },
106
+
  {
107
+
    itemType: 'users',
108
+
    normalize: normalizeUser,
109
+
  },
110
+
  {
111
+
    itemType: 'attachments',
112
+
    normalize: normalizeAttachment,
113
+
  },
114
+
];
115
+
```
116
+
117
+
For examples of normalization functions, refer to the [data-normalization.ts](https://github.com/devrev/airdrop-template/blob/main/code/src/functions/external-system/data-normalization.ts) file in the starter template.
118
+
55
119
Each line of the file contains an `id` and the optional `created_date` and `modified_date` fields
56
120
in the beginning of the record.
57
121
All other fields are contained within the `data` attribute.
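
A minimal sketch of a normalization function that follows the rules and record layout described above. The `ExternalTodo` shape, its field names, and the `NormalizedItem` interface declared here are assumptions made for illustration; the starter template's `data-normalization.ts` is the reference for real implementations.

```typescript
// Hypothetical shape of a to-do item as returned by an external system.
interface ExternalTodo {
  id: number;
  title: string;
  estimate: string; // numeric value delivered as a string, "" when missing
  labels: string; // comma-separated list, "" when empty
  created_at: string;
  updated_at: string;
}

// Assumed record layout: id and dates at the top level, everything else under data.
interface NormalizedItem {
  id: string;
  created_date?: string;
  modified_date?: string;
  data: Record<string, unknown>;
}

function normalizeTodo(todo: ExternalTodo): NormalizedItem {
  // Number fields must be JSON numbers; missing or unparsable values become null.
  const parsed = todo.estimate.trim() === '' ? null : Number(todo.estimate);
  const estimate = parsed !== null && Number.isFinite(parsed) ? parsed : null;

  return {
    id: String(todo.id),
    created_date: todo.created_at,
    modified_date: todo.updated_at,
    data: {
      // Placeholder values such as "" are replaced with null.
      title: todo.title === '' ? null : todo.title,
      estimate,
      // Multiselect fields are provided as an array, not as CSV.
      labels: todo.labels
        .split(',')
        .map((label) => label.trim())
        .filter((label) => label.length > 0),
    },
  };
}
```

A function like this could be supplied under the `normalize` key of the matching repo entry, so that items are normalized when they are pushed.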
fern/docs/pages/airdrop/loading-phases.mdx
Lines changed: 5 additions & 0 deletions
@@ -49,4 +49,9 @@ export default run;
49
49
Loading phases run as separate runtime instances, similar to extraction phases, with a maximum execution time of 12 minutes.
50
50
These phases share a `state`, defined in the `LoaderState` interface.
51
51
It is important to note that the loader state is separate from the extractor state.
52
+
52
53
Access to the `state` is available through the SDK's `adapter` object.
54
+
55
+
## Creating items in DevRev
56
+
57
+
To create an item in DevRev and sync it with the external system, start by creating an item with a **subtype** that was established during the initial sync. After selecting the subtype, fill out the necessary details for the item.
fern/docs/pages/airdrop/manifest.mdx
Lines changed: 19 additions & 11 deletions
@@ -46,18 +46,18 @@ Ensure that `extractor_function` and `loader_function` names correspond with tho
46
46
47
47
## Establish a connection to the external system
48
48
49
-
_Keyrings_ are a collection of authentication information, used by a snap-in to authenticate to the external system in API calls. This can include a key (for example, a PAT token or API key), its type, the organization ID for which a key is valid, and in some cases the organization name.
49
+
_Keyrings_ provide a secure way to store and manage credentials within your DevRev snap-in.
50
50
51
-
Keyrings provide a secure way to store and manage credentials within your DevRev snap-in.
52
-
This eliminates the need to expose sensitive information like passwords or access tokens directly
53
-
within your code or configuration files, enhancing overall security.
54
-
They also provide a valid token by abstracting OAuth token renewal from the end user.
51
+
Keyrings are a collection of authentication information, used by a snap-in to authenticate to the external system in API calls.
52
+
This can include a key (for example, a PAT token or API key), its type and the organization ID for which a key is valid.
55
53
56
-
They are called **Connections** in the DevRev app.
54
+
Keyrings eliminate the need to expose sensitive information like passwords or access tokens directly within your code or configuration files. They also provide a valid token by abstracting OAuth token renewal from the end user, so less work is needed on the developer's side.
55
+
56
+
Keyrings are called **Connections** in the DevRev app.
57
57
58
58
### Configure a keyring
59
59
60
-
Keyrings are configured in the `manifest.yaml` by configuring a `keyring_type`, like in the [example](https://github.com/devrev/airdrop-template/blob/main/manifest.yaml).
60
+
Keyrings are configured in the `manifest.yaml` by configuring a `keyring_type`, like in the [example](https://github.com/devrev/airdrop-template/blob/main/manifest.yaml):
61
61
62
62
```yaml
63
63
keyring_types:
@@ -67,8 +67,7 @@ keyring_types:
67
67
# The kind field specifies the type of keyring.
68
68
kind: <"secret"/"oauth2">
69
69
# is_subdomain field specifies whether the keyring contains a subdomain.
70
-
# Enabling this field allows the keyring to get the subdomain from the user during creation.
71
-
# This is useful when the keyring requires a subdomain as part of the configuration.
70
+
# Enabling this field allows the keyring to get the subdomain from the user during keyring creation.
72
71
# Default is false.
73
72
is_subdomain: <true/false>
74
73
# Name of the external system you are importing from.
@@ -96,7 +95,7 @@ keyring_types:
96
95
# Optional: query parameters to be included in the verification request.
97
96
query_params:
98
97
<param_name>: <param_value> # optional: query parameters to be included in the verification request.
99
-
# Fetching Organization Data: This allows you to retrieve additional information about the user's organization.
98
+
# Optional: fetching organization data if is_subdomain option is false.
100
99
organization_data:
101
100
type: "config"
102
101
# The URL to which the request is sent to fetch organization data.
@@ -106,7 +105,16 @@ keyring_types:
106
105
headers:
107
106
<header_name>: <header_value>
108
107
# The jq filter used to extract the organization data from the response.
108
+
# It should provide an object with id and name, depending on what the external system returns.
109
+
# For example "{id: .data[0].id, name: .data[0].name }".
109
110
response_jq: <jq_filter>
110
111
```
112
+
There are some options to consider:
113
+
114
+
* `kind`
115
+
The `kind` option can be either "secret" or "oauth2". The "secret" option is intended for storing various tokens, such as a PAT token. Use of OAuth2 is encouraged when possible. More information is available for [secret](/public/snapin-development/references/keyrings/secret-configuration) and [oauth2](/oauth-configuration).
116
+
117
+
* `is_subdomain`
118
+
The `is_subdomain` field relates to the API endpoints being called. When the endpoints for fetching data from an external system include a slug representing the organization (for example, `https://subdomain.freshdesk.com/api/v2/tickets`), set this key to "true". In this scenario, users creating a new connection are prompted to insert the subdomain.
111
119
112
-
You can find more information about keyrings and keyring types [here](/snapin-development/references/keyrings/keyring-intro).
120
+
If no subdomain is present in the endpoint URL, set this key to "false". In this case, provide the `organization_data` part of the configuration. Specify the endpoint in the `url` field to fetch organization data. Users creating a new connection are prompted to select the organization from a list of options, as retrieved from the `organization_data.url` value.
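
To illustrate the `is_subdomain: true` case, the sketch below builds an endpoint from a user-supplied subdomain. The URL pattern follows the Freshdesk example above; the `ticketsEndpoint` helper and its validation rule are hypothetical.

```typescript
// Hypothetical helper: builds the tickets endpoint for a user-supplied subdomain.
function ticketsEndpoint(subdomain: string): string {
  const slug = subdomain.trim().toLowerCase();
  // Assumption: a subdomain consists only of letters, digits, and hyphens.
  if (!/^[a-z0-9-]+$/.test(slug)) {
    throw new Error(`Invalid subdomain: "${subdomain}"`);
  }
  return `https://${slug}.freshdesk.com/api/v2/tickets`;
}
```

When the external system uses a single shared host instead, no such slug is needed, which corresponds to the `is_subdomain: false` case with `organization_data` configured.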