* Add more information on local Airdrop snap-in development
* Describe Extractions phases in more detail
* Describe state handling
* Describe the starter template
* Add more information on creating a keyring
* Add supported DevRev object types
---------
Co-authored-by: GasperSenk <[email protected]>
Co-authored-by: Radovan Jorgić <[email protected]>
fern/docs/pages/airdrop/attachments-extraction.mdx
6 additions & 3 deletions
@@ -1,5 +1,4 @@
1
-
For the attachment extraction phase of the import process, the extractor has to upload each
2
-
attachment to DevRev's S3 using the `S3Interact` API.
1
+
In the attachment extraction phase, the snap-in has to upload each attachment to DevRev and associate it with its parent data object.
3
2
4
3
## Triggering event
5
4
@@ -29,7 +28,10 @@ with an event of type `EXTRACTION_ATTACHMENTS_DONE`.
29
28
If attachment extraction fails, the snap-in must respond to Airdrop with a message with an event of
30
29
type `EXTRACTION_ATTACHMENTS_ERROR`.
31
30
32
-
## Response from the snap-in
31
+
## Implementation
32
+
33
+
Attachment extraction is already provided by the SDK, but if you need to customize it for your use case,
34
+
it should be implemented in the [attachments-extraction.ts](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/workers/attachments-extraction.ts) file.
33
35
34
36
After uploading an attachment or a batch of attachments, the extractor also has to prepare and
35
37
upload a file specifying the extracted and uploaded attachments.
@@ -43,6 +45,7 @@ The uploaded artifact is structured like a normal artifact containing extracted
fern/docs/pages/airdrop/data-extraction.mdx
45 additions & 27 deletions
@@ -27,27 +27,27 @@ The restarting is immediate (in case of `EXTRACTION_DATA_PROGRESS`) or delayed
27
27
(in case of `EXTRACTION_DATA_DELAY`).
28
28
29
29
Once the data extraction is done, the snap-in must respond to Airdrop with a message with an event of
30
-
type `EXTRACTION_DATA_DONE`.
30
+
type `EXTRACTION_DATA_DONE`.
31
31
32
32
If data extraction fails at any point, the snap-in must respond to Airdrop with a
33
33
message with an event of type `EXTRACTION_DATA_ERROR`.
34
34
35
-
## Response from the snap-in
35
+
## Implementation
36
36
37
-
During the data extraction phase, the snap-in uploads batches of extracted items (the recommended
38
-
batch size is 2000 items) formatted in JSONL (JSON Lines format), gzipped, and submitted as an
39
-
artifact to S3Interact (with tooling from `@devrev/adaas-sdk`).
37
+
Data extraction should be implemented in the [data-extraction.ts](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/workers/data-extraction.ts) file.
38
+
39
+
During the data extraction phase, the snap-in uploads batches of extracted items (with tooling from `@devrev/adaas-sdk`).
40
40
41
41
Each artifact is submitted with an `item_type`, defining a separate domain object from the
42
42
external system and matching the `record_type` in the provided metadata.
43
-
Item types defined when uploading extracted data must validate the declarations in the metadata file.
44
43
45
44
Extracted data must be normalized:
45
+
46
46
- Null values: All fields without a value should either be omitted or set to null.
47
-
For example, if an external system provides values such as "", –1 for missing values,
48
-
those must be set to null.
47
+
For example, if an external system uses values such as `""` or `-1` to indicate missing values,
48
+
those must be set to null.
49
49
- Timestamps: Full-precision timestamps should be formatted as RFC3339 (`1972-03-29T22:04:47+01:00`),
50
-
and dates should be just `2020-12-31`.
50
+
and dates should be just `2020-12-31`.
51
51
- References: references must be strings, not numbers or objects.
52
52
- Number fields must be valid JSON numbers (not strings).
53
53
- Multiselect fields must be provided as an array (not CSV).
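The normalization rules above can be sketched roughly as follows. This is only an illustration: `RawIssue`, its field names, and the helpers `nullIfMissing`, `toRfc3339`, and `normalizeIssue` are hypothetical and not part of `@devrev/adaas-sdk`; the sentinel values are assumed to be `""` and `-1`.

```typescript
// Hypothetical raw record shape from an external system (illustrative only).
interface RawIssue {
  id: number;
  created: string; // any timestamp that Date can parse
  modified: string;
  owner_id: number | null;
  severity: string; // "" when missing
  story_points: string; // number delivered as a string; "-1" when missing
  labels: string; // comma-separated values
}

interface NormalizedItem {
  id: string;
  created_date: string;
  modified_date: string;
  data: Record<string, unknown>;
}

// Map the external system's "missing" sentinels (assumed here) to null.
function nullIfMissing<T>(value: T): T | null {
  const sentinels: unknown[] = ["", -1, "-1"];
  return sentinels.indexOf(value) !== -1 ? null : value;
}

// Format a timestamp as RFC3339 (UTC); throws on unparsable input.
function toRfc3339(value: string): string {
  const date = new Date(value);
  if (isNaN(date.getTime())) {
    throw new Error("Invalid timestamp: " + value);
  }
  return date.toISOString();
}

function normalizeIssue(raw: RawIssue): NormalizedItem {
  const points = nullIfMissing(raw.story_points);
  return {
    id: String(raw.id),
    created_date: toRfc3339(raw.created),
    modified_date: toRfc3339(raw.modified),
    data: {
      // References must be strings, not numbers or objects.
      owner: raw.owner_id === null ? null : String(raw.owner_id),
      severity: nullIfMissing(raw.severity),
      // Number fields must be JSON numbers, not strings.
      story_points: points === null ? null : Number(points),
      // Multiselect fields must be arrays, not CSV.
      labels:
        raw.labels === "" ? [] : raw.labels.split(",").map((s) => s.trim()),
    },
  };
}
```

Keeping normalization in one pure function per item type makes it easy to unit test independently of the extraction loop.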
@@ -58,17 +58,17 @@ All other fields are contained within the `data` attribute.
58
58
59
59
```json {2-4}
60
60
{
61
-
"id": "2102e01F",
62
-
"created_date": "1972-03-29T22:04:47+01:00",
63
-
"modified_date": "1970-01-01T01:00:04+01:00",
64
-
"data": {
65
-
"actual_close_date": "1970-01-01T02:33:18+01:00",
66
-
"creator": "b8",
67
-
"owner": "A3A",
68
-
"rca": null,
69
-
"severity": "fatal",
70
-
"summary": "Lorem ipsum"
71
-
}
61
+
"id": "2102e01F",
62
+
"created_date": "1972-03-29T22:04:47+01:00",
63
+
"modified_date": "1970-01-01T01:00:04+01:00",
64
+
"data": {
65
+
"actual_close_date": "1970-01-01T02:33:18+01:00",
66
+
"creator": "b8",
67
+
"owner": "A3A",
68
+
"rca": null,
69
+
"severity": "fatal",
70
+
"summary": "Lorem ipsum"
71
+
}
72
72
}
73
73
```
74
74
@@ -86,14 +86,32 @@ You can also generate example data to show the format the data has to be normali
Since each snap-in invocation is a separate runtime instance (with a maximum execution time of 12 minutes),
92
+
it does not know what has been previously accomplished or how many records have already been extracted.
93
+
To enable information passing between invocations and runs, support has been added for saving a limited amount
94
+
of data as the snap-in `state`. Snap-in `state` persists between phases in one sync run as well as between multiple sync runs.
95
+
You can access the `state` through the SDK's `adapter` object.
90
96
91
-
Once you have implemented data extraction, you should deploy your snap-in to your test organization and run an import.
97
+
A snap-in must consult its state to obtain information on when the last successful forward sync started.
92
98
93
-
To deploy the snap-in, run `make auth` and `make deploy` in the snap-in repository.
94
-
Then, activate the snap-in by running `devrev snap_in activate`.
99
+
- The snap-in's `state` is loaded at the start of each invocation and saved at its end.
100
+
- The snap-in's `state` must be a valid JSON object.
101
+
- Each sync direction (to DevRev and from DevRev) has its own `state` object that is not shared.
102
+
- The snap-in `state` should be smaller than 1 MB, which maps to approximately 500,000 characters.
95
103
96
-
After activation, you can create an import in the DevRev UI, which will initially reach the 'waiting for user input' stage.
97
-
During this phase, you can verify your data extraction implementation is working correctly.
104
+
Effective use of the state and breaking down the problem into smaller chunks are crucial for good performance and user experience. Without knowing what has been processed, the snap-in extracts the same data multiple times, using valuable API capacity and time, and possibly duplicates the data inside DevRev or the external application.
98
105
99
-
Relevant documentation can be found in the [Snap-in development](/snapin-development/locally-testing-snap-ins) section.
106
+
The snap-in starter template contains an [example](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/index.ts) of a simple state. Adding more data to the state can help with pagination and rate limiting by saving the point at which extraction was left off.
107
+
108
+
To test the state in development, you can decrease the timeout between snap-in invocations.
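As a rough sketch of how state can carry pagination progress across invocations, consider the following. The `ExtractorState` shape, the `FetchPage` callback, and `extractIssues` are hypothetical names for illustration; the real state is read and written through the SDK's `adapter` object, whose exact API is not shown here. The call is synchronous only to keep the sketch short; a real external API call would be asynchronous.

```typescript
// Hypothetical state shape; field names are illustrative, not prescribed by the SDK.
interface ExtractorState {
  lastSuccessfulSyncStarted: string | null; // RFC3339 timestamp
  issues: { nextPage: number; completed: boolean };
}

// Rough character limit taken from the documented ~1 MB guidance.
const MAX_STATE_CHARS = 500000;

function assertStateFits(state: ExtractorState): void {
  const size = JSON.stringify(state).length;
  if (size > MAX_STATE_CHARS) {
    throw new Error("State too large: " + size + " characters");
  }
}

// Stand-in for a paginated external API call (hypothetical).
type FetchPage = (page: number) => { items: string[]; hasMore: boolean };

// Extracts at most `maxPages` pages, recording progress in `state` so that
// the next invocation resumes where this one stopped.
function extractIssues(
  state: ExtractorState,
  fetchPage: FetchPage,
  maxPages: number
): string[] {
  let extracted: string[] = [];
  let pagesThisRun = 0;
  while (!state.issues.completed && pagesThisRun < maxPages) {
    const page = fetchPage(state.issues.nextPage);
    extracted = extracted.concat(page.items);
    state.issues.nextPage += 1;
    state.issues.completed = !page.hasMore;
    pagesThisRun += 1;
  }
  assertStateFits(state);
  return extracted;
}
```

Because progress is stored per page, an invocation that runs out of time loses at most the page it was working on rather than restarting the whole extraction.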
In the external sync unit extraction phase, the extractor is expected to obtain a list of external
2
-
sync units that it can extract with the provided credentials and send it to Airdrop in its response.
3
-
4
1
An _external sync unit_ refers to a single unit in the external system that is being airdropped to DevRev.
5
2
In some systems, this is a project; in some it is a repository; in support systems it could be
6
3
called a brand or an organization.
7
4
What a unit of data is called and what it represents depends on the external system's domain model.
8
5
It usually combines contacts, users, work-like items, and comments into a unit of domain objects.
9
6
10
-
Some external systems may offer a single unit in their free plans,
11
-
while their enterprise plans may offer their clients to operate many separate units.
12
-
13
-
The external sync unit ID is the identifier of the sync unit (project, repository, or similar)
14
-
in the external system.
15
-
For GitHub, this would be the repository, for example `cli` in `github.com/devrev/cli`.
16
-
17
-
## Triggering event
7
+
In the external sync unit extraction phase, the snap-in is expected to obtain a list of external
8
+
sync units that it can extract from the external system API and send it to Airdrop in its response.
18
9
19
10
External sync unit extraction is executed only during the initial import.
20
-
It extracts external sync units available in the external system, so that the end user can choose
21
-
which external sync unit should be airdropped during the creation of an **Import** in the DevRev App.
22
11
23
-
Airdrop initiates the external sync unit extraction phase by starting the worker with a message
24
-
with an event of type `EXTRACTION_EXTERNAL_SYNC_UNITS_START`.
12
+
### Implementation
25
13
26
-
The snap-in must respond to Airdrop with a message with an event of type
27
-
`EXTRACTION_EXTERNAL_SYNC_UNITS_DONE`, which contains a list of external sync units as a payload,
28
-
or `EXTRACTION_EXTERNAL_SYNC_UNITS_ERROR` in case of an error.
14
+
This phase should be implemented in the [`external-sync-units-extraction.ts`](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/workers/external-sync-units-extraction.ts) file.
29
15
30
-
## Response from the snap-in
16
+
The snap-in should emit the list of external sync units in the given format:
17
+
18
+
```typescript
19
+
const externalSyncUnits: ExternalSyncUnit[] = [
20
+
{
21
+
id: "devrev",
22
+
name: "devrev",
23
+
description: "Demo external sync unit",
24
+
item_count: 100,
25
+
},
26
+
];
27
+
```
31
28
32
-
The snap-in provides the list of external sync units in the provided event message
33
-
`event_data.external_sync_units` containing the following fields:
34
29
- `id`: The unique identifier in the external system.
35
30
- `name`: The human-readable name in the external system.
36
31
- `description`: The short description if the external system provides it.
37
32
- `item_count`: The number of items (issues, tickets, comments or others) in the external system.
38
-
Item count should be provided if it can be obtained in a lightweight manner, such as by calling an API endpoint.
39
-
If there is no such way to get it (for example, if the items would need to be extracted to count them),
40
-
then the item count should be `-1` to avoid blocking the import with long-running queries.
33
+
Item count should be provided if it can be obtained in a lightweight manner, such as by calling an API endpoint.
34
+
If there is no such way to get it (for example, if the items would need to be extracted to count them),
35
+
then the item count should be `-1` to avoid blocking the import with long-running queries.
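The `item_count` fallback described above could look roughly like this. The `Project` shape and `toExternalSyncUnit` are hypothetical, the local `ExternalSyncUnit` interface only mirrors the four fields listed here rather than the SDK's own type, and using an empty string for a missing description is an assumption.

```typescript
// Local mirror of the four documented fields (not the SDK's own type).
interface ExternalSyncUnit {
  id: string;
  name: string;
  description: string;
  item_count: number;
}

// Hypothetical project record returned by an external system's API.
interface Project {
  key: string;
  title: string;
  summary?: string;
  issueCount?: number; // present only if the API reports it cheaply
}

function toExternalSyncUnit(project: Project): ExternalSyncUnit {
  return {
    id: project.key,
    name: project.title,
    // Assumption: fall back to an empty string when no description exists.
    description: typeof project.summary === "string" ? project.summary : "",
    // Use -1 when the count is not available without extracting the items.
    item_count:
      typeof project.issueCount === "number" ? project.issueCount : -1,
  };
}
```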
41
36
42
-
Example:
43
-
```json
44
-
[
45
-
{
46
-
"id": "a-microservice-repository",
47
-
"name": "A Microservice Repository",
48
-
"description": "Our greatest microservice repo",
49
-
"item_count": 232
50
-
}
51
-
]
37
+
The snap-in must respond to Airdrop with a message that contains a list of external sync units as a payload:
message: "Failed to extract external sync units. Lambda timeout.",
51
+
},
52
+
});
52
53
```
54
+
55
+
To test your changes, start a new airdrop in the DevRev App. If external sync unit extraction is successful, you should be prompted to choose an external sync unit from the list.