
Commit 5d48a57

patricijabrecko, GasperSenk, and radovanjorgic authored
Feat: Improve ADaaS documentation (#178)
* Add more information on local Airdrop snap-in development
* Describe Extractions phases in more detail
* Describe state handling
* Describe the starter template
* Add more information on creating a keyring
* Add supported DevRev object types

---------

Co-authored-by: GasperSenk <[email protected]>
Co-authored-by: Radovan Jorgić <[email protected]>
1 parent dddfc5f commit 5d48a57

15 files changed, +1810 -333 lines changed

fern/docs/pages/airdrop/attachments-extraction.mdx

Lines changed: 6 additions & 3 deletions
@@ -1,5 +1,4 @@
1-
For the attachment extraction phase of the import process, the extractor has to upload each
2-
attachment to DevRev's S3 using the `S3Interact` API.
1+
In the attachment extraction phase, the snap-in has to upload each attachment to DevRev and associate it with its parent data object.
32

43
## Triggering event
54

@@ -29,7 +28,10 @@ with an event of type `EXTRACTION_ATTACHMENTS_DONE`.
2928
If attachment extraction fails, the snap-in must respond to Airdrop with a message with an event of
3029
type `EXTRACTION_ATTACHMENTS_ERROR`.
3130

32-
## Response from the snap-in
31+
## Implementation
32+
33+
Attachments extraction is already provided by the SDK, but if you need to customize it for your use case,
34+
it should be implemented in the [attachments-extraction.ts](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/workers/attachments-extraction.ts) file.
3335

3436
After uploading an attachment or a batch of attachments, the extractor also has to prepare and
3537
upload a file specifying the extracted and uploaded attachments.
@@ -43,6 +45,7 @@ The uploaded artifact is structured like a normal artifact containing extracted
4345
## Examples
4446

4547
Here is an example of an SSOR attachment file:
48+
4649
```json lines
4750
{
4851
"id": {

fern/docs/pages/airdrop/data-extraction.mdx

Lines changed: 45 additions & 27 deletions
@@ -27,27 +27,27 @@ The restarting is immediate (in case of `EXTRACTION_DATA_PROGRESS`) or delayed
2727
(in case of `EXTRACTION_DATA_DELAY`).
2828

2929
Once the data extraction is done, the snap-in must respond to Airdrop with a message with an event of
30-
type `EXTRACTION_DATA_DONE`.
30+
type `EXTRACTION_DATA_DONE`.
3131

3232
If data extraction fails at any point, the snap-in must respond to Airdrop with a
3333
message with an event of type `EXTRACTION_DATA_ERROR`.
3434

35-
## Response from the snap-in
35+
## Implementation
3636

37-
During the data extraction phase, the snap-in uploads batches of extracted items (the recommended
38-
batch size is 2000 items) formatted in JSONL (JSON Lines format), gzipped, and submitted as an
39-
artifact to S3Interact (with tooling from `@devrev/adaas-sdk`).
37+
Data extraction should be implemented in the [data-extraction.ts](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/workers/data-extraction.ts) file.
38+
39+
During the data extraction phase, the snap-in uploads batches of extracted items (with tooling from `@devrev/adaas-sdk`).
4040

4141
Each artifact is submitted with an `item_type`, defining a separate domain object from the
4242
external system and matching the `record_type` in the provided metadata.
43-
Item types defined when uploading extracted data must validate the declarations in the metadata file.
4443

4544
Extracted data must be normalized:
45+
4646
- Null values: All fields without a value should either be omitted or set to null.
47-
For example, if an external system provides values such as "", –1 for missing values,
48-
those must be set to null.
47+
For example, if an external system provides values such as "" or -1 for missing values,
48+
those must be set to null.
4949
- Timestamps: Full-precision timestamps should be formatted as RFC3339 (`1972-03-29T22:04:47+01:00`),
50-
and dates should be just `2020-12-31`.
50+
and dates should be just `2020-12-31`.
5151
- References: references must be strings, not numbers or objects.
5252
- Number fields must be valid JSON numbers (not strings).
5353
- Multiselect fields must be provided as an array (not CSV).
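The normalization rules above can be sketched as a small mapping function. This is only an illustration: `RawIssue`, `NormalizedIssue`, `normalizeIssue`, and all field names are hypothetical and are not part of `@devrev/adaas-sdk`; the raw shape stands in for whatever the external system returns.

```typescript
// Hypothetical raw record from an external system (illustrative field names).
interface RawIssue {
  id: number;
  title: string;
  owner_id: number; // -1 when unassigned
  estimate: string; // numeric string, "" when missing
  labels: string; // comma-separated values
  created_at: number; // Unix epoch in milliseconds
  updated_at: number; // Unix epoch in milliseconds
  due_on: number | null; // Unix epoch in milliseconds, date-only meaning
}

// Hypothetical normalized record following the rules listed above.
interface NormalizedIssue {
  id: string;
  created_date: string;
  modified_date: string;
  data: {
    title: string;
    owner: string | null;
    estimate: number | null;
    labels: string[];
    due_date: string | null;
  };
}

// Full-precision timestamps: RFC3339 in UTC, e.g. 1970-01-01T00:00:00.000Z.
const toTimestamp = (ms: number): string => new Date(ms).toISOString();

// Dates: just YYYY-MM-DD.
const toDateOnly = (ms: number): string => new Date(ms).toISOString().slice(0, 10);

function normalizeIssue(raw: RawIssue): NormalizedIssue {
  // Number fields: parse strings into JSON numbers, map missing values to null.
  const parsed = raw.estimate.trim() === "" ? null : Number(raw.estimate);
  return {
    id: String(raw.id),
    created_date: toTimestamp(raw.created_at),
    modified_date: toTimestamp(raw.updated_at),
    data: {
      title: raw.title,
      // References: strings, with the -1 sentinel mapped to null.
      owner: raw.owner_id === -1 ? null : String(raw.owner_id),
      estimate: parsed !== null && isFinite(parsed) ? parsed : null,
      // Multiselect: an array rather than CSV.
      labels: raw.labels
        .split(",")
        .map((label) => label.trim())
        .filter((label) => label.length > 0),
      due_date: raw.due_on === null ? null : toDateOnly(raw.due_on),
    },
  };
}
```

Keeping the conversion in one function per item type makes it easier to compare the output against the metadata declarations.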
@@ -58,17 +58,17 @@ All other fields are contained within the `data` attribute.
5858

5959
```json {2-4}
6060
{
61-
"id": "2102e01F",
62-
"created_date": "1972-03-29T22:04:47+01:00",
63-
"modified_date": "1970-01-01T01:00:04+01:00",
64-
"data": {
65-
"actual_close_date": "1970-01-01T02:33:18+01:00",
66-
"creator": "b8",
67-
"owner": "A3A",
68-
"rca": null,
69-
"severity": "fatal",
70-
"summary": "Lorem ipsum"
71-
}
61+
"id": "2102e01F",
62+
"created_date": "1972-03-29T22:04:47+01:00",
63+
"modified_date": "1970-01-01T01:00:04+01:00",
64+
"data": {
65+
"actual_close_date": "1970-01-01T02:33:18+01:00",
66+
"creator": "b8",
67+
"owner": "A3A",
68+
"rca": null,
69+
"severity": "fatal",
70+
"summary": "Lorem ipsum"
71+
}
7272
}
7373
```
7474

@@ -86,14 +86,32 @@ You can also generate example data to show the format the data has to be normali
8686
echo '{}' | chef-cli fuzz-extracted -r issue -m external_domain_metadata.json > example_issues.json
8787
```
8888

89-
## Deploying and testing the snap-in
89+
## State handling
90+
91+
Since each snap-in invocation is a separate runtime instance (with a maximum execution time of 12 minutes),
92+
it does not know what has been previously accomplished or how many records have already been extracted.
93+
To enable information passing between invocations and runs, support has been added for saving a limited amount
94+
of data as the snap-in `state`. Snap-in `state` persists between phases in one sync run as well as between multiple sync runs.
95+
You can access the `state` through the SDK's `adapter` object.
9096

91-
Once you have implemented data extraction, you should deploy your snap-in to your test organization and run an import.
97+
A snap-in must consult its state to obtain information on when the last successful forward sync started.
9298

93-
To deploy the snap-in, run `make auth` and `make deploy` in the snap-in repository.
94-
Then, activate the snap-in by running `devrev snap_in activate`.
99+
- The snap-in's `state` is loaded at the start of each invocation and saved at its end.
100+
- The snap-in's `state` must be a valid JSON object.
101+
- Each sync direction (to DevRev and from DevRev) has its own `state` object that is not shared.
102+
- The snap-in `state` should be smaller than 1 MB, which maps to approximately 500,000 characters.
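The constraints above can be checked before a state object is saved. The following is a minimal sketch, not SDK behavior: `checkState` and `MAX_STATE_CHARS` are hypothetical names, and the limit simply reuses the approximate 500,000-character figure mentioned above.

```typescript
// Assumption: approximate character budget taken from the text above.
const MAX_STATE_CHARS = 500000;

interface StateCheck {
  ok: boolean;
  reason?: string;
}

// Hypothetical helper: verifies that a candidate state is a JSON object
// that serializes cleanly and stays within the character budget.
function checkState(state: unknown): StateCheck {
  if (state === null || typeof state !== "object" || Array.isArray(state)) {
    return { ok: false, reason: "state must be a JSON object" };
  }
  let serialized: string;
  try {
    // JSON.stringify throws on circular references and BigInt values.
    serialized = JSON.stringify(state);
  } catch {
    return { ok: false, reason: "state is not JSON-serializable" };
  }
  if (serialized.length > MAX_STATE_CHARS) {
    return { ok: false, reason: "state exceeds the size budget" };
  }
  return { ok: true };
}
```

Running a check like this in development can surface an oversized or non-serializable state before it causes a failed sync.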
95103

96-
After activation, you can create an import in the DevRev UI, which will initially reach the 'waiting for user input' stage.
97-
During this phase, you can verify your data extraction implementation is working correctly.
104+
Effective use of the state and breaking down the problem into smaller chunks are crucial for good performance and user experience. Without knowing what has been processed, the snap-in extracts the same data multiple times, using valuable API capacity and time, and possibly duplicates the data inside DevRev or the external application.
98105

99-
Relevant documentation can be found in the [Snap-in development](/snapin-development/locally-testing-snap-ins) section.
106+
The snap-in starter template contains an [example](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/index.ts) of a simple state. Adding more data to the state can help with pagination and rate limiting by saving the point at which extraction was left off.
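As an illustration of saving the point at which extraction left off, the sketch below keeps a page cursor in the state and processes a bounded number of pages per invocation. Everything here is hypothetical: `IssuesState`, `FetchPage`, and `extractIssues` are not the template's names, and `fetchPage` is a synchronous stand-in for a real (asynchronous) external API call.

```typescript
// Hypothetical state shape; the starter template defines its own.
interface IssuesState {
  lastSuccessfulSyncStarted?: string;
  issues: { nextPage: number; completed: boolean };
}

// Stand-in for an external API client that returns one page of records.
type FetchPage = (page: number) => { items: string[]; hasMore: boolean };

// Extracts at most `maxPages` pages, updating the state after each page so
// that a later invocation can resume instead of starting over.
function extractIssues(
  state: IssuesState,
  fetchPage: FetchPage,
  maxPages: number
): string[] {
  let extracted: string[] = [];
  let pagesDone = 0;
  while (!state.issues.completed && pagesDone < maxPages) {
    const page = fetchPage(state.issues.nextPage);
    extracted = extracted.concat(page.items);
    state.issues.nextPage += 1;
    state.issues.completed = !page.hasMore;
    pagesDone += 1;
  }
  return extracted;
}
```

Because progress is recorded per page, an invocation that runs out of time only repeats the page that was in flight rather than the whole item type.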
107+
108+
To test the state in development, you can decrease the timeout between snap-in invocations.
109+
110+
```typescript
111+
await spawn<DummyExtractorState>({
112+
...,
113+
option: {
114+
timeout: 1 * 60 * 1000, // 1 minute in milliseconds
115+
}
116+
});
117+
```
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
1+
Once you're ready to test your snap-in in a production environment, you can deploy the snap-in to your organization.
2+
3+
Follow these steps:
4+
5+
1. Copy `.env.example` to a new file named `.env` and fill in the required variables.
6+
2. Deploy a draft version of your snap-in to your organization by using `make deploy`.
7+
3. Install the snap-in in your DevRev organization by going to **Settings** > **Snap-ins** > **Install snap-in**.
8+
4. Set up the connection under **Settings** > **Airdrops** > **Connections**.
9+
5. Create an import at **Settings** > **Airdrops** > **Airdrop**.
10+
11+
This step is also a prerequisite for publishing the snap-in on the DevRev marketplace.
12+
13+
### Observability
14+
15+
To observe logs from your snap-in in your development environment:
16+
17+
```bash
18+
devrev snap_in_package logs | jq
19+
```
20+
21+
To open logs in your favorite editor:
22+
23+
```bash
24+
devrev snap_in_package logs | code -
25+
```
26+
27+
For more information, refer to [Debugging](/snapin-development/debugging).
Lines changed: 37 additions & 34 deletions
@@ -1,52 +1,55 @@
1-
In the external sync unit extraction phase, the extractor is expected to obtain a list of external
2-
sync units that it can extract with the provided credentials and send it to Airdrop in its response.
3-
41
An _external sync unit_ refers to a single unit in the external system that is being airdropped to DevRev.
52
In some systems, this is a project; in some it is a repository; in support systems it could be
63
called a brand or an organization.
74
What a unit of data is called and what it represents depends on the external system's domain model.
85
It usually combines contacts, users, work-like items, and comments into a unit of domain objects.
96

10-
Some external systems may offer a single unit in their free plans,
11-
while their enterprise plans may offer their clients to operate many separate units.
12-
13-
The external sync unit ID is the identifier of the sync unit (project, repository, or similar)
14-
in the external system.
15-
For GitHub, this would be the repository, for example `cli` in `github.com/devrev/cli`.
16-
17-
## Triggering event
7+
In the external sync unit extraction phase, the snap-in is expected to obtain a list of external
8+
sync units that it can extract from the external system API and send it to Airdrop in its response.
189

1910
External sync unit extraction is executed only during the initial import.
20-
It extracts external sync units available in the external system, so that the end user can choose
21-
which external sync unit should be airdropped during the creation of an **Import** in the DevRev App.
2211

23-
Airdrop initiates the external sync unit extraction phase by starting the worker with a message
24-
with an event of type `EXTRACTION_EXTERNAL_SYNC_UNITS_START`.
12+
### Implementation
2513

26-
The snap-in must respond to Airdrop with a message with an event of type
27-
`EXTRACTION_EXTERNAL_SYNC_UNITS_DONE`, which contains a list of external sync units as a payload,
28-
or `EXTRACTION_EXTERNAL_SYNC_UNITS_ERROR` in case of an error.
14+
This phase should be implemented in the [`external-sync-units-extraction.ts`](https://github.com/devrev/adaas-template/blob/main/code/src/functions/extraction/workers/external-sync-units-extraction.ts) file.
2915

30-
## Response from the snap-in
16+
The snap-in should emit the list of external sync units in the given format:
17+
18+
```typescript
19+
const externalSyncUnits: ExternalSyncUnit[] = [
20+
{
21+
id: "devrev",
22+
name: "devrev",
23+
description: "Demo external sync unit",
24+
item_count: 100,
25+
},
26+
];
27+
```
3128

32-
The snap-in provides the list of external sync units in the provided event message
33-
`event_data.external_sync_units` containing the following fields:
3429
- `id`: The unique identifier in the external system.
3530
- `name`: The human-readable name in the external system.
3631
- `description`: The short description if the external system provides it.
3732
- `item_count`: The number of items (issues, tickets, comments or others) in the external system.
38-
Item count should be provided if it can be obtained in a lightweight manner, such as by calling an API endpoint.
39-
If there is no such way to get it (for example, if the items would need to be extracted to count them),
40-
then the item count should be `-1` to avoid blocking the import with long-running queries.
33+
Item count should be provided if it can be obtained in a lightweight manner, such as by calling an API endpoint.
34+
If there is no such way to get it (for example, if the items would need to be extracted to count them),
35+
then the item count should be `-1` to avoid blocking the import with long-running queries.
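The field rules above, including the `-1` fallback for `item_count`, can be sketched as a mapping from the external system's units. The `ExternalSyncUnit` interface below is a local mirror of the listed fields for illustration only (the real type comes from the SDK), and `ExternalProject` with its field names is a hypothetical external API shape.

```typescript
// Local mirror of the fields listed above; an assumption, not the SDK type.
interface ExternalSyncUnit {
  id: string;
  name: string;
  description: string;
  item_count: number;
}

// Hypothetical project record returned by an external system.
interface ExternalProject {
  key: string;
  title: string;
  summary?: string;
  issueCount?: number; // present only when the API reports it cheaply
}

function toExternalSyncUnit(project: ExternalProject): ExternalSyncUnit {
  return {
    id: project.key,
    name: project.title,
    description: project.summary === undefined ? "" : project.summary,
    // Use -1 when the count is not available without extracting the items.
    item_count: project.issueCount === undefined ? -1 : project.issueCount,
  };
}

const units: ExternalSyncUnit[] = [
  { key: "cli", title: "CLI", summary: "Command-line tools", issueCount: 232 },
  { key: "docs", title: "Docs" },
].map(toExternalSyncUnit);
```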
4136

42-
Example:
43-
```json
44-
[
45-
{
46-
"id": "a-microservice-repository",
47-
"name": "A Microservice Repository",
48-
"description": "Our greatest microservice repo",
49-
"item_count": 232
50-
}
51-
]
37+
The snap-in must respond to Airdrop with a message that contains a list of external sync units as a payload:
38+
39+
```typescript
40+
await adapter.emit(ExtractorEventType.ExtractionExternalSyncUnitsDone, {
41+
external_sync_units: externalSyncUnits,
42+
});
43+
```
44+
45+
or an error:
46+
47+
```typescript
48+
await adapter.emit(ExtractorEventType.ExtractionExternalSyncUnitsError, {
49+
error: {
50+
message: "Failed to extract external sync units. Lambda timeout.",
51+
},
52+
});
5253
```
54+
55+
To test your changes, start a new airdrop in the DevRev App. If external sync units extraction is successful, you should be prompted to choose an external sync unit from the list.
