
Commit 6a061e8

ISS-162937: Update Airdrop docs
1 parent 7ec7798 commit 6a061e8

10 files changed

+173
-73
lines changed

fern/docs/pages/airdrop/data-extraction.mdx

Lines changed: 86 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -1,47 +1,90 @@
11
In the data extraction phase, the extractor is expected to call the external system's APIs
2-
to retrieve all the items that were updated since the start of the last extraction.
3-
If there was no previous extraction (the current run is an initial import),
4-
then all the items should be extracted.
2+
to retrieve all the items that should be synced with DevRev.
53

6-
The extractor must store at what time it started each extraction in its state,
7-
so that it can extract only items created or updated since this date in the next sync run.
4+
If the current run is an initial sync, this means all the items should be extracted.
5+
Otherwise the extractor should retrieve all the items that were changed since the start of the last extraction.
6+
7+
Each snap-in invocation is a separate runtime instance with a maximum execution time of 12 minutes.
8+
If a large amount of data needs to be extracted, it might not all be extracted within this time frame.
9+
To handle such situations, the snap-in uses a state object.
10+
This state object is shared across all invocations and keeps track of where the previous snap-in invocations ended in the extraction process.
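
The resumable extraction described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's implementation: the `ResumableState` shape, the `fetchPage` stand-in for the external API, and the `hasTimeLeft` check are all hypothetical names chosen for this example.

```typescript
// Hypothetical state shape: not part of the SDK, chosen for illustration only.
interface ResumableState {
  todos: { nextPage: number; completed: boolean };
}

// Stand-in for a paginated external API; returns an empty page when exhausted.
type FetchPage = (page: number) => string[];

// Extracts pages until the data runs out or the time budget is spent.
// The state is mutated in place so that the next invocation can resume
// from the page where this one stopped.
function extractTodos(
  state: ResumableState,
  fetchPage: FetchPage,
  hasTimeLeft: () => boolean
): string[] {
  const extracted: string[] = [];
  while (!state.todos.completed && hasTimeLeft()) {
    const items = fetchPage(state.todos.nextPage);
    if (items.length === 0) {
      // No more data: mark this item type as fully extracted.
      state.todos.completed = true;
      break;
    }
    extracted.push(...items);
    state.todos.nextPage += 1;
  }
  return extracted;
}
```

Each invocation would receive the state left behind by the previous one, so a run that stops after page 0 continues with page 1 on the next invocation.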
811

912
## Triggering event
1013

1114
Airdrop initiates data extraction by starting the snap-in with a message with event of type
1215
`EXTRACTION_DATA_START` when transitioning to the data extraction phase.
1316

1417
During the data extraction phase, the snap-in extracts data from an external system,
15-
prepares batches of data and uploads them in the form of artifacts to DevRev.
18+
prepares batches of data and uploads them in the form of artifacts (files) to DevRev.
1619

17-
The snap-in must respond to Airdrop with a message with event of type `EXTRACTION_DATA_PROGRESS`,
18-
together with an optional progress estimate and relevant artifacts
19-
when it extracts some data and the maximum Airdrop snap-in runtime (12 minutes) has been reached.
20+
The snap-in must respond to Airdrop with a message with event type `EXTRACTION_DATA_PROGRESS`,
21+
together with an optional progress estimate and the list of relevant artifacts
22+
when the maximum Airdrop snap-in runtime (12 minutes) has been reached.
2023

2124
If the extraction has been rate-limited by the external system and back-off is required, the snap-in
22-
must respond to Airdrop with a message with event of type `EXTRACTION_DATA_DELAY` and specifying
23-
back-off time with `delay` attribute.
25+
must respond to Airdrop with a message with event type `EXTRACTION_DATA_DELAY`, specifying the
26+
back-off time in the `delay` attribute (in seconds).
2427

2528
In both cases, Airdrop starts the snap-in with a message with event of type `EXTRACTION_DATA_CONTINUE`.
26-
The restarting is immediate (in case of `EXTRACTION_DATA_PROGRESS`) or delayed
27-
(in case of `EXTRACTION_DATA_DELAY`).
29+
In case of `EXTRACTION_DATA_PROGRESS`, the restart is immediate,
30+
whereas in case of `EXTRACTION_DATA_DELAY` the restart is delayed by the given number of seconds.
2831

29-
Once the data extraction is done, the snap-in must respond to Airdrop with a message with event of
30-
type `EXTRACTION_DATA_DONE`.
32+
Once the data extraction is done, the snap-in must respond to Airdrop with a message with event type `EXTRACTION_DATA_DONE`.
3133

3234
If data extraction fails at any point, the snap-in must respond to Airdrop with a
33-
message with event of type `EXTRACTION_DATA_ERROR`.
35+
message with event type `EXTRACTION_DATA_ERROR`.
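
The response rules in this section can be summarized as a small decision function. The sketch below is only an illustration: the event type names and the `delay` attribute come from the text above, while the `Outcome` type, the `nextEvent` helper, and the `progress` and `error` field names are assumptions and not the SDK's API.

```typescript
// Hypothetical description of how an extraction invocation ended.
type Outcome =
  | { kind: "done" }
  | { kind: "timeout"; progress?: number }
  | { kind: "rateLimited"; retryAfterSeconds: number }
  | { kind: "error"; message: string };

// Hypothetical message shape; only the event type names and `delay`
// are taken from the documentation text.
interface ExtractorMessage {
  event_type: string;
  event_data?: Record<string, unknown>;
}

// Maps the outcome of an invocation to the event the snap-in responds with.
function nextEvent(outcome: Outcome): ExtractorMessage {
  switch (outcome.kind) {
    case "done":
      return { event_type: "EXTRACTION_DATA_DONE" };
    case "timeout":
      // Runtime limit reached: Airdrop restarts the snap-in immediately.
      return {
        event_type: "EXTRACTION_DATA_PROGRESS",
        event_data:
          outcome.progress === undefined ? {} : { progress: outcome.progress },
      };
    case "rateLimited":
      // Back-off required: Airdrop restarts the snap-in after `delay` seconds.
      return {
        event_type: "EXTRACTION_DATA_DELAY",
        event_data: { delay: outcome.retryAfterSeconds },
      };
    case "error":
      return {
        event_type: "EXTRACTION_DATA_ERROR",
        event_data: { error: { message: outcome.message } },
      };
  }
}
```

After either `EXTRACTION_DATA_PROGRESS` or `EXTRACTION_DATA_DELAY`, the snap-in should expect to be started again with `EXTRACTION_DATA_CONTINUE`.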
3436

3537
## Implementation
3638

3739
Data extraction should be implemented in the [data-extraction.ts](https://github.com/devrev/airdrop-template/blob/main/code/src/functions/extraction/workers/data-extraction.ts) file.
3840

39-
During the data extraction phase, the snap-in uploads batches of extracted items (with tooling from `@devrev/adaas-sdk`).
41+
### Extracting and storing the data
42+
43+
The SDK library includes a repository system for storing extracted items.
44+
Each item type, such as users, tasks, or issues, has its own repository.
45+
These are defined in the `repos` array as `itemType`.
46+
The `itemType` name should match the `record_type` specified in the provided metadata.
47+
48+
```typescript
49+
const repos = [
50+
{
51+
itemType: 'todos',
52+
},
53+
{
54+
itemType: 'users',
55+
},
56+
{
57+
itemType: 'attachments',
58+
},
59+
];
60+
```
61+
62+
The `initializeRepos` function initializes the repositories and should be the first step when the process begins.
63+
64+
```typescript
65+
processTask<ExtractorState>({
66+
task: async ({ adapter }) => {
67+
adapter.initializeRepos(repos);
68+
// ...
69+
},
70+
onTimeout: async ({ adapter }) => {
71+
// ...
72+
},
73+
});
74+
```
75+
76+
After initialization, items are retrieved from the external system and stored in the repository by calling the `push` function.
77+
78+
```typescript
79+
await adapter.getRepo('users')?.push(items);
80+
```
81+
82+
### Data normalization
4083

41-
Each artifact is submitted with an `item_type`, defining a separate domain object from the
42-
external system and matching the `record_type` in the provided metadata.
84+
Extracted data must be normalized to fit the domain metadata defined in the `external-domain-metadata.json` file.
85+
More details on this process are provided in the [Metadata extraction](/public/snapin-development/adaas/metadata-extraction) section.
4386

44-
Extracted data must be normalized:
87+
Normalization rules:
4588

4689
- Null values: All fields without a value should either be omitted or set to null.
4790
For example, if an external system provides values such as "", –1 for missing values,
@@ -52,6 +95,27 @@ Extracted data must be normalized:
5295
- Number fields must be valid JSON numbers (not strings).
5396
- Multiselect fields must be provided as an array (not CSV).
5497

98+
Extracted items are automatically normalized when pushed to the `repo` if a normalization function is provided under the `normalize` key in the repo object.
99+
100+
```typescript
101+
const repos = [
102+
{
103+
itemType: 'todos',
104+
normalize: normalizeTodo,
105+
},
106+
{
107+
itemType: 'users',
108+
normalize: normalizeUser,
109+
},
110+
{
111+
itemType: 'attachments',
112+
normalize: normalizeAttachment,
113+
},
114+
];
115+
```
116+
117+
For examples of normalization functions, refer to the [data-normalization.ts](https://github.com/devrev/airdrop-template/blob/main/code/src/functions/external-system/data-normalization.ts) file in the starter template.
118+
55119
Each line of the file contains an `id` and the optional `created_date` and `modified_date` fields
56120
in the beginning of the record.
57121
All other fields are contained within the `data` attribute.
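
A normalization function following the rules and record layout described above might look like the sketch below. The `RawTodo` input shape and its field names are hypothetical, as is the choice to represent `id` as a string; timestamps are passed through unchanged on the assumption that the external system already returns them in the required format.

```typescript
// Hypothetical raw item as returned by an external system.
interface RawTodo {
  id: number;
  title: string;
  estimate: string; // number delivered as a string, "" when missing
  labels: string; // comma-separated values, "" when missing
  created_at: string;
  updated_at: string;
}

// Record layout from the text: `id` and the optional dates at the top level,
// all other fields inside `data`.
interface NormalizedItem {
  id: string;
  created_date?: string;
  modified_date?: string;
  data: Record<string, unknown>;
}

function normalizeTodo(raw: RawTodo): NormalizedItem {
  // Number fields must be JSON numbers; missing values become null.
  const parsed = raw.estimate.trim() === "" ? NaN : Number(raw.estimate);
  // Multiselect fields must be arrays, not CSV strings.
  const labels =
    raw.labels.trim() === ""
      ? null
      : raw.labels.split(",").map((label) => label.trim());
  return {
    id: String(raw.id),
    created_date: raw.created_at,
    modified_date: raw.updated_at,
    data: {
      title: raw.title,
      estimate: Number.isFinite(parsed) ? parsed : null,
      labels,
    },
  };
}
```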
@@ -88,10 +152,8 @@ echo '{}' | chef-cli fuzz-extracted -r issue -m external_domain_metadata.json >
88152

89153
## State handling
90154

91-
Since each snap-in invocation is a separate runtime instance (with a maximum execution time of 12 minutes),
92-
it does not know what has been previously accomplished or how many records have already been extracted.
93-
To enable information passing between invocations and runs, support has been added for saving a limited amount
94-
of data as the snap-in `state`. Snap-in `state` persists between phases in one sync run as well as between multiple sync runs.
155+
To enable information passing between invocations and runs, a limited amount of data can be saved as the snap-in `state`.
156+
Snap-in `state` persists between phases in one sync run as well as between multiple sync runs.
95157
You can access the `state` through SDK's `adapter` object.
96158

97159
A snap-in must consult its state to obtain information on when the last successful forward sync started.
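
As a rough illustration of consulting the state, the sketch below selects items for an incremental run. The `SyncState` shape and the `lastSuccessfulSyncStarted` field name are assumptions made for this example. In practice the timestamp would more likely be passed to the external system's API as a filter parameter than applied to items already fetched.

```typescript
// Hypothetical state shape for this example.
interface SyncState {
  // ISO 8601 start time of the last successful forward sync;
  // undefined when no sync has completed yet (initial sync).
  lastSuccessfulSyncStarted?: string;
}

interface ExternalItem {
  id: string;
  modified_date: string;
}

// Returns all items for an initial sync, otherwise only the items
// modified at or after the start of the last successful sync.
function itemsToExtract(
  items: ExternalItem[],
  state: SyncState
): ExternalItem[] {
  const since = state.lastSuccessfulSyncStarted;
  if (since === undefined) {
    return items;
  }
  const sinceMs = Date.parse(since);
  return items.filter((item) => Date.parse(item.modified_date) >= sinceMs);
}
```

Comparing against the start of the previous sync, rather than its end, also picks up items that changed while that sync was still running.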

fern/docs/pages/airdrop/extraction-phases.mdx

Lines changed: 14 additions & 9 deletions
@@ -1,24 +1,29 @@
11
Each snap-in must handle all the phases of Airdrop extraction. In a snap-in, you typically define a run
22
function that iterates over events and invokes workers per extraction phase.
33

4-
The SDK library exports `processTask` to structure the work within each phase, and `onTimeout` function
5-
to handle timeouts.
6-
74
The Airdrop snap-in extraction lifecycle consists of four phases:
8-
* External sync units extraction
5+
* External sync units extraction (only for initial sync)
96
* Metadata extraction
107
* Data extraction
118
* Attachments extraction
129

1310
Each phase is defined in a separate file and is responsible for fetching the respective data.
1411

15-
The SDK library provides a repository management system to handle artifacts in batches.
16-
The `initializeRepos` function initializes the repositories, and the `push` function uploads the
17-
artifacts to the repositories. The `postState` function is used to post the state of the extraction task.
12+
<Note>
13+
Snap-in development is an iterative process.
14+
It typically begins with retrieving some data from the external system.
15+
The next step involves crafting an initial version of the external domain metadata and validating it through chef-cli.
16+
This metadata is used to prepare the initial domain mapping and to check for any possible issues.
17+
API calls to the external system are then corrected to fetch the missing data.
18+
Start by working with one item type, and once it maps well to DevRev objects and imports as desired, proceed with other item types.
19+
</Note>
20+
21+
The SDK library exports `processTask` to structure the work within each phase, and `onTimeout` function
22+
to handle timeouts.
1823

1924
State management is crucial for snap-ins to maintain the state of the extraction task.
20-
The `postState` function is used to post the state of the extraction task.
21-
The state is stored in the adapter and can be retrieved using the `adapter.state` property.
25+
State is saved to the Airdrop backend by calling the `postState` function.
26+
During the extraction the state is stored in the adapter and can be retrieved using the `adapter.state` property.
2227

2328
```typescript
2429
import { AirdropEvent, EventType, spawn } from "@devrev/ts-adaas";

fern/docs/pages/airdrop/getting-started.mdx

Lines changed: 24 additions & 14 deletions
@@ -30,15 +30,18 @@ consider gathering the following information:
3030
- **Error handling**: Learn about error response formats and codes. Knowing this helps in
3131
handling errors and exceptions in your integration.
3232

33-
## Terminology
33+
## Basic concepts
3434

3535
### Sync unit
3636

37-
A _sync unit_ is one self encompassing unit of data that is synced to an external system. Examples:
37+
A _sync unit_ is one self-contained unit of data that is synced to an external system. For example:
3838
- A project in Jira.
3939
- An account in Salesforce.
4040
- An organization in Zendesk.
4141

42+
In Jira, users often have multiple projects. Each project acts as an individual sync unit.
43+
In contrast, Zendesk operates with a single large pool of tickets and agents. Here, the entire Zendesk instance can be synced in a single airdrop.
44+
4245
### Sync run
4346

4447
Airdrop extractions are done in _sync runs_.
@@ -61,13 +64,13 @@ An **extractor** function in the snap-in is responsible for extracting data from
6164
A _reverse sync_ is a sync run from DevRev to an external system.
6265
It uses a **loader** function, to create or update data in the external system.
6366

64-
### Initial import
67+
### Initial sync
6568

66-
An _initial import_ is the first import of data from the external system to DevRev.
69+
The first sync is called the _initial sync_.
6770
It is triggered manually by the end user in DevRev's **Airdrops** UI.
6871

69-
In initial import all data needs to be extracted to create a baseline (while in incremental runs only
70-
updated objects need to be extracted).
72+
During the initial sync, all data from the external sync unit is extracted and loaded into DevRev.
73+
This process typically involves a large import and may take some time.
7174

7275
An _initial sync_ consists of the following phases:
7376

@@ -79,18 +82,25 @@ An _initial import_ consists of the following phases:
7982
### 1-way (incremental) sync
8083

8184
A _1-way sync_ (or _incremental sync_) refers to any extraction after the initial sync run has been successfully completed.
82-
An extractor extracts data that was created or updated in the external system after the start
83-
of the latest successful forward sync, including any changes that occurred during the forward sync,
84-
but were not picked up by it.
85+
This can be a forward sync or a reverse sync.
8586

86-
A snap-in must consult its state to get information on when the last successful forward sync started.
87-
Airdrop snap-ins must maintain their own states that persists between phases in a sync run,
88-
as well as between sync runs.
87+
#### 1-way forward sync
88+
89+
An extractor extracts data that was created or updated in the external system after the start
90+
of the latest successful forward sync.
91+
This includes any changes that happened during the previous sync, but were not picked up by it.
8992

90-
A 1-way sync consists of the following phases:
93+
A 1-way forward sync consists of the following phases:
9194

9295
1. Metadata extraction
9396
2. Data extraction
9497
3. Attachments extraction
9598

96-
A 1-way sync extracts only the domain objects updated or created since the previous successful sync run.
99+
#### 1-way reverse sync
100+
101+
The loader will check for any changes in DevRev after the latest successful reverse sync and update the data in the external system.
102+
103+
A 1-way reverse sync consists of the following phases:
104+
105+
1. Data loading
106+
2. Attachments loading

fern/docs/pages/airdrop/initial-domain-mapping.mdx

Lines changed: 5 additions & 2 deletions
@@ -1,8 +1,11 @@
11
Initial domain mapping is a process that establishes relationships between
2-
external data schemas and DevRev's native record types. This mapping is configured
3-
once and then becomes available to all users of your integration,
2+
external data schemas and DevRev's native record types.
3+
This mapping is configured once and then becomes available to all users of your snap-in,
44
allowing them to import data while maintaining semantic meaning from their source systems.
55

6+
The initial domain mapping is installed with your snap-in.
7+
The extractor automatically triggers a function to upload these mappings to the Airdrop system.
8+
69
## Chef-cli initial domain mapping setup
710

811
### Prerequisites

fern/docs/pages/airdrop/loading-phases.mdx

Lines changed: 5 additions & 0 deletions
@@ -49,4 +49,9 @@ export default run;
4949
Loading phases run as separate runtime instances, similar to extraction phases, with a maximum execution time of 12 minutes.
5050
These phases share a `state`, defined in the `LoaderState` interface.
5151
It is important to note that the loader state is separate from the extractor state.
52+
5253
Access to the `state` is available through the SDK's `adapter` object.
54+
55+
## Creating items in DevRev
56+
57+
To create an item in DevRev and sync it with the external system, start by creating an item with a **subtype** that was established during the initial sync. After selecting the subtype, fill out the necessary details for the item.

fern/docs/pages/airdrop/local-development.mdx

Lines changed: 3 additions & 1 deletion
@@ -9,7 +9,7 @@ For easier development you can run your Airdrop snap-in locally and receive logs
99

1010
## Run the template
1111

12-
DevRev provides a starter template, which you can run and test out right away.
12+
DevRev offers a starter Airdrop snap-in template that is ready for immediate use and testing.
1313

1414
1. Create a new repository:
1515
- Create a new repository from this template by clicking the "Use this template" button in the upper right corner and then "Create a new repository".
@@ -46,6 +46,8 @@ DevRev provides a starter template, which you can run and test out right away.
4646
devrev snap_in activate
4747
```
4848

49+
## Initial sync
50+
4951
Now that you have a running snap-in, you can start an airdrop.
5052
Go to DevRev app and click **Airdrops** -> **Start Airdrop** -> **Your snap-in**.
5153

fern/docs/pages/airdrop/manifest.mdx

Lines changed: 19 additions & 11 deletions
@@ -46,18 +46,18 @@ Ensure that `extractor_function` and `loader_function` names correspond with tho
4646

4747
## Establish a connection to the external system
4848

49-
_Keyrings_ are a collection of authentication information, used by a snap-in to authenticate to the external system in API calls. This can include a key (for example, a PAT token or API key), its type, the organization ID for which a key is valid, and in some cases the organization name.
49+
_Keyrings_ provide a secure way to store and manage credentials within your DevRev snap-in.
5050

51-
Keyrings provide a secure way to store and manage credentials within your DevRev snap-in.
52-
This eliminates the need to expose sensitive information like passwords or access tokens directly
53-
within your code or configuration files, enhancing overall security.
54-
They also provide a valid token by abstracting OAuth token renewal from the end user.
51+
Keyrings are a collection of authentication information, used by a snap-in to authenticate to the external system in API calls.
52+
This can include a key (for example, a PAT token or API key), its type and the organization ID for which a key is valid.
5553

56-
They are called **Connections** in the DevRev app.
54+
This eliminates the need to expose sensitive information like passwords or access tokens directly within your code or configuration files. They also provide a valid token by abstracting OAuth token renewal from the end user, so less work is needed on the developer's side.
55+
56+
Keyrings are called **Connections** in the DevRev app.
5757

5858
### Configure a keyring
5959

60-
Keyrings are configured in the `manifest.yaml` by configuring a `keyring_type`, like in the [example](https://github.com/devrev/airdrop-template/blob/main/manifest.yaml).
60+
Keyrings are configured in the `manifest.yaml` by configuring a `keyring_type`, like in the [example](https://github.com/devrev/airdrop-template/blob/main/manifest.yaml):
6161

6262
```yaml
6363
keyring_types:
@@ -67,8 +67,7 @@ keyring_types:
6767
# The kind field specifies the type of keyring.
6868
kind: <"secret"/"oauth2">
6969
# is_subdomain field specifies whether the keyring contains a subdomain.
70-
# Enabling this field allows the keyring to get the subdomain from the user during creation.
71-
# This is useful when the keyring requires a subdomain as part of the configuration.
70+
# Enabling this field allows the keyring to get the subdomain from the user during keyring creation.
7271
# Default is false.
7372
is_subdomain: <true/false>
7473
# Name of the external system you are importing from.
@@ -96,7 +95,7 @@ keyring_types:
9695
# Optional: query parameters to be included in the verification request.
9796
query_params:
9897
<param_name>: <param_value> # optional: query parameters to be included in the verification request.
99-
# Fetching Organization Data: This allows you to retrieve additional information about the user's organization.
98+
# Optional: fetching organization data if is_subdomain option is false.
10099
organization_data:
101100
type: "config"
102101
# The URL to which the request is sent to fetch organization data.
@@ -106,7 +105,16 @@ keyring_types:
106105
headers:
107106
<header_name>: <header_value>
108107
# The jq filter used to extract the organization data from the response.
108+
# It should provide an object with id and name, depending on what the external system returns.
109+
# For example "{id: .data[0].id, name: .data[0].name }".
109110
response_jq: <jq_filter>
110111
```
112+
There are some options to consider:
113+
114+
* `kind`
115+
The `kind` option can be either "secret" or "oauth2". The "secret" option is intended for storing various tokens, such as a PAT token. Use of OAuth2 is encouraged when possible. More information is available for [secret](/public/snapin-development/references/keyrings/secret-configuration) and [oauth2](/oauth-configuration).
116+
117+
* `is_subdomain`
118+
The `is_subdomain` field relates to the API endpoints being called. When the endpoints for fetching data from an external system include a slug representing the organization (for example, `https://subdomain.freshdesk.com/api/v2/tickets`), set this key to "true". In this scenario, users creating a new connection are prompted to enter the subdomain.
111119

112-
You can find more information about keyrings and keyring types [here](/snapin-development/references/keyrings/keyring-intro).
120+
If no subdomain is present in the endpoint URL, set this key to "false". In this case, provide the `organization_data` part of the configuration. Specify the endpoint in the `url` field to fetch organization data. Users creating a new connection are prompted to select the organization from a list of options, as retrieved from the `organization_data.url` value.

fern/docs/pages/airdrop/metadata-extraction.mdx

Lines changed: 1 addition & 1 deletion
@@ -243,7 +243,7 @@ be changed by the end user at any time, such as mandatory fields or custom field
243243
A good practice is to retrieve the set of possible values for all enum fields from the external
244244
system's APIs in each sync run. You can mark specific enum values as deprecated using the `is_deprecated` property.
245245

246-
`ID` (primary key) of the record, `created_date`, and `modified_date` must not be declared.
246+
**`ID` (primary key) of the record, `created_date`, and `modified_date` must not be declared.**
247247

248248
Example:
249249
