Feat: Add Airdrop loading documentation #201

Merged
merged 4 commits into from
Apr 8, 2025
13 changes: 7 additions & 6 deletions fern/docs/pages/airdrop/extraction-phases.mdx
@@ -4,9 +4,13 @@ function that iterates over events and invokes workers per extraction phase.
The SDK library exports the `processTask` function to structure the work within each phase, and the `onTimeout` function
to handle timeouts.

The Airdrop snap-in extraction lifecycle consists of four phases: External Sync Units Extraction,
Metadata Extraction, Data Extraction and Attachments Extraction. Each phase is defined in a
separate file and is responsible for fetching the respective data.
The Airdrop snap-in extraction lifecycle consists of four phases:
* External sync units extraction
* Metadata extraction
* Data extraction
* Attachments extraction

Each phase is defined in a separate file and is responsible for fetching the respective data.

The SDK library provides a repository management system to handle artifacts in batches.
The `initializeRepos` function initializes the repositories, and the `push` function uploads the
@@ -59,9 +63,6 @@ const run = async (events: AirdropEvent[]) => {
event,
initialState,
workerPath: file,

// TODO: If needed you can pass additional options to the spawn function.
// For example timeout of the lambda, batch size, etc.
// options: {},
});
}
35 changes: 35 additions & 0 deletions fern/docs/pages/airdrop/load-attachments.mdx
@@ -0,0 +1,35 @@
In the load attachments phase, the snap-in saves each attachment to the external system.

## Triggering event

Airdrop initiates the load attachments phase by starting the snap-in with a message containing an event of type `START_LOADING_ATTACHMENTS`.

## Implementation

This phase is defined in [load-attachments.ts](https://github.com/devrev/airdrop-template/blob/main/code/src/functions/loading/workers/load-attachments.ts).

Loading attachments requires a `create` function that adds each attachment to the external system. The `create` function makes the API calls to the external system, handles errors, and respects the external system's rate limits. On success, it should return the `id` and, optionally, the `modifiedDate` of the record in the external system. If the attachment cannot be created, it should instead indicate a rate-limiting back-off or report the error.

```typescript
processTask<LoaderState>({
  task: async ({ adapter }) => {
    const { reports, processed_files } = await adapter.loadAttachments({
      create: createAttachment,
    });

    await adapter.emit(LoaderEventType.AttachmentLoadingDone, {
      reports,
      processed_files,
    });
  },
  onTimeout: async ({ adapter }) => {
    await adapter.emit(LoaderEventType.AttachmentLoadingProgress, {
      reports: adapter.reports,
      processed_files: adapter.processedFiles,
    });
  },
});
```
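The three outcomes described for `create` (success, rate-limiting back-off, error) can be sketched as a small helper that interprets the external system's HTTP response. This is a minimal sketch, not the SDK's API: `HttpResult`, `LoadResult`, `toLoadResult`, the `delay` and `error` field names, and the 30-second default are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration only; the SDK's real types may differ.
interface HttpResult {
  status: number;
  headers: Record<string, string>;
  body?: { id?: string; updated_at?: string; message?: string };
}

interface LoadResult {
  id?: string; // id of the created record in the external system
  modifiedDate?: string; // optional modification timestamp
  delay?: number; // seconds to wait before retrying (rate limiting)
  error?: string; // why the attachment could not be created
}

// Map an HTTP response from the external system to one of the three outcomes.
function toLoadResult(res: HttpResult): LoadResult {
  if (res.status === 429) {
    // Only the "seconds" form of Retry-After is handled here;
    // anything else falls back to an assumed default of 30 seconds.
    const retryAfter = Number(res.headers['retry-after']);
    return {
      delay: Number.isFinite(retryAfter) && retryAfter > 0 ? retryAfter : 30,
    };
  }
  if (res.status >= 200 && res.status < 300 && res.body?.id) {
    return { id: res.body.id, modifiedDate: res.body.updated_at };
  }
  return { error: res.body?.message ?? `Unexpected status ${res.status}` };
}
```

A `createAttachment` implementation could call the external API and pass the response through a helper like this, so that rate limiting and failures are reported in one place.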



42 changes: 42 additions & 0 deletions fern/docs/pages/airdrop/load-data.mdx
@@ -0,0 +1,42 @@
The load data phase manages the creation and updating of items in the external system.

## Triggering event

Airdrop initiates data loading by starting the snap-in with a message containing an event of type `START_LOADING_DATA`.

## Implementation

This phase is defined in [load-data.ts](https://github.com/devrev/airdrop-template/blob/main/code/src/functions/loading/workers/load-data.ts).

Loading is performed by providing a list of item types to load (`itemTypesToLoad`), ordered in the sequence they should be loaded.

Each item type must provide `create` and `update` functions, which denormalize records to the schema of the external system and make the HTTP calls to the external system. Both functions must manage rate limiting for the external system and handle errors. The `create` and `update` functions should return the `id` of the record in the external system and, optionally, its `modifiedDate`. If a record cannot be created or updated, they should instead indicate the rate-limiting offset or the error.

As with extraction, the SDK library exports the `processTask` function to structure the work within each phase and the `onTimeout` function to handle timeouts.

```typescript
processTask<LoaderState>({
  task: async ({ adapter }) => {
    const { reports, processed_files } = await adapter.loadItemTypes({
      itemTypesToLoad: [
        {
          itemType: 'todos',
          create: createTodo,
          update: updateTodo,
        },
      ],
    });

    await adapter.emit(LoaderEventType.DataLoadingDone, {
      reports,
      processed_files,
    });
  },
  onTimeout: async ({ adapter }) => {
    await adapter.emit(LoaderEventType.DataLoadingProgress, {
      reports: adapter.reports,
      processed_files: adapter.processedFiles,
    });
  },
});
```
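The denormalization step that `create` and `update` perform can be illustrated with a sketch along these lines. Everything here is assumed for the example: the `DevRevTodo` and `ExternalTodoPayload` shapes, the field names, the `/todos` endpoint, and the helper names are hypothetical and are not taken from the SDK or the template.

```typescript
// Hypothetical DevRev-side record; field names are assumptions for illustration.
interface DevRevTodo {
  id: { devrev: string; external?: string };
  data: { title: string; body?: string; stage?: string };
}

// Hypothetical payload accepted by the external system's API.
interface ExternalTodoPayload {
  name: string;
  description: string;
  done: boolean;
}

interface TodoRequest {
  method: 'POST' | 'PATCH';
  url: string;
  body: ExternalTodoPayload;
}

// Convert a DevRev record into the external system's schema.
function denormalizeTodo(item: DevRevTodo): ExternalTodoPayload {
  return {
    name: item.data.title.trim(),
    description: item.data.body ?? '',
    done: item.data.stage === 'completed',
  };
}

// Request a `create` function could send: a new record has no external id yet.
function buildCreateRequest(item: DevRevTodo, baseUrl: string): TodoRequest {
  return { method: 'POST', url: `${baseUrl}/todos`, body: denormalizeTodo(item) };
}

// Request an `update` function could send: the external id must already be known.
function buildUpdateRequest(item: DevRevTodo, baseUrl: string): TodoRequest {
  if (!item.id.external) {
    throw new Error(`Item ${item.id.devrev} has no external id to update`);
  }
  return {
    method: 'PATCH',
    url: `${baseUrl}/todos/${encodeURIComponent(item.id.external)}`,
    body: denormalizeTodo(item),
  };
}
```

Keeping the schema conversion in one pure function means `createTodo` and `updateTodo` differ only in the HTTP method and URL they use.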
53 changes: 51 additions & 2 deletions fern/docs/pages/airdrop/loading-phases.mdx
@@ -1,3 +1,52 @@
# Loading
Loading is the process of exporting data from DevRev back to the external system.
This process includes creating new items in the external system and updating them with any changes made in DevRev.

To be added.
The snap-in manages two phases for loading:
* Load data
* Load attachments

Each phase is defined in a separate file and is responsible for loading the corresponding data.

```typescript
import { AirdropEvent, EventType, spawn } from '@devrev/ts-adaas';

export interface LoaderState {}

export const initialLoaderState: LoaderState = {};

function getWorkerPerLoadingPhase(event: AirdropEvent) {
  let path;
  switch (event.payload.event_type) {
    case EventType.StartLoadingData:
    case EventType.ContinueLoadingData:
      path = __dirname + '/workers/load-data';
      break;
    case EventType.StartLoadingAttachments:
    case EventType.ContinueLoadingAttachments:
      path = __dirname + '/workers/load-attachments';
      break;
  }
  return path;
}

const run = async (events: AirdropEvent[]) => {
  for (const event of events) {
    const file = getWorkerPerLoadingPhase(event);
    await spawn<LoaderState>({
      event,
      initialState: initialLoaderState,
      workerPath: file,
      // options: {},
    });
  }
};

export default run;
```
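In the dispatcher above, `getWorkerPerLoadingPhase` returns `undefined` for any event type that matches no case. One possible variation, shown here as a self-contained sketch rather than the template's code, fails fast instead. The `START_*` strings come from the triggering events documented for each phase; the `CONTINUE_*` strings and the `workerPathFor` name are assumptions.

```typescript
// Resolve the worker file for a loading event type, or throw if none matches.
// CONTINUE_* values are assumed to mirror the documented START_* values.
function workerPathFor(eventType: string, baseDir: string): string {
  switch (eventType) {
    case 'START_LOADING_DATA':
    case 'CONTINUE_LOADING_DATA':
      return `${baseDir}/workers/load-data`;
    case 'START_LOADING_ATTACHMENTS':
    case 'CONTINUE_LOADING_ATTACHMENTS':
      return `${baseDir}/workers/load-attachments`;
    default:
      // Surface unexpected events early instead of spawning with no worker path.
      throw new Error(`No loading worker for event type: ${eventType}`);
  }
}
```

Whether to throw or to skip unknown events is a design choice; throwing makes a misrouted event visible at the dispatcher rather than inside `spawn`.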

## State handling

Loading phases run as separate runtime instances, similar to extraction phases, with a maximum execution time of 12 minutes.
These phases share a `state`, defined in the `LoaderState` interface.
The loader state is separate from the extractor state.
Access to the `state` is available through the SDK's `adapter` object.
8 changes: 6 additions & 2 deletions fern/versions/public.yml
@@ -166,7 +166,6 @@ navigation:
slug: metadata-extraction
path: ../docs/pages/airdrop/metadata-extraction.mdx
- page: "Initial domain mapping"
hidden: false
slug: initial-domain-mapping
path: ../docs/pages/airdrop/initial-domain-mapping.mdx
- page: "Data extraction"
@@ -177,8 +176,13 @@ navigation:
path: ../docs/pages/airdrop/attachments-extraction.mdx
- page: "Loading phases"
slug: loading-phases
hidden: true
path: ../docs/pages/airdrop/loading-phases.mdx
- page: "Load data"
slug: load-data
path: ../docs/pages/airdrop/load-data.mdx
- page: "Load attachments"
slug: load-attachments
path: ../docs/pages/airdrop/load-attachments.mdx
- page: "Data and attachments deletion"
slug: data-attachments-deletion
hidden: true