|
2 | 2 |
|
3 | 3 | ## Release Notes
|
4 | 4 |
|
| 5 | +### v1.2.5 |
| 6 | + |
| 7 | +- Add batch size option. |
| 8 | +- Replace DevRev TypeScript SDK requests with Axios for uploading and downloading artifacts. |
| 9 | +- Remove unnecessary postState from default workers. |
| 10 | +- Fix bugs related to attachment streaming. |
| 11 | + |
5 | 12 | ### v1.2.4
|
6 | 13 |
|
7 | 14 | - Do not fail the extraction of attachments if streaming of a single attachment fails.
|
@@ -126,286 +133,3 @@ It provides features such as:
|
126 | 133 | ```bash
|
127 | 134 | npm install @devrev/ts-adaas
|
128 | 135 | ```
|
129 |
| - |
130 |
| -# Usage |
131 |
| - |
132 |
| -ADaaS Snap-ins can import data in both directions: from external sources to DevRev and from DevRev to external sources. Both directions are composed of several phases. |
133 |
| - |
134 |
| -From external source to DevRev: |
135 |
| - |
136 |
| -- External Sync Units Extraction |
137 |
| -- Metadata Extraction |
138 |
| -- Data Extraction |
139 |
| -- Attachments Extraction |
140 |
| - |
141 |
| -From DevRev to external source: |
142 |
| - |
143 |
| -- Data Loading |
144 |
| - |
145 |
| -Each phase has its own requirements for task processing, timeout handling, and error handling. |
146 |
| - |
147 |
| -The ADaaS library exports `processTask` to structure the work within each phase; its `onTimeout` function handles timeouts. |
148 |
| - |
149 |
| -### ADaaS Snap-in Invocation |
150 |
| - |
151 |
| -Each ADaaS snap-in must handle all the phases of ADaaS extraction. In a Snap-in, you typically define a `run` function that iterates over events and invokes workers per extraction phase. |
152 |
| - |
153 |
| -```typescript |
154 |
| -import { AirdropEvent, EventType, spawn } from '@devrev/ts-adaas'; |
155 |
| - |
156 |
| -interface DummyExtractorState { |
157 |
| - issues: { completed: boolean }; |
158 |
| - users: { completed: boolean }; |
159 |
| - attachments: { completed: boolean }; |
160 |
| -} |
161 |
| - |
162 |
| -const initialState: DummyExtractorState = { |
163 |
| - issues: { completed: false }, |
164 |
| - users: { completed: false }, |
165 |
| - attachments: { completed: false }, |
166 |
| -}; |
167 |
| - |
168 |
| -function getWorkerPerExtractionPhase(event: AirdropEvent) { |
169 |
| - let path; |
170 |
| - switch (event.payload.event_type) { |
171 |
| - case EventType.ExtractionExternalSyncUnitsStart: |
172 |
| - path = __dirname + '/workers/external-sync-units-extraction'; |
173 |
| - break; |
174 |
| - case EventType.ExtractionMetadataStart: |
175 |
| - path = __dirname + '/workers/metadata-extraction'; |
176 |
| - break; |
177 |
| - case EventType.ExtractionDataStart: |
178 |
| - case EventType.ExtractionDataContinue: |
179 |
| - path = __dirname + '/workers/data-extraction'; |
180 |
| - break; |
181 |
| - } |
182 |
| - return path; |
183 |
| -} |
184 |
| - |
185 |
| -const run = async (events: AirdropEvent[]) => { |
186 |
| - for (const event of events) { |
187 |
| - const file = getWorkerPerExtractionPhase(event); |
188 |
| - await spawn<DummyExtractorState>({ |
189 |
| - event, |
190 |
| - initialState, |
191 |
| - workerPath: file, |
192 |
| - options: { |
193 |
| - isLocalDevelopment: true, |
194 |
| - }, |
195 |
| - }); |
196 |
| - } |
197 |
| -}; |
198 |
| - |
199 |
| -export default run; |
200 |
| -``` |
201 |
| - |
202 |
| -## Extraction |
203 |
| - |
204 |
| -The ADaaS snap-in extraction lifecycle consists of three main phases: External Sync Units Extraction, Metadata Extraction, and Data Extraction. Each phase is defined in a separate file and is responsible for fetching the respective data. |
205 |
| - |
206 |
| -The ADaaS library provides a repository management system to handle artifacts in batches. The `initializeRepos` function initializes the repositories, and the `push` function uploads the artifacts to the repositories. The `postState` function is used to post the state of the extraction task. |
207 |
| - |
208 |
| -State management lets an ADaaS Snap-in track the progress of the extraction task. The state is stored in the adapter, can be read through the `adapter.state` property, and is posted with the `postState` function. |
209 |
| - |
210 |
| -### 1. External Sync Units Extraction |
211 |
| - |
212 |
| -This phase is defined in `external-sync-units-extraction.ts` and is responsible for fetching the external sync units. |
213 |
| - |
214 |
| -```typescript |
215 |
| -import { |
216 |
| - ExternalSyncUnit, |
217 |
| - ExtractorEventType, |
218 |
| - processTask, |
219 |
| -} from '@devrev/ts-adaas'; |
220 |
| - |
221 |
| -const externalSyncUnits: ExternalSyncUnit[] = [ |
222 |
| - { |
223 |
| - id: 'devrev', |
224 |
| - name: 'devrev', |
225 |
| - description: 'Demo external sync unit', |
226 |
| - item_count: 2, |
227 |
| - item_type: 'issues', |
228 |
| - }, |
229 |
| -]; |
230 |
| - |
231 |
| -processTask({ |
232 |
| - task: async ({ adapter }) => { |
233 |
| - await adapter.emit(ExtractorEventType.ExtractionExternalSyncUnitsDone, { |
234 |
| - external_sync_units: externalSyncUnits, |
235 |
| - }); |
236 |
| - }, |
237 |
| - onTimeout: async ({ adapter }) => { |
238 |
| - await adapter.emit(ExtractorEventType.ExtractionExternalSyncUnitsError, { |
239 |
| - error: { |
240 |
| - message: 'Failed to extract external sync units. Lambda timeout.', |
241 |
| - }, |
242 |
| - }); |
243 |
| - }, |
244 |
| -}); |
245 |
| -``` |
246 |
| - |
247 |
| -### 2. Metadata Extraction |
248 |
| - |
249 |
| -This phase is defined in `metadata-extraction.ts` and is responsible for fetching the metadata. |
250 |
| - |
251 |
| -```typescript |
252 |
| -import { ExtractorEventType, processTask } from '@devrev/ts-adaas'; |
253 |
| -import externalDomainMetadata from '../dummy-extractor/external_domain_metadata.json'; |
254 |
| - |
255 |
| -const repos = [{ itemType: 'external_domain_metadata' }]; |
256 |
| - |
257 |
| -processTask({ |
258 |
| - task: async ({ adapter }) => { |
259 |
| - adapter.initializeRepos(repos); |
260 |
| - await adapter |
261 |
| - .getRepo('external_domain_metadata') |
262 |
| - ?.push([externalDomainMetadata]); |
263 |
| - await adapter.emit(ExtractorEventType.ExtractionMetadataDone); |
264 |
| - }, |
265 |
| - onTimeout: async ({ adapter }) => { |
266 |
| - await adapter.emit(ExtractorEventType.ExtractionMetadataError, { |
267 |
| - error: { message: 'Failed to extract metadata. Lambda timeout.' }, |
268 |
| - }); |
269 |
| - }, |
270 |
| -}); |
271 |
| -``` |
272 |
| - |
273 |
| -### 3. Data Extraction |
274 |
| - |
275 |
| -This phase is defined in `data-extraction.ts` and is responsible for fetching the data. Attachment metadata is also extracted in this phase. |
276 |
| - |
277 |
| -```typescript |
278 |
| -import { EventType, ExtractorEventType, processTask } from '@devrev/ts-adaas'; |
279 |
| -import { normalizeAttachment, normalizeIssue, normalizeUser } from '../dummy-extractor/data-normalization'; |
280 |
| - |
281 |
| -const issues = [ |
282 |
| - { id: 'issue-1', created_date: '1999-12-25T01:00:03+01:00', ... }, |
283 |
| - { id: 'issue-2', created_date: '1999-12-27T15:31:34+01:00', ... }, |
284 |
| -]; |
285 |
| - |
286 |
| -const users = [ |
287 |
| - { id: 'user-1', created_date: '1999-12-25T01:00:03+01:00', ... }, |
288 |
| - { id: 'user-2', created_date: '1999-12-27T15:31:34+01:00', ... }, |
289 |
| -]; |
290 |
| - |
291 |
| -const attachments = [ |
292 |
| - { url: 'https://app.dev.devrev-eng.ai/favicon.ico', id: 'attachment-1', ... }, |
293 |
| - { url: 'https://app.dev.devrev-eng.ai/favicon.ico', id: 'attachment-2', ... }, |
294 |
| -]; |
295 |
| - |
296 |
| -const repos = [ |
297 |
| - { itemType: 'issues', normalize: normalizeIssue }, |
298 |
| - { itemType: 'users', normalize: normalizeUser }, |
299 |
| - { itemType: 'attachments', normalize: normalizeAttachment }, |
300 |
| -]; |
301 |
| - |
302 |
| -processTask({ |
303 |
| - task: async ({ adapter }) => { |
304 |
| - adapter.initializeRepos(repos); |
305 |
| - |
306 |
| - if (adapter.event.payload.event_type === EventType.ExtractionDataStart) { |
307 |
| - await adapter.getRepo('issues')?.push(issues); |
308 |
| - await adapter.emit(ExtractorEventType.ExtractionDataProgress, { progress: 50 }); |
309 |
| - } else { |
310 |
| - await adapter.getRepo('users')?.push(users); |
311 |
| - await adapter.getRepo('attachments')?.push(attachments); |
312 |
| - await adapter.emit(ExtractorEventType.ExtractionDataDone, { progress: 100 }); |
313 |
| - } |
314 |
| - }, |
315 |
| - onTimeout: async ({ adapter }) => { |
316 |
| - await adapter.postState(); |
317 |
| - await adapter.emit(ExtractorEventType.ExtractionDataProgress, { progress: 50 }); |
318 |
| - }, |
319 |
| -}); |
320 |
| -``` |
321 |
| - |
322 |
| -### 4. Attachments Streaming |
323 |
| - |
324 |
| -The ADaaS library handles attachment streaming to improve efficiency and reduce complexity for developers. During the extraction phase, developers only need to provide metadata in a specific format for each attachment, and the library manages the streaming process. |
325 |
| - |
326 |
| -The Snap-in should provide attachment metadata following the `NormalizedAttachment` interface: |
327 |
| - |
328 |
| -```typescript |
329 |
| -export interface NormalizedAttachment { |
330 |
| - url: string; |
331 |
| - id: string; |
332 |
| - file_name: string; |
333 |
| - author_id: string; |
334 |
| - parent_id: string; |
335 |
| -} |
336 |
| -``` |
337 |
| - |
338 |
| -## Loading phases |
339 |
| - |
340 |
| -### 1. Loading Data |
341 |
| - |
342 |
| -This phase is defined in `load-data.ts` and is responsible for loading the data to the external system. |
343 |
| - |
344 |
| -Loading is done by providing an ordered list of itemTypes to load and their respective create and update functions. |
345 |
| - |
346 |
| -```typescript |
347 |
| - processTask({ |
348 |
| - task: async ({ adapter }) => { |
349 |
| - const { reports, processed_files } = await adapter.loadItemTypes({ |
350 |
| - itemTypesToLoad: [ |
351 |
| - { |
352 |
| - itemType: 'tickets', |
353 |
| - create: createTicket, |
354 |
| - update: updateTicket, |
355 |
| - }, |
356 |
| - { |
357 |
| - itemType: 'conversations', |
358 |
| - create: createConversation, |
359 |
| - update: updateConversation, |
360 |
| - }, |
361 |
| - ], |
362 |
| - }); |
363 |
| - |
364 |
| - await adapter.emit(LoaderEventType.DataLoadingDone, { |
365 |
| - reports, |
366 |
| - processed_files, |
367 |
| - }); |
368 |
| - }, |
369 |
| - onTimeout: async ({ adapter }) => { |
370 |
| - await adapter.emit(LoaderEventType.DataLoadingProgress, { |
371 |
| - reports: adapter.reports, |
372 |
| - processed_files: adapter.processedFiles, |
373 |
| -    }); |
374
| -  }, |
| -}); |
375 |
| -``` |
376 |
| -
|
377 |
| -The loading functions `create` and `update` write records to the external system. They denormalize the records to the schema of the external system and make the HTTP calls to it. Both loading functions must handle the external system's rate limiting and errors. |
378 |
| -
|
379 |
| -Each function returns the ID and modified date of the record in the external system or, if the record could not be created or updated, specifies a rate-limiting offset or errors. |
380 |
| -
|
381 |
| -### 2. Loading Attachments |
382 |
| -
|
383 |
| -This phase is defined in `load-attachments.ts` and is responsible for loading the attachments to the external system. |
384 |
| -
|
385 |
| -Loading is done by providing the create function to create attachments in the external system. |
386 |
| -
|
387 |
| -```typescript |
388 |
| -processTask({ |
389 |
| - task: async ({ adapter }) => { |
390 |
| - const { reports, processed_files } = await adapter.loadAttachments({ |
391 |
| - create, |
392 |
| - }); |
393 |
| - |
394 |
| - await adapter.emit(LoaderEventType.AttachmentLoadingDone, { |
395 |
| - reports, |
396 |
| - processed_files, |
397 |
| - }); |
398 |
| - }, |
399 |
| - onTimeout: async ({ adapter }) => { |
400 |
| - await adapter.postState(); |
401 |
| - await adapter.emit(LoaderEventType.AttachmentLoadingProgress, { |
402 |
| - reports: adapter.reports, |
403 |
| - processed_files: adapter.processedFiles, |
404 |
| - }); |
405 |
| - }, |
406 |
| -}); |
407 |
| -``` |
408 |
| -
|
409 |
| -The loading function `create` makes the API calls that create the attachments in the external system, and handles errors and the external system's rate limiting. |
410 |
| -
|
411 |
| -The function returns the ID and modified date of the record in the external system or, if the attachment could not be created, specifies a rate-limiting back-off or logs errors. |
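
The removed README text describes the contract of the loading functions only in prose: return an ID and modified date on success, a rate-limiting delay when throttled, or an error otherwise. The following is a minimal sketch of that three-way outcome. The `LoadResult` and `ExternalResponse` shapes, the `toLoadResult` helper, the field names, and the 60-second default are all hypothetical and are not taken from `@devrev/ts-adaas`; the library's real types may differ.

```typescript
// Hypothetical result shape for illustration only; NOT the library's actual type.
interface LoadResult {
  id?: string; // ID of the record in the external system
  modifiedDate?: string; // modified date reported by the external system
  delaySeconds?: number; // how long to wait before retrying when rate limited
  error?: string; // reason the record could not be created or updated
}

// Hypothetical, simplified view of an HTTP response from the external system.
interface ExternalResponse {
  status: number;
  retryAfterSeconds?: number;
  body?: { id: string; updated_at: string };
}

// Map an external response onto one of the three outcomes described above.
function toLoadResult(res: ExternalResponse): LoadResult {
  // Rate limited: report a delay instead of an ID (assumed default of 60 seconds).
  if (res.status === 429) {
    return { delaySeconds: res.retryAfterSeconds ?? 60 };
  }
  // Success: report the external ID and modified date.
  if (res.status >= 200 && res.status < 300 && res.body) {
    return { id: res.body.id, modifiedDate: res.body.updated_at };
  }
  // Anything else: report an error.
  return { error: `External system responded with status ${res.status}` };
}
```

A `create` or `update` function could perform its HTTP call and pass a summary of the response through a helper like this, so that each call ends in exactly one of the three outcomes.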