@@ -201,8 +201,6 @@ The Run Engine emits events using its `eventBus`. This is used for runs completi
201
201
202
202
# RunEngine System Architecture
203
203
204
- The RunEngine is composed of several specialized systems that handle different aspects of task execution and management. Below is a diagram showing the relationships between these systems.
205
-
206
204
``` mermaid
207
205
graph TD
208
206
RE[RunEngine]
@@ -216,6 +214,7 @@ graph TD
216
214
DRS[DelayedRunSystem]
217
215
TS[TtlSystem]
218
216
WFS[WaitingForWorkerSystem]
217
+ RCS[ReleaseConcurrencySystem]
219
218
220
219
%% Core Dependencies
221
220
RE --> DS
@@ -228,6 +227,7 @@ graph TD
228
227
RE --> DRS
229
228
RE --> TS
230
229
RE --> WFS
230
+ RE --> RCS
231
231
232
232
%% System Dependencies
233
233
DS --> ESS
@@ -239,11 +239,13 @@ graph TD
239
239
240
240
WS --> ESS
241
241
WS --> ES
242
+ WS --> RCS
242
243
243
244
ES --> ESS
244
245
245
246
CS --> ESS
246
247
CS --> ES
248
+ CS --> RCS
247
249
248
250
DRS --> ES
249
251
@@ -260,54 +262,34 @@ graph TD
260
262
RL[RunLocker]
261
263
EB[EventBus]
262
264
WRK[Worker]
263
- RCQ[ReleaseConcurrencyQueue]
264
265
end
265
266
266
267
%% Resource Dependencies
267
268
RE -.-> Resources
268
- DS & RAS & ESS & WS & BS & ES & CS & DRS & TS & WFS -.-> Resources
269
+ DS & RAS & ESS & WS & BS & ES & CS & DRS & TS & WFS & RCS -.-> Resources
269
270
```
270
271
271
272
## System Responsibilities
272
273
273
- ### DequeueSystem
274
-
275
- - Handles dequeuing of tasks from master queues
276
- - Manages resource allocation and constraints
277
- - Handles task deployment verification
278
-
279
- ### RunAttemptSystem
280
-
281
- - Manages run attempt lifecycle
282
- - Handles success/failure scenarios
283
- - Manages retries and cancellations
284
- - Coordinates with other systems for run completion
285
-
286
- ### ExecutionSnapshotSystem
287
-
288
- - Creates and manages execution snapshots
289
- - Tracks run state and progress
290
- - Manages heartbeats for active runs
291
- - Maintains execution history
292
-
293
- ### WaitpointSystem
274
+ ### Core Systems
294
275
295
- - Manages waitpoints for task synchronization
296
- - Handles waitpoint completion
297
- - Coordinates blocked runs
298
- - Manages concurrency release
276
+ - ** ExecutionSnapshotSystem (ESS) ** : Manages execution state tracking and history
277
+ - ** EnqueueSystem (ES) ** : Handles run scheduling and queueing
278
+ - ** RunAttemptSystem (RAS) ** : Controls run lifecycle and execution attempts
279
+ - ** ReleaseConcurrencySystem (RCS) ** : Manages concurrency control and token management
299
280
300
- ### BatchSystem
281
+ ### Queue Management
301
282
302
- - Manages batch operations
303
- - Handles batch completion
304
- - Coordinates batch-related task runs
283
+ - ** DequeueSystem (DS) ** : Handles task dequeuing and resource allocation
284
+ - ** DelayedRunSystem (DRS) ** : Manages delayed run scheduling
285
+ - ** WaitingForWorkerSystem (WFS) ** : Coordinates runs waiting for worker availability
305
286
306
- ### EnqueueSystem
287
+ ### State Management
307
288
308
- - Handles enqueueing of runs
309
- - Manages run scheduling
310
- - Coordinates with execution snapshots
289
+ - ** CheckpointSystem (CS)** : Manages execution checkpoints and resumption
290
+ - ** WaitpointSystem (WS)** : Coordinates task synchronization points
291
+ - ** BatchSystem (BS)** : Handles batch operations and coordination
292
+ - ** TtlSystem (TS)** : Manages run time-to-live and expiration
311
293
312
294
## Shared Resources
313
295
@@ -318,13 +300,27 @@ graph TD
318
300
- ** RunLocker** : Run locking mechanism
319
301
- ** EventBus** : Event communication
320
302
- ** Worker** : Background task execution
321
- - ** ReleaseConcurrencyQueue** : Manages concurrency token release
322
303
323
304
## Key Interactions
324
305
325
- 1 . ** RunEngine** orchestrates all systems and manages shared resources
326
- 2 . ** DequeueSystem** works with ** RunAttemptSystem** for task execution
327
- 3 . ** RunAttemptSystem** coordinates with ** WaitpointSystem** and ** BatchSystem**
328
- 4 . ** WaitpointSystem** uses ** EnqueueSystem** for run scheduling
329
- 5 . ** ExecutionSnapshotSystem** is used by all other systems to track state
330
- 6 . All systems share common resources through the ` SystemResources ` interface
306
+ 1 . ** Core Execution Flow**
307
+
308
+ - ExecutionSnapshotSystem provides state tracking for all systems
309
+ - EnqueueSystem coordinates with multiple systems for scheduling
310
+ - RunAttemptSystem manages execution lifecycle
311
+ - ReleaseConcurrencySystem controls concurrent execution limits
312
+
313
+ 2 . ** Queue Management**
314
+
315
+ - DequeueSystem coordinates with RunAttemptSystem for execution
316
+ - DelayedRunSystem and WaitingForWorkerSystem handle specialized queuing
317
+
318
+ 3 . ** State Coordination**
319
+
320
+ - CheckpointSystem works with ReleaseConcurrencySystem and EnqueueSystem
321
+ - WaitpointSystem coordinates with ReleaseConcurrencySystem for execution control
322
+ - TtlSystem manages expiration through WaitpointSystem
323
+
324
+ 4 . ** Resource Management**
325
+ - All systems share common resources through SystemResources
326
+ - ReleaseConcurrencySystem provides centralized concurrency control
0 commit comments