Commit e69ffd3

ericallam, nicktrn, and matt-aitken authored
v3: Refactor attempt creation to be worker requested (#1077)
* WIP worker TaskRunAttempt creation
* Handling failing task runs that cannot create an attempt for whatever reason
* Move the visibility queue stuff into a graphile job
* Fixed task runs with unsanitized queue names
* “Borrow” the code from alerts PR to get self hosted deployments working
* Add an admin API endpoint to get info about the shared marqs queue
* Allow admins to view any project metrics
* start adding lazy attempts to prod
* lazy attempt creation for prod workers
* resurrect prod stack traces
* add exception event to failed run spans
* simplify dependency resumes
* fix typecheck
* fix merge
* fresh process for all attempts
* always try sigterm first
* stop heartbeat timeout on non-inplace replace message
* add missing ack on checkpoint creation service failure
* bypass dequeue for retries with running worker
* respect retry delays
* crash runs with invalid run status for execution
* remove debug logs
* fix nack message
* fix version locking
* fresh attempt processes in dev and prod
* improve handling of ipc timeouts
* consider checkpoint failures on cancellation
* add basic chaos monkey to checkpointer
* changeset
* control forced checkpoint simulation via env var
* fix merge
* kill old attempt processes before checkpointing
* detailed perf logging for checkpointing
* add coordinator otlp endpoint example
* improve prod run cancellation
* rename supports lazy attempts migration
* fix graceful exit
* fix retry mechanics
* clear paused state before retry
* remove checkpoint image after push
* crash worker on unrecoverable errors
* refactor unrecoverable error emit
* switch to do hosted busybox image
* increase wait for duration ipc timeout
* add changeset for misc fixes
* fix merge
* fix retry delay span runId
* fix dev retries
* improve prod worker logging
* log checkpoint sizes
* add lazy attempts catalog entries
* Fixed merge issue: use zodFetch, not wrapZodFetch
* Revert "Fixed merge issue: use zodFetch, not wrapZodFetch"
  This reverts commit d137e4e.
* importEnvVars uses wrapZodFetch now
* add backwards compat for retries without checkpoints
* handle more cases of unrecoverable runs
* don't kill the child process if it shouldn't be killed

---------

Co-authored-by: nicktrn <[email protected]>
Co-authored-by: Matt Aitken <[email protected]>
1 parent 782d4f7 commit e69ffd3

51 files changed, 3345 insertions(+), 885 deletions(-)

.changeset/tricky-keys-attack.md

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+---
+"trigger.dev": patch
+"@trigger.dev/core": patch
+---
+
+- Clear paused states before retry
+- Detect and handle unrecoverable worker errors
+- Remove checkpoints after successful push
+- Permanently switch to DO hosted busybox image
+- Fix IPC timeout issue, or at least handle it more gracefully
+- Handle checkpoint failures
+- Basic chaos monkey for checkpoint testing
+- Stack traces are back in the dashboard
+- Display final errors on root span

.changeset/warm-olives-provide.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+"@trigger.dev/core": patch
+---
+
+Improve handling of IPC timeouts and fix checkpoint cancellation after failures
