Skip to content

Improve unit test workflow performance #2096

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
May 23, 2025
Merged

Improve unit test workflow performance #2096

merged 20 commits into from
May 23, 2025

Conversation

nicktrn
Copy link
Collaborator

@nicktrn nicktrn commented May 23, 2025

This decreases unit test duration from >21m to ~6m30s.

vitest and @vitest/coverage-v8 have been updated and locked to the latest version. They are now only present on the root package.json just like typescript, tsx, etc.

Copy link

changeset-bot bot commented May 23, 2025

⚠️ No Changeset found

Latest commit: d6dd1a7

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link
Contributor

coderabbitai bot commented May 23, 2025

Walkthrough

The changes introduce a modular and sharded approach to running unit tests in the repository's continuous integration setup. Three new GitHub Actions workflows are added: one each for webapp, packages, and internal unit tests, each supporting sharded execution and report merging. The main unit test workflow is refactored to delegate to these reusable workflows. In addition, the root package.json upgrades and centralizes the vitest testing framework and its coverage plugin, while all individual package-level vitest and related dependencies are removed. The webapp's replication service integration tests are split into two files, redistributing and expanding test coverage. Additionally, a performance test in the webapp suite is adjusted to allow a 15% tolerance margin in timing assertions.

Note

⚡️ AI Code Reviews for VS Code, Cursor, Windsurf

CodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback.
Learn more here.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d7c0f34 and d6dd1a7.

📒 Files selected for processing (1)
  • apps/webapp/test/fairDequeuingStrategy.test.ts (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (25)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (2)
apps/webapp/test/fairDequeuingStrategy.test.ts (2)

259-261: Well-implemented tolerance mechanism for timing assertions.

The introduction of a 15% tolerance margin is a good practice for performance tests running in CI environments, especially with the sharded execution mentioned in the PR. The implementation is clean and maintainable.


267-267: Correctly applied tolerance to timing assertions.

The tolerance is appropriately applied to both performance assertions:

  • Line 267: Relaxes the upper bound for the "9x faster" requirement
  • Line 281: Relaxes the lower bound for the "4x slower" requirement

This maintains the test's validation of snapshot reuse behavior while making it more resilient to timing variations in CI environments.

Also applies to: 281-281

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
package.json (1)

57-57: Centralize and pin Vitest dependencies at the root.
Adding @vitest/coverage-v8 and updating vitest to 3.1.4 in the root package.json appropriately centralizes testing tooling. Verify compatibility with existing workflows and consider using caret ranges (e.g., ^3.1.4) if you’d like to automatically pick up future patch releases.

Also applies to: 68-68

.github/workflows/unit-tests-packages.yml (1)

14-16: Consider future sharding expansion for packages.

Currently using a single shard for package tests. Consider expanding to multiple shards if package test execution time becomes a bottleneck in the future.

.github/workflows/unit-tests-webapp.yml (2)

84-88: Consider using GitHub secrets for test database credentials.

While these appear to be test credentials for local PostgreSQL, hardcoding them in the workflow file could set a precedent for less secure practices. Consider moving these to GitHub secrets or documenting that these are intentionally non-sensitive test values.

-        env:
-          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/postgres
-          DIRECT_URL: postgresql://postgres:postgres@localhost:5432/postgres
-          SESSION_SECRET: "secret"
-          MAGIC_LINK_SECRET: "secret"
-          ENCRYPTION_KEY: "secret"
+        env:
+          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL || 'postgresql://postgres:postgres@localhost:5432/postgres' }}
+          DIRECT_URL: ${{ secrets.TEST_DIRECT_URL || 'postgresql://postgres:postgres@localhost:5432/postgres' }}
+          SESSION_SECRET: ${{ secrets.TEST_SESSION_SECRET || 'secret' }}
+          MAGIC_LINK_SECRET: ${{ secrets.TEST_MAGIC_LINK_SECRET || 'secret' }}
+          ENCRYPTION_KEY: ${{ secrets.TEST_ENCRYPTION_KEY || 'secret' }}
🧰 Tools
🪛 Checkov (3.2.334)

[MEDIUM] 84-85: Basic Auth Credentials

(CKV_SECRET_4)


14-16: Consider making the shard configuration dynamic.

The shard count is hardcoded to 10. For better flexibility and reusability, consider accepting these as workflow inputs.

Add workflow inputs:

on:
  workflow_call:
    inputs:
      shard-total:
        description: 'Total number of shards'
        type: number
        default: 10

Then update the matrix:

-        shardIndex: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
-        shardTotal: [10]
+        shardIndex: ${{ fromJson(format('[{0}]', join(range(1, inputs.shard-total + 1), ','))) }}
+        shardTotal: [${{ inputs.shard-total }}]
apps/webapp/test/runsReplicationService.part2.test.ts (1)

225-226: Consider dynamic wait strategies instead of fixed timeouts.

The fixed 5-second wait might be insufficient under heavy CI load or excessive in optimal conditions. Consider implementing a polling mechanism that checks for replication completion.

// Helper function to wait for replication
async function waitForReplication(clickhouse: ClickHouse, expectedCount: number, maxWaitMs = 10000) {
  const pollInterval = 500;
  const startTime = Date.now();
  
  while (Date.now() - startTime < maxWaitMs) {
    const [err, result] = await clickhouse.reader.query({
      query: "SELECT count(*) as count FROM trigger_dev.task_runs_v2 FINAL",
      schema: z.object({ count: z.number() })
    })({});
    
    if (!err && result?.[0]?.count >= expectedCount) {
      return;
    }
    
    await setTimeout(pollInterval);
  }
  
  throw new Error(`Replication did not complete within ${maxWaitMs}ms`);
}

Also applies to: 344-345

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cf37359 and d7c0f34.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (19)
  • .github/workflows/unit-tests-internal.yml (1 hunks)
  • .github/workflows/unit-tests-packages.yml (1 hunks)
  • .github/workflows/unit-tests-webapp.yml (1 hunks)
  • .github/workflows/unit-tests.yml (1 hunks)
  • apps/supervisor/package.json (1 hunks)
  • apps/webapp/package.json (1 hunks)
  • apps/webapp/test/runsReplicationService.part1.test.ts (1 hunks)
  • apps/webapp/test/runsReplicationService.part2.test.ts (1 hunks)
  • internal-packages/clickhouse/package.json (1 hunks)
  • internal-packages/redis/package.json (0 hunks)
  • internal-packages/replication/package.json (0 hunks)
  • internal-packages/run-engine/package.json (1 hunks)
  • internal-packages/testcontainers/package.json (1 hunks)
  • internal-packages/tracing/package.json (0 hunks)
  • internal-packages/zod-worker/package.json (1 hunks)
  • package.json (2 hunks)
  • packages/cli-v3/package.json (1 hunks)
  • packages/core/package.json (1 hunks)
  • packages/redis-worker/package.json (1 hunks)
💤 Files with no reviewable changes (3)
  • internal-packages/redis/package.json
  • internal-packages/tracing/package.json
  • internal-packages/replication/package.json
🧰 Additional context used
🪛 Checkov (3.2.334)
.github/workflows/unit-tests-webapp.yml

[MEDIUM] 84-85: Basic Auth Credentials

(CKV_SECRET_4)

⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: typecheck / typecheck
🔇 Additional comments (21)
internal-packages/testcontainers/package.json (1)

19-19: Verify necessity of new tinyexec dependency.
The tinyexec package was added to devDependencies. Please confirm that it’s required for testcontainer operations and doesn’t overlap with existing utilities in this package.

packages/redis-worker/package.json (1)

40-40: Confirm the addition of tsx in devDependencies.
The tsx dependency was introduced but no scripts reference it in this package. Ensure it’s intentional—remove it if it’s not being used.

apps/supervisor/package.json (1)

25-25: Ensure test scripts resolve vitest from the root workspace.
vitest was removed from local devDependencies, yet scripts still call vitest. Confirm that the root-level vitest is hoisted and accessible for package-local test runs.

internal-packages/clickhouse/package.json (1)

17-17: Verify centralized testing strategy after local removal of Vitest.
With only rimraf left in devDependencies and test scripts still invoking vitest, ensure that the intended centralized Vitest installation is picked up and that tests execute correctly in CI and locally.

packages/core/package.json (1)

215-215: Minor reordering of devDependencies entry
The tsx dependency remains unchanged but has shifted position due to the removal of local Vitest. No functional impact.

apps/webapp/package.json (1)

257-257: Verify updated vite-tsconfig-paths version
The version of vite-tsconfig-paths has been updated to ^4.0.5. Please confirm that this upgrade is fully compatible with the project's Vite and Remix setup and does not introduce any build issues.

packages/cli-v3/package.json (1)

68-68: Ensure tsx support for update-version script
The tsx dependency has been added to support the update-version script invocation. Verify that this aligns with the root-level devDependencies and that the script runs correctly in all development environments.

internal-packages/zod-worker/package.json (1)

17-17: Confirm necessity of @types/pg
A @types/pg entry was introduced in devDependencies. Please verify that this package is actually used by zod-worker or its tests. If it's not required, consider removing it to keep dependencies lean.

internal-packages/run-engine/package.json (1)

38-38: Verify upgrade to major rimraf version
The rimraf devDependency has been bumped to 6.0.1. Since this is a major version upgrade, please confirm that the clean script (rimraf dist) and any programmatic use of the API remain compatible.

apps/webapp/test/runsReplicationService.part1.test.ts (2)

13-13: Excellent test organization for performance optimization.

The split of integration tests into part 1/2 is a smart approach that aligns perfectly with the PR's goal of improving unit test workflow performance. This allows for better parallelization and sharding in the CI pipeline.


11-11: Good timeout configuration for integration tests.

The 60-second timeout is appropriate for container-based integration tests that involve database replication and multiple service interactions.

.github/workflows/unit-tests.yml (1)

10-18: Excellent modular architecture for improved performance.

The refactoring from a monolithic job to three separate reusable workflows is a significant improvement that enables:

  • Parallel execution: Each test category (webapp, packages, internal) can run independently
  • Targeted sharding: Each workflow can implement its own optimal sharding strategy
  • Resource optimization: Different test types can use appropriate runner configurations
  • Maintainability: Separation of concerns makes the CI configuration easier to manage

This modular approach directly supports the PR's goal of reducing test time from 21+ minutes to ~6.5 minutes.

.github/workflows/unit-tests-packages.yml (3)

22-46: Excellent Docker configuration for CI stability.

The IPv6 disabling and Docker address pool configuration are excellent practices that prevent common networking issues in CI environments. The daemon restart ensures configuration takes effect.


65-73: Smart conditional DockerHub login implementation.

The conditional login based on secret availability is a good practice that:

  • Avoids rate limits when credentials are available
  • Gracefully handles environments without secrets
  • Provides clear logging for debugging

84-98: Robust artifact collection and retention strategy.

The blob report collection logic properly:

  • Handles test cancellation scenarios with if: ${{ !cancelled() }}
  • Flattens nested report structures for easier merging
  • Uses appropriate 1-day retention for temporary CI artifacts
  • Includes hidden files which may contain important test metadata
.github/workflows/unit-tests-internal.yml (3)

14-16: Appropriate sharding strategy for internal tests.

Using 8 shards for internal tests shows good understanding of the test execution time distribution. This higher shard count compared to packages (1 shard) suggests internal tests require more parallelization, which is well-reasoned.


84-90: Consistent artifact collection pattern.

The blob report collection logic is identical to the packages workflow, demonstrating good consistency and reusability of patterns across the CI configuration.


100-130: Efficient report merging strategy.

The merge-reports job efficiently:

  • Runs only when needed (if: ${{ !cancelled() }})
  • Skips dependency caching since it doesn't install packages
  • Uses pattern matching to collect all shard reports
  • Leverages Vitest's built-in report merging capabilities
.github/workflows/unit-tests-webapp.yml (2)

22-46: LGTM! Docker configuration addresses CI environment constraints.

The IPv6 disabling and custom address pool configuration effectively prevent network conflicts in the CI environment, which is essential for running containerized integration tests reliably.


90-105: Excellent test report collection and artifact management.

The implementation properly:

  • Collects Vitest blob reports from all subdirectories
  • Preserves package context in filenames for better organization
  • Uses appropriate retention period for CI artifacts
  • Handles cancellation scenarios gracefully
apps/webapp/test/runsReplicationService.part2.test.ts (1)

432-507: Well-designed transaction replication test.

This test effectively validates:

  • Multiple operations within a single transaction
  • Proper replication of both inserts and updates
  • Correct use of parameterized queries for ClickHouse
  • Appropriate assertions for data integrity

@nicktrn nicktrn merged commit 7c791dd into main May 23, 2025
33 checks passed
@nicktrn nicktrn deleted the feat/sharded-unit-tests branch May 23, 2025 11:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants