Commit 510db35

# This is a combination of 2 commits.
# This is the 1st commit message:
Rebase
# This is the commit message #2:
fix: terraform error
1 parent 1a6324d commit 510db35

11 files changed

+552
-126
lines changed

README.md

Lines changed: 4 additions & 4 deletions
@@ -36,11 +36,11 @@ A logical question would be why not Kubernetes? In the current approach we stay

 ## Overview

-The moment a GitHub action workflow requiring a `self-hosted` runner is triggered, GitHub will try to find a runner which can execute the workload. This module reacts to GitHub's [`check_run` event](https://docs.github.com/en/free-pro-team@latest/developers/webhooks-and-events/webhook-events-and-payloads#check_run) for the triggered workflow and creates a new runner if necessary.
+The moment a GitHub action workflow requiring a `self-hosted` runner is triggered, GitHub will try to find a runner which can execute the workload. This module reacts to GitHub's [`workflow_job` event](https://docs.github.com/en/free-pro-team@latest/developers/webhooks-and-events/webhook-events-and-payloads#workflow_job) for the triggered workflow and creates a new runner if necessary.

-For receiving the `check_run` event, a GitHub App needs to be created with a webhook to which the event will be published. Installing the GitHub App in a specific repository or all repositories ensures the `check_run` event will be sent to the webhook.
+For receiving the `workflow_job` event, a Webhook needs to be created. The webhook hook can be defined on enterprise, org, repo, or app level. When using the GitHub app ensure the app is installed in the specific repository or all repositories.

-In AWS a [API gateway](https://docs.aws.amazon.com/apigateway/index.html) endpoint is created that is able to receive the GitHub webhook events via HTTP post. The gateway triggers the webhook lambda which will verify the signature of the event. This check guarantees the event is sent by the GitHub App. The lambda only handles `check_run` events with status `created`. The accepted events are posted on a SQS queue. Messages on this queue will be delayed for a configurable amount of seconds (default 30 seconds) to give the available runners time to pick up this build.
+In AWS a [API gateway](https://docs.aws.amazon.com/apigateway/index.html) endpoint is created that is able to receive the GitHub webhook events via HTTP post. The gateway triggers the webhook lambda which will verify the signature of the event. This check guarantees the event is sent by the GitHub App. The lambda only handles `workflow_job` events with status `queued` and matching the runner labels. The accepted events are posted on a SQS queue. Messages on this queue will be delayed for a configurable amount of seconds (default 30 seconds) to give the available runners time to pick up this build.

 The "scale up runner" lambda is listening to the SQS queue and picks up events. The lambda runs various checks to decide whether a new EC2 spot instance needs to be created. For example, the instance is not created if the build is already started by an existing runner, or the maximum number of runners is reached.

@@ -56,7 +56,7 @@ Secrets and private keys are stored in SSM Parameter Store. These values are enc

 Permission are managed on several places. Below the most important ones. For details check the Terraform sources.

-- The GitHub App requires access to actions and publish `check_run` events to AWS.
+- The GitHub App requires access to actions and publish `workflow_job` events to the AWS webhook (API gateway).
 - The scale up lambda should have access to EC2 for creating and tagging instances.
 - The scale down lambda should have access to EC2 to terminate instances.

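The Overview text in this README diff says the webhook lambda verifies the signature of each incoming event. A minimal sketch of such a check follows, assuming GitHub's `X-Hub-Signature-256` header format (`sha256=` followed by the hex HMAC-SHA256 of the raw request body). `verifySignature` is a hypothetical helper name used for illustration; it is not taken from this module's sources.

```typescript
import { createHmac, timingSafeEqual } from 'crypto';

// Hypothetical helper (not from this repository): returns true when
// `signatureHeader` equals the HMAC-SHA256 of `rawBody` computed with
// the shared webhook secret, in the form "sha256=<hex digest>".
export function verifySignature(
  secret: string,
  rawBody: string,
  signatureHeader: string | undefined,
): boolean {
  if (!signatureHeader) {
    return false;
  }
  const expected = 'sha256=' + createHmac('sha256', secret).update(rawBody, 'utf8').digest('hex');
  const expectedBytes = Buffer.from(expected, 'utf8');
  const receivedBytes = Buffer.from(signatureHeader, 'utf8');
  // timingSafeEqual throws when the lengths differ, so check them first.
  return expectedBytes.length === receivedBytes.length && timingSafeEqual(expectedBytes, receivedBytes);
}
```

The comparison uses `timingSafeEqual` rather than `===` so that the time taken does not depend on how many leading characters match.
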
examples/default/main.tf

Lines changed: 15 additions & 10 deletions
@@ -7,12 +7,17 @@ resource "random_password" "random" {
   length = 28
 }

-module "runners" {
-  source = "../../"

-  aws_region = local.aws_region
-  vpc_id = module.vpc.vpc_id
-  subnet_ids = module.vpc.private_subnets
+################################################################################
+### Hybrid acccount
+################################################################################
+
+module "runners" {
+  source = "../../"
+  create_service_linked_role_spot = true
+  aws_region = local.aws_region
+  vpc_id = module.vpc.vpc_id
+  subnet_ids = module.vpc.private_subnets

   environment = local.environment
   tags = {
@@ -27,11 +32,11 @@ module "runners" {
     webhook_secret = random_password.random.result
   }

-  webhook_lambda_zip = "lambdas-download/webhook.zip"
-  runner_binaries_syncer_lambda_zip = "lambdas-download/runner-binaries-syncer.zip"
-  runners_lambda_zip = "lambdas-download/runners.zip"
-  enable_organization_runners = false
-  runner_extra_labels = "default,example"
+  # webhook_lambda_zip = "lambdas-download/webhook.zip"
+  # runner_binaries_syncer_lambda_zip = "lambdas-download/runner-binaries-syncer.zip"
+  # runners_lambda_zip = "lambdas-download/runners.zip"
+  enable_organization_runners = true
+  runner_extra_labels = "default,example"

   # enable access to the runners via SSM
   enable_ssm_on_runners = true

modules/runners/lambdas/runners/src/scale-runners/scale-up.test.ts

Lines changed: 26 additions & 9 deletions
@@ -10,6 +10,7 @@ const mockOctokit = {
   actions: {
     createRegistrationTokenForOrg: jest.fn(),
     createRegistrationTokenForRepo: jest.fn(),
+    getJobForWorkflowRun: jest.fn(),
   },
   apps: {
     getOrgInstallation: jest.fn(),
@@ -30,15 +31,15 @@ const mockCreateClient = mocked(ghAuth.createOctoClient, true);

 const TEST_DATA: scaleUpModule.ActionRequestMessage = {
   id: 1,
-  eventType: 'check_run',
+  eventType: 'workflow_job',
   repositoryName: 'hello-world',
   repositoryOwner: 'Codertocat',
   installationId: 2,
 };

 const TEST_DATA_WITHOUT_INSTALL_ID: scaleUpModule.ActionRequestMessage = {
   id: 3,
-  eventType: 'check_run',
+  eventType: 'workflow_job',
   repositoryName: 'hello-world',
   repositoryOwner: 'Codertocat',
   installationId: 0,
@@ -69,6 +70,12 @@ beforeEach(() => {
   process.env.ENVIRONMENT = 'unit-test-environment';
   process.env.LAUNCH_TEMPLATE_NAME = 'lt-1,lt-2';

+  mockOctokit.actions.getJobForWorkflowRun.mockImplementation(() => ({
+    data: {
+      status: 'queued',
+    },
+  }));
+
   mockOctokit.checks.get.mockImplementation(() => ({
     data: {
       status: 'queued',
@@ -126,16 +133,16 @@ describe('scaleUp with GHES', () => {

   it('checks queued workflows', async () => {
     await scaleUpModule.scaleUp('aws:sqs', TEST_DATA);
-    expect(mockOctokit.checks.get).toBeCalledWith({
-      check_run_id: TEST_DATA.id,
+    expect(mockOctokit.actions.getJobForWorkflowRun).toBeCalledWith({
+      job_id: TEST_DATA.id,
       owner: TEST_DATA.repositoryOwner,
       repo: TEST_DATA.repositoryName,
     });
   });

   it('does not list runners when no workflows are queued', async () => {
-    mockOctokit.checks.get.mockImplementation(() => ({
-      data: { total_count: 0, runners: [] },
+    mockOctokit.actions.getJobForWorkflowRun.mockImplementation(() => ({
+      data: { total_count: 0 },
     }));
     await scaleUpModule.scaleUp('aws:sqs', TEST_DATA);
     expect(listRunners).not.toBeCalled();
@@ -200,6 +207,11 @@ describe('scaleUp with GHES', () => {
     expect(createRunner).toBeCalledWith(expectedRunnerParams, 'lt-1');
   });

+  it('creates a runner with legacy event check_run', async () => {
+    await scaleUpModule.scaleUp('aws:sqs', { ...TEST_DATA, eventType: 'check_run' });
+    expect(createRunner).toBeCalledWith(expectedRunnerParams, 'lt-1');
+  });
+
   it('creates a runner with labels in a specific group', async () => {
     process.env.RUNNER_EXTRA_LABELS = 'label1,label2';
     process.env.RUNNER_GROUP_NAME = 'TEST_GROUP';
@@ -339,8 +351,8 @@ describe('scaleUp with public GH', () => {

   it('checks queued workflows', async () => {
     await scaleUpModule.scaleUp('aws:sqs', TEST_DATA);
-    expect(mockOctokit.checks.get).toBeCalledWith({
-      check_run_id: TEST_DATA.id,
+    expect(mockOctokit.actions.getJobForWorkflowRun).toBeCalledWith({
+      job_id: TEST_DATA.id,
       owner: TEST_DATA.repositoryOwner,
       repo: TEST_DATA.repositoryName,
     });
@@ -363,7 +375,7 @@ describe('scaleUp with public GH', () => {
   });

   it('does not list runners when no workflows are queued', async () => {
-    mockOctokit.checks.get.mockImplementation(() => ({
+    mockOctokit.actions.getJobForWorkflowRun.mockImplementation(() => ({
       data: { status: 'completed' },
     }));
     await scaleUpModule.scaleUp('aws:sqs', TEST_DATA);
@@ -406,6 +418,11 @@ describe('scaleUp with public GH', () => {
     expect(createRunner).toBeCalledWith(expectedRunnerParams, LAUNCH_TEMPLATE);
   });

+  it('creates a runner with legacy event check_run', async () => {
+    await scaleUpModule.scaleUp('aws:sqs', { ...TEST_DATA, eventType: 'check_run' });
+    expect(createRunner).toBeCalledWith(expectedRunnerParams, LAUNCH_TEMPLATE);
+  });
+
   it('creates a runner with labels in s specific group', async () => {
     process.env.RUNNER_EXTRA_LABELS = 'label1,label2';
     process.env.RUNNER_GROUP_NAME = 'TEST_GROUP';

modules/runners/lambdas/runners/src/scale-runners/scale-up.ts

Lines changed: 40 additions & 20 deletions
@@ -1,10 +1,11 @@
 import { listRunners, createRunner, RunnerInputParameters } from './runners';
 import { createOctoClient, createGithubAuth } from './gh-auth';
 import yn from 'yn';
+import { Octokit } from '@octokit/rest';

 export interface ActionRequestMessage {
   id: number;
-  eventType: string;
+  eventType: 'check_run' | 'workflow_job';
   repositoryName: string;
   repositoryOwner: string;
   installationId: number;
@@ -30,31 +31,27 @@ export const scaleUp = async (eventSource: string, payload: ActionRequestMessage
     const githubClient = await createOctoClient(ghAuth.token, ghesApiUrl);
     installationId = enableOrgLevel
       ? (
-          await githubClient.apps.getOrgInstallation({
-            org: payload.repositoryOwner,
-          })
-        ).data.id
+          await githubClient.apps.getOrgInstallation({
+            org: payload.repositoryOwner,
+          })
+        ).data.id
       : (
-          await githubClient.apps.getRepoInstallation({
-            owner: payload.repositoryOwner,
-            repo: payload.repositoryName,
-          })
-        ).data.id;
+          await githubClient.apps.getRepoInstallation({
+            owner: payload.repositoryOwner,
+            repo: payload.repositoryName,
+          })
+        ).data.id;
   }

   const ghAuth = await createGithubAuth(installationId, 'installation', ghesApiUrl);

   const githubInstallationClient = await createOctoClient(ghAuth.token, ghesApiUrl);
-  const checkRun = await githubInstallationClient.checks.get({
-    check_run_id: payload.id,
-    owner: payload.repositoryOwner,
-    repo: payload.repositoryName,
-  });

   const runnerType = enableOrgLevel ? 'Org' : 'Repo';
   const runnerOwner = enableOrgLevel ? payload.repositoryOwner : `${payload.repositoryOwner}/${payload.repositoryName}`;

-  if (checkRun.data.status === 'queued') {
+  const isQueued = await getJobStatus(githubInstallationClient, payload);
+  if (isQueued) {
     const currentRunners = await listRunners({
       environment,
       runnerType,
@@ -67,9 +64,9 @@ export const scaleUp = async (eventSource: string, payload: ActionRequestMessage
       const registrationToken = enableOrgLevel
         ? await githubInstallationClient.actions.createRegistrationTokenForOrg({ org: payload.repositoryOwner })
         : await githubInstallationClient.actions.createRegistrationTokenForRepo({
-            owner: payload.repositoryOwner,
-            repo: payload.repositoryName,
-          });
+            owner: payload.repositoryOwner,
+            repo: payload.repositoryName,
+          });
       const token = registrationToken.data.token;

       const labelsArgument = runnerExtraLabels !== undefined ? `--labels ${runnerExtraLabels}` : '';
@@ -81,7 +78,7 @@ export const scaleUp = async (eventSource: string, payload: ActionRequestMessage
         runnerServiceConfig: enableOrgLevel
           ? `--url ${configBaseUrl}/${payload.repositoryOwner} --token ${token} ${labelsArgument}${runnerGroupArgument}`
           : `--url ${configBaseUrl}/${payload.repositoryOwner}/${payload.repositoryName} ` +
-            `--token ${token} ${labelsArgument}`,
+            `--token ${token} ${labelsArgument}`,
         runnerOwner,
         runnerType,
       });
@@ -91,6 +88,29 @@ export const scaleUp = async (eventSource: string, payload: ActionRequestMessage
   }
 };

+async function getJobStatus(githubInstallationClient: Octokit, payload: ActionRequestMessage): Promise<boolean> {
+  let isQueued = false;
+  if (payload.eventType === 'workflow_job') {
+    const jobForWorkflowRun = await githubInstallationClient.actions.getJobForWorkflowRun({
+      job_id: payload.id,
+      owner: payload.repositoryOwner,
+      repo: payload.repositoryName,
+    });
+    isQueued = jobForWorkflowRun.data.status === 'queued';
+  } else if (payload.eventType === 'check_run') {
+    const checkRun = await githubInstallationClient.checks.get({
+      check_run_id: payload.id,
+      owner: payload.repositoryOwner,
+      repo: payload.repositoryName,
+    });
+    isQueued = checkRun.data.status === 'queued';
+  } else {
+    throw Error(`Event ${payload.eventType} is not supported`);
+  }
+
+  return isQueued;
+}
+
 export async function createRunnerLoop(runnerParameters: RunnerInputParameters): Promise<void> {
   const launchTemplateNames = process.env.LAUNCH_TEMPLATE_NAME?.split(',') as string[];
   let launched = false;

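The `getJobStatus` helper added in this diff selects the API call from `eventType` and keeps `check_run` as a legacy path. The same dispatch can be exercised without network access against a stub. The sketch below is illustrative only: `StatusClient`, `JobRef` and `isJobQueued` are hypothetical names standing in for the two Octokit calls and the message type used in the diff, not the real `Octokit` or `ActionRequestMessage` types.

```typescript
type EventType = 'check_run' | 'workflow_job';

// Hypothetical subset of the SQS message fields used by the status check.
interface JobRef {
  id: number;
  eventType: EventType;
  repositoryOwner: string;
  repositoryName: string;
}

type StatusResponse = Promise<{ data: { status: string } }>;

// Hypothetical stand-in for the two Octokit calls used in the diff.
interface StatusClient {
  getJobForWorkflowRun(params: { job_id: number; owner: string; repo: string }): StatusResponse;
  getCheckRun(params: { check_run_id: number; owner: string; repo: string }): StatusResponse;
}

export async function isJobQueued(client: StatusClient, payload: JobRef): Promise<boolean> {
  const target = { owner: payload.repositoryOwner, repo: payload.repositoryName };
  switch (payload.eventType) {
    case 'workflow_job':
      return (await client.getJobForWorkflowRun({ job_id: payload.id, ...target })).data.status === 'queued';
    case 'check_run':
      return (await client.getCheckRun({ check_run_id: payload.id, ...target })).data.status === 'queued';
    default:
      // Unreachable for well-typed callers; guards against unexpected
      // event types arriving in a queue message at runtime.
      throw new Error(`Event ${String((payload as { eventType: unknown }).eventType)} is not supported`);
  }
}
```

Narrowing `eventType` to a union lets the compiler flag unknown event names at call sites, while the `default` branch still protects against messages that were put on the queue by an older or misconfigured webhook.
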
modules/webhook/README.md

Lines changed: 1 addition & 0 deletions
@@ -78,6 +78,7 @@ No Modules.
 | repository\_white\_list | List of repositories allowed to use the github app | `list(string)` | `[]` | no |
 | role\_path | The path that will be added to the role, if not set the environment name will be used. | `string` | `null` | no |
 | role\_permissions\_boundary | Permissions boundary that will be added to the created role for the lambda. | `string` | `null` | no |
+| runner\_extra\_labels | Extra labels for the runners (GitHub). Separate each label by a comma | `string` | `""` | no |
 | sqs\_build\_queue | SQS queue to publish accepted build events. | <pre>object({<br> id = string<br> arn = string<br> })</pre> | n/a | yes |
 | tags | Map of tags that will be added to created resources. By default resources will be tagged with name and environment. | `map(string)` | `{}` | no |
 | webhook\_lambda\_s3\_key | S3 key for webhook lambda function. Required if using S3 bucket to specify lambdas. | `any` | `null` | no |

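The new `runner_extra_labels` input is a comma-separated string, and the README change in this commit states that the webhook only accepts `queued` jobs matching the runner labels. The webhook handler itself is not part of the files shown here, so the following is only a sketch of how such a match might be computed. `parseLabels` and `labelsMatch` are hypothetical names, and both the implied `self-hosted` label and the case-insensitive comparison are assumptions.

```typescript
// Split a comma-separated label string into a list, dropping
// surrounding whitespace and empty entries.
export function parseLabels(csv: string): string[] {
  return csv
    .split(',')
    .map((label) => label.trim())
    .filter((label) => label.length > 0);
}

// True when every label requested by the job is offered by the runner.
// Assumptions: 'self-hosted' is always offered, and labels are compared
// case-insensitively.
export function labelsMatch(jobLabels: string[], runnerExtraLabels: string): boolean {
  const offered = new Set(
    ['self-hosted', ...parseLabels(runnerExtraLabels)].map((label) => label.toLowerCase()),
  );
  return jobLabels.every((label) => offered.has(label.toLowerCase()));
}
```

With the example configuration `runner_extra_labels = "default,example"`, a job declaring `runs-on: [self-hosted, default]` would match under these assumptions, while one asking for an unlisted label would be ignored.
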