Skip to content

Reapply [workflows] Split pr-code-format into two parts to make it more secure (#78215) #80495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 22, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
128 changes: 128 additions & 0 deletions .github/workflows/issue-write.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,128 @@
name: Comment on an issue

on:
workflow_run:
workflows: ["Check code formatting"]
types:
- completed

permissions:
contents: read

jobs:
pr-comment:
runs-on: ubuntu-latest
permissions:
pull-requests: write
if: >
github.event.workflow_run.event == 'pull_request'
steps:
- name: 'Download artifact'
uses: actions/download-artifact@6b208ae046db98c579e8a3aa621ab581ff575935 # v4.1.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this version different than the upload artifact action in the pull_request job? 4.3.0 vs 4.1.1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those two actions are versioned differently.

with:
github-token: ${{ secrets.ISSUE_WRITE_DOWNLOAD_ARTIFACT }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tstellar, we use clang-format check in downstream repository and this split "breaks" some functionality. According to my understanding, we should have ISSUE_WRITE_DOWNLOAD_ARTIFACT secret available to GitHub Actions in our repository in order to have comment from clang-format action. Could you clarify what permissions should be granted to ISSUE_WRITE_DOWNLOAD_ARTIFACT secret, please?
BTW, why can't we use GITHUB_TOKEN secret?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The token needs actions:read permissions. You can't use 'GITHUB_TOKEN`, because it's only scoped for the current workflow run. See https://github.com/actions/download-artifact?tab=readme-ov-file#download-artifacts-from-other-workflow-runs-or-repositories

I do have a workaround that will allow us to download the artifact without a token, but that is part of a larger PR here. If you think it would be useful to avoid using the token, I can try to pull out this change into its own PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying!

I do have a workaround that will allow us to download the artifact without a token, but that is part of a larger PR here. If you think it would be useful to avoid using the token, I can try to pull out this change into its own PR.

Do you plan to use "Unprivileged Download Artifact" in issue-write? If no, we will use the token.

The only problem I have with introducing a new token is that it's kind of unexpected and requires some efforts to investigate why things are broken. If we can make it work out-of-box and keep the upstream project secure, it would be great.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to use "Unprivileged Download Artifact" in issue-write? If no, we will use the token.

Yeah, I was planning to to do this eventually. I'll start looking into this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your help with this. I really appreciate the efforts to make GitHub Actions CI useful for downstream forks. 👍

run-id: ${{ github.event.workflow_run.id }}
name: workflow-args

- name: 'Comment on PR'
uses: actions/github-script@v3
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
var fs = require('fs');
const comments = JSON.parse(fs.readFileSync('./comments'));
if (!comments) {
return;
}

let runInfo = await github.actions.getWorkflowRun({
owner: context.repo.owner,
repo: context.repo.repo,
run_id: context.payload.workflow_run.id
});

console.log(runInfo);


// Query to find the number of the pull request that triggered this job.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need all this? How can it end up with multiple associated pull requests? Or have a different baseRepository?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The associated pull requests are based off of the branch name, so if you create a pull request for a branch, close it, and then create another pull request with the same branch, then this query will return two associated pull requests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, ok. Can you add comments to the code explaining that?

// The associated pull requests are based off of the branch name, so if
// you create a pull request for a branch, close it, and then create
// another pull request with the same branch, then this query will return
// two associated pull requests. This is why we have to fetch all the
// associated pull requests and then iterate through them to find the
// one that is open.
const gql_query = `
query($repo_owner : String!, $repo_name : String!, $branch: String!) {
repository(owner: $repo_owner, name: $repo_name) {
ref (qualifiedName: $branch) {
associatedPullRequests(first: 100) {
nodes {
baseRepository {
owner {
login
}
}
number
state
}
}
}
}
}
`
const gql_variables = {
repo_owner: runInfo.data.head_repository.owner.login,
repo_name: runInfo.data.head_repository.name,
branch: runInfo.data.head_branch
}
const gql_result = await github.graphql(gql_query, gql_variables);
console.log(gql_result);
console.log(gql_result.repository.ref.associatedPullRequests.nodes);

var pr_number = 0;
gql_result.repository.ref.associatedPullRequests.nodes.forEach((pr) => {
if (pr.baseRepository.owner.login = context.repo.owner && pr.state == 'OPEN') {
pr_number = pr.number;
}
});
if (pr_number == 0) {
console.log("Error retrieving pull request number");
return;
}

await comments.forEach(function (comment) {
if (comment.id) {
// Security check: Ensure that this comment was created by
// the github-actions bot, so a malicious input won't overwrite
// a user's comment.
github.issues.getComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: comment.id
}).then((old_comment) => {
console.log(old_comment);
if (old_comment.data.user.login != "github-actions[bot]") {
console.log("Invalid comment id: " + comment.id);
return;
}
github.issues.updateComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: pr_number,
comment_id: comment.id,
body: comment.body
});
});
} else {
github.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: pr_number,
body: comment.body
});
}
});

- name: Dump comments file
if: always()
run: cat comments
20 changes: 13 additions & 7 deletions .github/workflows/pr-code-format.yml
Original file line number Diff line number Diff line change
@@ -1,12 +1,9 @@
name: "Check code formatting"
on:
pull_request_target:
pull_request:
branches:
- main

permissions:
pull-requests: write

jobs:
code_formatter:
runs-on: ubuntu-latest
Expand All @@ -31,12 +28,13 @@ jobs:
separator: ","
skip_initial_fetch: true

# We need to make sure that we aren't executing/using any code from the
# PR for security reasons as we're using pull_request_target. Checkout
# the target branch with the necessary files.
# We need to pull the script from the main branch, so that we ensure
# we get the latest version of this script.
- name: Fetch code formatting utils
uses: actions/checkout@v4
with:
reository: ${{ github.repository }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reository -> repository.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f6c87be

Thanks for pointing this out!

ref: ${{ github.base_ref }}
sparse-checkout: |
llvm/utils/git/requirements_formatting.txt
llvm/utils/git/code-format-helper.py
Expand Down Expand Up @@ -77,8 +75,16 @@ jobs:
# the merge base.
run: |
python ./code-format-tools/llvm/utils/git/code-format-helper.py \
--write-comment-to-file \
--token ${{ secrets.GITHUB_TOKEN }} \
--issue-number $GITHUB_PR_NUMBER \
--start-rev $(git merge-base $START_REV $END_REV) \
--end-rev $END_REV \
--changed-files "$CHANGED_FILES"

- uses: actions/upload-artifact@26f96dfa697d77e81fd5907df203aa23a56210a8 #v4.3.0
if: always()
with:
name: workflow-args
path: |
comments
26 changes: 26 additions & 0 deletions llvm/utils/git/code-format-helper.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ class FormatArgs:
token: str = None
verbose: bool = True
issue_number: int = 0
write_comment_to_file: bool = False

def __init__(self, args: argparse.Namespace = None) -> None:
if not args is None:
Expand All @@ -53,12 +54,14 @@ def __init__(self, args: argparse.Namespace = None) -> None:
self.token = args.token
self.changed_files = args.changed_files
self.issue_number = args.issue_number
self.write_comment_to_file = args.write_comment_to_file


class FormatHelper:
COMMENT_TAG = "<!--LLVM CODE FORMAT COMMENT: {fmt}-->"
name: str
friendly_name: str
comment: dict = None

@property
def comment_tag(self) -> str:
Expand Down Expand Up @@ -119,6 +122,13 @@ def update_pr(self, comment_text: str, args: FormatArgs, create_new: bool) -> No
comment_text = self.comment_tag + "\n\n" + comment_text

existing_comment = self.find_comment(pr)

if args.write_comment_to_file:
self.comment = {"body": comment_text}
if existing_comment:
self.comment["id"] = existing_comment.id
return

if existing_comment:
existing_comment.edit(comment_text)
elif create_new:
Expand Down Expand Up @@ -309,6 +319,8 @@ def hook_main():
if fmt.has_tool():
if not fmt.run(args.changed_files, args):
failed_fmts.append(fmt.name)
if fmt.comment:
comments.append(fmt.comment)
else:
print(f"Couldn't find {fmt.name}, can't check " + fmt.friendly_name.lower())

Expand Down Expand Up @@ -349,6 +361,11 @@ def hook_main():
type=str,
help="Comma separated list of files that has been changed",
)
parser.add_argument(
"--write-comment-to-file",
action="store_true",
help="Don't post comments on the PR, instead write the comments and metadata a file called 'comment'",
)

args = FormatArgs(parser.parse_args())

Expand All @@ -357,9 +374,18 @@ def hook_main():
changed_files = args.changed_files.split(",")

failed_formatters = []
comments = []
for fmt in ALL_FORMATTERS:
if not fmt.run(changed_files, args):
failed_formatters.append(fmt.name)
if fmt.comment:
comments.append(fmt.comment)

if len(comments):
with open("comments", "w") as f:
import json

json.dump(comments, f)

if len(failed_formatters) > 0:
print(f"error: some formatters failed: {' '.join(failed_formatters)}")
Expand Down