Support pagination to fetch _all_ Bitbucket branches #18563

Merged
merged 2 commits into from
Aug 23, 2023
@@ -59,10 +59,18 @@ export class BitbucketRepositoryProvider implements RepositoryProvider {
    async getBranches(user: User, owner: string, repo: string): Promise<Branch[]> {
        const branches: Branch[] = [];
        const api = await this.apiFactory.create(user);

        // Handle pagination.
        let nextPage = 1;
        let isMoreDataAvailable = true;

        while (isMoreDataAvailable) {
@akosyakov (Member), Aug 22, 2023:

Is there a risk of hitting rate limiting here? Can we estimate how much data a similarly configured bbs instance would actually return, and test against that?

Asking, not blocking.

@jankeromnes (Contributor Author), Aug 22, 2023:

Good question! 🎯 Probably yes (just like all other Bitbucket API requests), but I think using a pagelen of 100 mitigates this somewhat (i.e. you'd have to have an incredibly large number of branches in order to get into rate limit territory, and we also do all these requests in sequence and not in parallel).

My feeling here is that it's safe to let any (supposedly very rare / unlikely) rate limit error just bubble up as is if it ever happens, i.e. the Gitpod user would see something like "Unable to fetch branches: Bitbucket API rate limit exceeded".
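
A minimal sketch of how such an error could be turned into that message, for illustration only. The `HttpLikeError` shape and the `describeBranchFetchError` helper are hypothetical and not part of this PR; the sketch assumes the API client exposes the HTTP status code on thrown errors, and relies on 429 (Too Many Requests) being the standard status for rate limiting.

```typescript
// Hypothetical error shape: assumes the API client attaches the HTTP status
// code to the errors it throws. This is an assumption, not confirmed by the PR.
interface HttpLikeError {
    status?: number;
    message?: string;
}

// Hypothetical helper: builds the user-facing message for a failed branch listing.
function describeBranchFetchError(err: HttpLikeError): string {
    // HTTP 429 (Too Many Requests) is the standard status for rate limiting.
    if (err.status === 429) {
        return "Unable to fetch branches: Bitbucket API rate limit exceeded";
    }
    return `Unable to fetch branches: ${err.message ?? "unknown error"}`;
}
```

Because the requests run in sequence, the first failing page would stop the loop and surface a single error rather than a burst of them.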

Member:

For repositories we also have pagination, but as far as I understand it broke at 10000 repositories and was very slow to fetch everything, which broke the ability to create a project.

@jankeromnes (Contributor Author), Aug 22, 2023:

@akosyakov aha -- was that with pagelen: 100 or the default page length? (Optimizing the page length can divide the number of queries required by ~3x)
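
To put rough numbers on the effect of page length, here is a small sketch of the arithmetic. The `requestsNeeded` helper is hypothetical, and it assumes the final page carries no `next` link, so a listing costs one request per full or partial page.

```typescript
// Hypothetical helper: number of sequential requests needed to list
// `total` items when each request returns at most `pagelen` items.
function requestsNeeded(total: number, pagelen: number): number {
    if (pagelen <= 0) {
        throw new Error("pagelen must be positive");
    }
    // Even an empty listing costs one request to discover that it is empty.
    if (total <= 0) {
        return 1;
    }
    return Math.ceil(total / pagelen);
}
```

For example, `requestsNeeded(250, 100)` is 3, and `requestsNeeded(10000, 100)` is 100, so the request count scales inversely with the page length.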

@akosyakov (Member), Aug 23, 2023:

I think it was 10K repos with a page size of 1000, but we were doing it for 2 different properties. @AlexTugarev would be more helpful here, but he is out sick today.

            const response = await api.repositories.listBranches({
                workspace: owner,
                repo_slug: repo,
                sort: "target.date",
                page: String(nextPage),
                pagelen: 100,
            });

            for (const branch of response.data.values!) {
@@ -79,6 +87,14 @@ export class BitbucketRepositoryProvider implements RepositoryProvider {
                });
            }

            // If the response has a "next" property, it indicates there are more pages.
            if (response.data.next) {
                nextPage++;
            } else {
                isMoreDataAvailable = false;
            }
        }

        return branches;
    }
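
The loop above can be read as a general "follow `next` until it is absent" pattern. The sketch below isolates that pattern in a standalone form, as an illustration rather than the code in this PR. The `Page` interface, the `fetchAllPages` helper, and the `maxPages` cap are all hypothetical; the cap in particular is an added safeguard of the kind the review discussion hints at, not something this change implements.

```typescript
// Hypothetical shape of one page of results; mirrors the `values` and `next`
// fields that the diff reads from `response.data`.
interface Page<T> {
    values?: T[];
    next?: string;
}

// Hypothetical helper: requests pages one at a time, in sequence, until a
// page has no `next` link. Throws if `maxPages` is exhausted first
// (this cap is an assumption for illustration, NOT part of the PR).
async function fetchAllPages<T>(
    fetchPage: (page: number) => Promise<Page<T>>,
    maxPages = 100,
): Promise<T[]> {
    const all: T[] = [];
    for (let page = 1; page <= maxPages; page++) {
        const data = await fetchPage(page);
        // Tolerate a missing `values` array instead of asserting it exists.
        all.push(...(data.values ?? []));
        if (!data.next) {
            return all;
        }
    }
    throw new Error(`Gave up after ${maxPages} pages`);
}
```

A caller would pass a function that performs a single paged request, for example one wrapping `api.repositories.listBranches` with `page: String(page)` and returning `response.data`. Failing loudly at the cap is one possible design choice; returning the partial list would be another.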
