Use different Ubuntu pools for Ubuntu tests to avoid disk space issues. #20742

Merged (5 commits) on Apr 11, 2020

NTaylorMullen

  • Updated the default-build.yml to add a new Ubuntu-specific parameter named useHostedUbuntu. If we need to expand the parameter's usage beyond the Ubuntu configuration, we can always rename it to useHosted. I also didn't want to touch isTestingJob because I wasn't sure of the implications.
  • Updated the Ubuntu test job to turn off hosted pools.

Fixes https://github.com/dotnet/aspnetcore-internal/issues/3574

@ghost ghost added the area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework label Apr 10, 2020
Contributor

@dougbu dougbu left a comment

Much appreciated! Let's hope we don't find these agents are missing much.

Eventually, it'll be interesting to get #20704 in and see how much space these guys have lying about.

@NTaylorMullen NTaylorMullen force-pushed the nimullen/changeubuntupools branch from fc86c36 to 7e7c9b4 Compare April 10, 2020 23:25
@NTaylorMullen
Author

@dougbu are those pools/queues correct? It looks like the Linux jobs are attempting to run our Linux scripts on a Windows box:
[image]

    queue: buildpool.ubuntu.1604.amd64.open
  ${{ if eq(variables['System.TeamProject'], 'internal') }}:
    name: NetCoreInternal-Pool
    queue: buildpool.ubuntu.1604.amd64
Contributor

@ilyas1974 and @riarenas how are we messing up here? These jobs are trying to run on Windows agents.

Member

Is useHostedUbuntu somehow undefined here so it is using the default?

Contributor

My suspicion, based on https://helix.dot.net/api/2019-06-17/info/queues (now that I look closely), is that queue names are case-sensitive. I just pushed a commit to see if that's the case.

@@ -83,6 +83,7 @@ jobs:
enableTelemetry: true
helixRepo: dotnet/aspnetcore
helixType: build.product/
useHostedUbuntu: true
Contributor

Whether or not queue names are case-sensitive, this placement was the root cause. I didn't remember that this section is about passing parameters into the job.yml template when talking to @NTaylorMullen about this. This parameter default needs to be near the top of the file ☹️

@dougbu
Contributor

dougbu commented Apr 11, 2020

> Is useHostedUbuntu somehow undefined here so it is using the default?

Yes, that was broken too as I mentioned here. My bad.

@Pilchie
Member

Pilchie commented Apr 11, 2020

Strange that some server tests are failing here, with a file not found.

Also this warning in the build:
https://dev.azure.com/dnceng/public/_build/results?buildId=597050&view=logs&j=b72e85ab-3386-5aa9-6405-3837662d9688&t=f7506d25-98a6-55c7-2099-fa268bf83c45&l=14

warning: Some managed projects depend on NodeJS projects. Building NodeJS is disabled so the managed projects will fallback to using the output from previous builds. The output may not be correct or up to date.

@dougbu
Contributor

dougbu commented Apr 11, 2020

@Pilchie thanks for digging into this!

@ilyas1974 and @riarenas, I'll see if the missing file errors are all related to not having node installed and let you know if anything else is missing from the images that we need. If it's just that tool, I'll add the node install task in the problem child job.

@dougbu
Contributor

dougbu commented Apr 11, 2020

I'm not sure why the build said it didn't find node but suspect it's not the core problem. Instead, the nginx installation failed because it couldn't find PCRE:

checking for PCRE library ... not found
checking for PCRE library in /usr/local/ ... not found
checking for PCRE library in /usr/include/pcre/ ... not found
checking for PCRE library in /usr/pkg/ ... not found
checking for PCRE library in /opt/local/ ... not found

./configure: error: the HTTP rewrite module requires the PCRE library.
You can either disable the module by using --without-http_rewrite_module
option, or install the PCRE library into the system, or build the PCRE library
statically from the source with nginx by using --with-pcre=<path> option.

make: *** No rule to make target 'build', needed by 'default'.  Stop.
make: *** No rule to make target 'install'.  Stop.

@BrennanConroy @anurse do we actually need the HTTP rewrite module? Otherwise, what package should we ask for? https://packages.ubuntu.com/xenial/pcre2-utils looks close but is lower level than what the agents usually deal with.

Of course, continuing after the error implies eng/scripts/install-nginx-linux.sh and probably install-nginx-mac.sh need set -euo pipefail. (Does this command work on macOS?)
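
A minimal sketch of the failure mode described above, assuming bash is available. The `steps` string is a hypothetical stand-in for the install script (`false` plays the failing `./configure`, the `echo` plays the later `make` steps); it is not the contents of eng/scripts/install-nginx-linux.sh.

```shell
# Hypothetical stand-in for the install steps: a failing command, then a later step.
steps='false; echo "make reached"'

# Without strict options the script keeps going and reports success.
lenient_status=0
lenient_out=$(bash -c "$steps") || lenient_status=$?

# With set -euo pipefail the first failure stops the script with a non-zero status.
strict_status=0
strict_out=$(bash -c "set -euo pipefail; $steps") || strict_status=$?

echo "lenient: status=$lenient_status output=[$lenient_out]"
echo "strict:  status=$strict_status output=[$strict_out]"
```

Under these assumptions the lenient run reaches the later step and exits 0, while the strict run stops at the failing command, so the CI job would surface the configure error instead of the misleading `make` messages.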

@BrennanConroy
Member

I doubt we need http rewrite
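
If the rewrite module is not needed, the configure error quoted earlier names the option that disables it. A sketch, assuming an unpacked nginx source tree in the current directory; the function name `configure_nginx`, the `NGINX_PREFIX` variable, and the default prefix are hypothetical, and whatever other options the real install script passes are not shown in this thread.

```shell
# Hypothetical helper: run nginx's configure without the HTTP rewrite module,
# so the PCRE library is no longer required.
configure_nginx() {
  # $1: install prefix (assumed; the real script's prefix is not shown here)
  ./configure --prefix="$1" --without-http_rewrite_module
}

# Only invoke it when a configure script is actually present.
if [ -x ./configure ]; then
  configure_nginx "${NGINX_PREFIX:-/tmp/nginx-install}"
fi
```

The alternative mentioned in the error text is to install PCRE on the agents, which keeps the module but adds an image dependency.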

- also add `set -euo pipefail` to the script for fast failure
@dougbu
Contributor

dougbu commented Apr 11, 2020

Wow, I think we're done with the Ubuntu test job timing out 🚀 Thanks @NTaylorMullen @ilyas1974 @Pilchie @BrennanConroy @riarenas @pranavkm @MattGal @markwilkie and anyone I'm forgetting for your help❕

I'll get this in tonight or tomorrow morning, depending on the remaining two (unchanged) build jobs.

@dougbu dougbu merged commit 51f69a6 into master Apr 11, 2020
@dougbu dougbu deleted the nimullen/changeubuntupools branch April 11, 2020 06:09