Skip to content

Add conda support to nightly flake test #10523

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 34 commits into from
Mar 11, 2020
Merged

Conversation

rchiodo
Copy link

@rchiodo rchiodo commented Mar 11, 2020

For #10134 - Conda support for tests

Last pipeline run is here. Mostly working so going to submit for now. I'll work on the last couple of failures later
https://dev.azure.com/ms/vscode-python/_build/results?buildId=67498&view=results

Essentially this change enables all of the interpreter services inside the test container. As a result of this there seems to be a memory leak caused by promises never finishing. So part of this work was to cleanup a bunch of stuff during dispose.

@rchiodo rchiodo self-assigned this Mar 11, 2020
@codecov-io
Copy link

codecov-io commented Mar 11, 2020

Codecov Report

Merging #10523 into master will decrease coverage by 0.11%.
The diff coverage is 37%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #10523      +/-   ##
==========================================
- Coverage   60.86%   60.74%   -0.12%     
==========================================
  Files         579      578       -1     
  Lines       31294    31355      +61     
  Branches     4451     4463      +12     
==========================================
+ Hits        19046    19047       +1     
- Misses      11285    11339      +54     
- Partials      963      969       +6
Impacted Files Coverage Δ
...ient/datascience/interactive-ipynb/nativeEditor.ts 11.76% <ø> (+0.35%) ⬆️
src/client/ioc/types.ts 100% <ø> (ø) ⬆️
...reter/locators/services/cacheableLocatorService.ts 78.7% <0%> (-1.49%) ⬇️
src/client/datascience/jupyter/jupyterExecution.ts 49.36% <0%> (-0.32%) ⬇️
...atascience/jupyter/liveshare/guestJupyterServer.ts 11.76% <0%> (ø) ⬆️
src/client/common/process/proc.ts 14.49% <0%> (-0.73%) ⬇️
src/client/datascience/jupyter/notebookStarter.ts 65.25% <0%> (ø) ⬆️
src/client/common/process/pythonDaemon.ts 19.68% <0%> (-0.11%) ⬇️
src/client/datascience/jupyter/jupyterServer.ts 7.36% <0%> (-0.16%) ⬇️
.../datascience/interactive-common/interactiveBase.ts 5.52% <0%> (-0.02%) ⬇️
... and 23 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cd90761...1f03b7c. Read the comment docs.

@@ -0,0 +1,7 @@
name: conda_env_2
dependencies:
- python=3.8

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand the purpose of creating two environments.
We have P37 and P38. Which will be used in tests?
I think P38 will always be used. If that's the case, then why install pandas, jupyter, matplotlib in P37?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the tests uses two environments now.


In reply to: 391113722 [](ancestors = 391113722)

@@ -50,22 +50,6 @@ stages:
steps:
- template: templates/test_phases.yml

- job: 'Py36'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we removing 3.6?
At a minimum I would have expected to test against 37 as thats the most popular version of Python used, sure soon it will be 38.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This didn't actually test against 36. The conda environment determines what version of python is used now.


In reply to: 391116118 [](ancestors = 391116118)

oldAsk = settings.datascience.askForKernelRestart;
settings.datascience.askForKernelRestart = false;
}
await this.restartKernel(true);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this is a breaking change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking how? We talked it over and decided it was a mistake to restart the kernel. Instead will just close the kernel.


In reply to: 391116888 [](ancestors = 391116888)


// The session manager can actually be stuck in the context of a timer. Clear out the specs inside of
// it so the memory for the session is minimized. Otherwise functional tests can run out of memory
if (sessionManager._specs) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code calling this.sessionManager.shutdownAll() and this.sessionManager.dispose should be sufficient (does the same thing).
I'd prefer that, as that's a public api (instead of this approach).

I.e.

try {
    await this.sessionManager.shutdownAll();
} catch {
  	// noop.
} finally {
    this.sessionManager.dispose();
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that doesn't do the same thing, hence why I added this code. There's bugs in the jupyterlab services implementation. At least with respect to shutdown


In reply to: 391123563 [](ancestors = 391123563)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They never stop their polling.


In reply to: 391135964 [](ancestors = 391135964,391123563)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shutdownAll would probably be good though. I'll add that.


In reply to: 391136172 [](ancestors = 391136172,391135964,391123563)

];

export class EnvironmentActivationServiceCache {
private static useStatic = false;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some comments explaining the purpose of this class would be useful.
Specially for core extension devs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay


In reply to: 391124953 [](ancestors = 391124953)

} catch (exc) {
// Special case. Conda for some versions will state a file is in use. If
// that's the case, wait and try again. This happens especially on AzDo
const excString = exc.toString();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to add comments about the issue this is trying to solve (probably link to issue number). Right now its not obvious that we're trying to work around some conda issue.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. its not obvious as to why we even need a retry.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll link to the issue and explain here that conda sucks.


In reply to: 391126817 [](ancestors = 391126817)

if (result.stderr && result.stderr.length > 0) {
throw new Error(`StdErr from ShellExec, ${result.stderr}`);
let result: ExecutionResult<string> | undefined;
while (!result) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels dangerous to keep trying indefinitely. I'd gate this to 10-20 or other retries. I.e. max out at 5-10-20 seconds.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay sure. I suppose it's possible it could get stuck indefinitely.


In reply to: 391128679 [](ancestors = 391128679)

@@ -2144,11 +2151,18 @@ df.head()`;
await addCell(wrapper, ioc, 'a', false);
}

async function updateFileConfig(key: string, value: any) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome

# Run the pip installs in the 3 environments (windows)
- script: |
call activate base
conda install --quiet -y --file build/ci/conda_base.yml

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why install in base environment?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the functional-requirements.txt seems to install jupyter into the base conda after conda is added to the path. So we need the pandas/matplotlib stuff in the base environment then too. It ends up being the first environment.


In reply to: 391131125 [](ancestors = 391131125)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm going to leave this as is for now (it's just another environment). the original intent was to have base have nothing in it. For some reason we're detecting jupyter in it though. Maybe the anaconda install does this without our help.


In reply to: 391138171 [](ancestors = 391138171,391131125)

@@ -74,7 +74,7 @@ export class GuestJupyterServer
this.dataScience.activationStartTime
);
this.notebooks.set(identity.toString(), result);
const oldDispose = result.dispose;
const oldDispose = result.dispose.bind(result);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would have been one reason why things weren't shutting down..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly. This would only be on the live share tests though


In reply to: 391131614 [](ancestors = 391131614)

this.serviceManager.addSingleton<IInterpreterService>(IInterpreterService, InterpreterService);

// Make sure full interpreter services are available.
registerInterpreterTypes(this.serviceManager);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

registerInterpreterTypes [](start = 12, length = 24)

This line here makes all of the conda environments be found.

@@ -5,6 +5,7 @@ import { ContentsManager, Kernel, ServerConnection, Session, SessionManager } fr
import { Agent as HttpsAgent } from 'https';
import { CancellationToken } from 'vscode-jsonrpc';

import { noop } from 'jquery';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a mistaken autoimport. I mean a noop is noop, but we have our own one in utils that we use everywhere else. Don't think we want jquery here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I'll fix it.

oldAsk = settings.datascience.askForKernelRestart;
settings.datascience.askForKernelRestart = false;
}
await this.restartKernel(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a day ago I changed restartKernel to pass a bool to not log telemetry just for this case (as it was not user triggered restart). We could remove that now as it's not needed.

// tslint:disable:no-any no-require-imports no-var-requires
// Not sure why but on windows, if you execute a process from the System32 directory, it will just crash Node.
// Not throw an exception, just make node exit.
// However if a system32 process is run first, everything works.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine I guess, but that's super weird.

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities (and Security Hotspot 0 Security Hotspots to review)
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication

@rchiodo rchiodo merged commit 1b04ed7 into master Mar 11, 2020
@rchiodo rchiodo deleted the rchiodo/test_with_anaconda branch March 11, 2020 23:39
DonJayamanne added a commit that referenced this pull request Mar 13, 2020
* master:
  Fix flakey file system tests (#10541)
  Tweaks for gather (#10532)
  Fix #10437: Update launch.json handling to support "listen" and "connect" (#10517)
  Add conda support to nightly flake test (#10523)
  Rename datascience to datascience_modules (#10525)
  Clean up the extension activation code. (#10455)
  Allow escape and ctrl+U to clear the interactive window (#10513)
  Fix builtins so they don't interfere with our execution (#10511)
  Jupyter autocompletion will only show up on empty lines, (#10420)
  notify on missing kernel and interpreter with kernel (#10508)
@lock lock bot locked as resolved and limited conversation to collaborators Mar 21, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants