Skip to content

Use regex+getTokenAtPosition to find dynamic import #28104

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 24, 2018

Conversation

sandersn
Copy link
Member

Use regex+getTokenAtPosition to find dynamic imports (and requires) instead of walking the entire tree. A tree walk overflows the stack for very deep trees. I assume, though I have not measured it, that this method is much faster for JS files, which may not have any occurrences of the words import or require, but would previously be walked.

All existing tests pass, and the stack overflow in #27433 no longer repros. However, a test based on the large JS file in that repo now fails in the binder. So I'll leave #27433 open for now even though technically this change makes antd-admin compile successfully.

Instead of walking the entire tree. This stack overflows for large
trees.

Still need to adapt a test.
@sandersn sandersn requested review from a user and weswigham October 24, 2018 17:40
@sandersn sandersn merged commit ff6f947 into master Oct 24, 2018
@sandersn sandersn deleted the refactor-collect-dynamic-import branch October 24, 2018 18:27
function collectDynamicImportOrRequireCallsForEachChild(node: Node) {
forEachChild(node, collectDynamicImportOrRequireCalls);
/** Returns a token if position is in [start-of-leading-trivia, end) */
function getTokenAtPosition(sourceFile: SourceFile, position: number): Node {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this differ from getTokenAtPosition in services/utitilities.ts? if not, it could be moved to compiler/utilities.ts and also used for services

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it uses Node.pos and Node.end instead of the much-more-complicated Node.getStart() and Node.getFullStart() that are present in the language service. The original getTokenAtPosition handles a lot more cases as well; I'd rather wait until something else in the compiler proper needs this code to move it into utilities and harden it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. In addition it doesn't use node.getChildren to get the actual token. So at least it should be renamed to getNodeAtPosition to avoid confusion.

if (hasJSDocNodes(node)) {
forEach(node.jsDoc, collectDynamicImportOrRequireCallsForEachChild);
function collectDynamicImportOrRequireCalls(file: SourceFile) {
const r = /import|require/g;
Copy link
Contributor

@ajafff ajafff Oct 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could tweak the regex to avoid false positives. though you need to adjust the loop a bit to save the result of r.exec to a variable.

/\b(?:import|require)\s*[(/<]/g

The above will not match require.resolve for example as it requires an opening parenthesis or a slash (typically the start of a comment) after it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'd save much in the majority of files, and I prefer to have the simplest possible regex.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR is about performance, I'd rather we are careful to not regress performance where possible.
There is require.resolve, import.meta, in a project of mine there's a lot requiresTypeInformation and in addition these are regular verbs that could occur in a lot of comments.
Each time you encounter something that contains one of these words, you need to recursively look up the node in the AST. That can potentially be very slow for large files with deep nesting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the new collector is m log(n) where m is the number of import|require and n is the number of nodes. The old collector is n log(n), and I think m will only approach n in pathological cases.

Also, the compiler is going to crash on very deep ASTs in the binder anyway. I tried to fix that by adding a flat OperatorListExpression to replace deep BinaryExpression trees, but gave up (for now) because of the amount of work needed to update all our BinaryExpression-based code.

}
};
while (true) {
const child = hasJSDocNodes(current) && forEach(current.jsDoc, getContainingChild) || forEachChild(current, getContainingChild);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably unrelated to this change: IMO it should only look at JSDoc in JS files as ImportTypeNodes in JSDoc of TS files have no effect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants