Improve the performance of the Kore parser #2296

MirceaS · 2020-12-02T16:32:04Z

Review checklist

The author performs the actions on the checklist. The reviewer evaluates the work and checks the boxes as they are completed.

Summary. Write a summary of the changes. Explain what you did to fix the issue, and why you did it. Present the changes in a logical order. Instead of writing a summary in the pull request, you may push a clean Git history.
Documentation. Write documentation for new functions. Update documentation for functions that changed, or complete documentation where it is missing.
Tests. Write unit tests for every change. Write the unit tests that were missing before the changes. Include any examples from the reported issue as integration tests.
Clean up. The changes are already clean. Clean up anything near the changes that you noticed while working. This does not mean only spatially near the changes, but logically near: any code that interacts with the changes!

Using Text instead of String reduces peak memory use by 43% and total allocation by 24%. Never use String.

MirceaS · 2020-12-02T21:37:14Z

kore/app/share/GlobalMain.hs

+        Text.readFile fileName
+        & liftIO
+        & clockSomethingIO "Reading the input file"


@ttuegel Why are these the other way round now?

ttuegel · 2020-12-09T02:29:28Z

I made some changes to your parser performance pull request which reduce the memory use significantly. (For some reason we were still using String. Never use String.) I would like you to try a few more things:

Avoid using Text.cons in the parser. Use lookAhead or match instead. Text.cons causes copying, but the tokens we are parsing are already in contiguous blocks of memory.
Refactor parseAnyId so that it only calls parseIntoId once. Right now, each branch calls parseIntoId separately. That incurs up to three calls to getSourcePos per token parsed.
Investigate the performance of validation.
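
A minimal sketch of the second suggestion, assuming Megaparsec over Text. The names `Id`, `intoId`, `prefixed`, `plain`, `anyIdPerBranch`, and `anyIdOnce` are hypothetical stand-ins and are not the actual definitions of parseIntoId or parseAnyId in Kore.Parser.Lexer. The idea is to wrap the whole alternation once, so that getSourcePos runs once per token instead of once per branch tried.

```haskell
import Control.Applicative ((<|>))
import Data.Char (isAlpha, isAlphaNum)
import Data.Text (Text)
import qualified Data.Text as Text
import Data.Void (Void)
import Text.Megaparsec (Parsec, SourcePos)
import qualified Text.Megaparsec as Parser
import qualified Text.Megaparsec.Char as Parser

type Parser = Parsec Void Text

-- Hypothetical identifier type: a name plus the position where it starts.
data Id = Id { idName :: Text, idLocation :: SourcePos }
    deriving (Eq, Show)

-- Hypothetical analogue of parseIntoId: record the position, then parse the name.
intoId :: Parser Text -> Parser Id
intoId nameParser = do
    pos <- Parser.getSourcePos
    name <- nameParser
    pure (Id name pos)

-- Hypothetical branch: an identifier introduced by a sigil such as '\\' or '@'.
prefixed :: Char -> Parser Text
prefixed sigil = fst <$> Parser.match
    (Parser.char sigil >> Parser.takeWhile1P (Just "identifier character") isAlphaNum)

-- Hypothetical branch: a plain identifier starting with a letter.
plain :: Parser Text
plain = fst <$> Parser.match
    (Parser.satisfy isAlpha >> Parser.takeWhileP (Just "identifier character") isAlphaNum)

-- Before: every branch calls intoId, so getSourcePos may run up to three times.
anyIdPerBranch :: Parser Id
anyIdPerBranch = intoId (prefixed '\\') <|> intoId (prefixed '@') <|> intoId plain

-- After: choose the branch first, then call intoId exactly once.
anyIdOnce :: Parser Id
anyIdOnce = intoId (prefixed '\\' <|> prefixed '@' <|> plain)

-- Small usage helper for experimenting in GHCi.
parseId :: String -> Maybe Id
parseId = Parser.parseMaybe anyIdOnce . Text.pack
```

Both parsers accept the same inputs and produce the same `Id`; only the number of position lookups differs.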

ttuegel

Please remove the redundant look-ahead, and then we can merge this.

ttuegel · 2020-12-09T02:30:20Z

kore/app/share/GlobalMain.hs

+        Text.readFile fileName
+        & liftIO
+        & clockSomethingIO "Reading the input file"


I changed it from ($) to (&) so that the action (Text.readFile) would appear before the wrappers.
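
For readers unfamiliar with (&) from Data.Function, here is a small sketch of the two spellings. `withLabel` is a hypothetical stand-in for a wrapper such as clockSomethingIO, and its behavior here is an assumption made only for illustration. Both functions do the same thing; only the reading order differs.

```haskell
import Data.Function ((&))
import Data.Text (Text)
import qualified Data.Text.IO as Text

-- Hypothetical wrapper standing in for something like clockSomethingIO.
withLabel :: String -> IO a -> IO a
withLabel label action = do
    putStrLn ("start: " <> label)
    result <- action
    putStrLn ("done: " <> label)
    pure result

-- With ($): the wrapper is written first and the action comes last.
readWithDollar :: FilePath -> IO Text
readWithDollar fileName =
    withLabel "Reading the input file" $ Text.readFile fileName

-- With (&): the action is written first and the wrappers follow it.
readWithAmpersand :: FilePath -> IO Text
readWithAmpersand fileName =
    Text.readFile fileName
        & withLabel "Reading the input file"
```

Since `x & f` is defined as `f x`, additional wrappers can be appended on new lines without re-nesting the expression.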

ttuegel · 2020-12-09T02:32:12Z

kore/src/Kore/Parser/Lexer.hs

+    (genericId, _) <- Parser.match
+        $ (Parser.satisfy isFirstChar <?> "first identifier character")
+        >> Parser.takeWhileP (Just "identifier character") isBodyChar


I think this is easier to read:

Suggested change

-    (genericId, _) <- Parser.match
-        $ (Parser.satisfy isFirstChar <?> "first identifier character")
-        >> Parser.takeWhileP (Just "identifier character") isBodyChar
+    (genericId, _) <- Parser.match $ do
+        _ <- Parser.satisfy isFirstChar <?> "first identifier character"
+        _ <- Parser.takeWhileP (Just "identifier character") isBodyChar
+        pure ()
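
For context on why match is used here instead of Text.cons, the following is a hedged, self-contained sketch assuming Megaparsec over Text. The predicates `isAlpha` and `isAlphaNum` are stand-ins for isFirstChar and isBodyChar, whose real definitions are not shown in this thread, and `identifierCons` and `identifierMatch` are hypothetical names.

```haskell
import Data.Char (isAlpha, isAlphaNum)
import Data.Text (Text)
import qualified Data.Text as Text
import Data.Void (Void)
import Text.Megaparsec (Parsec, (<?>))
import qualified Text.Megaparsec as Parser

type Parser = Parsec Void Text

-- Rebuilds the identifier with Text.cons, which allocates a new Text
-- and copies the characters that were just parsed.
identifierCons :: Parser Text
identifierCons = do
    c <- Parser.satisfy isAlpha <?> "first identifier character"
    cs <- Parser.takeWhileP (Just "identifier character") isAlphaNum
    pure (Text.cons c cs)

-- Lets match hand back the input consumed by the inner parser, so the
-- individual results can be discarded and nothing is re-assembled.
identifierMatch :: Parser Text
identifierMatch = do
    (genericId, _) <- Parser.match $ do
        _ <- Parser.satisfy isAlpha <?> "first identifier character"
        _ <- Parser.takeWhileP (Just "identifier character") isAlphaNum
        pure ()
    pure genericId
```

Both parsers return the same Text for the same input. The match version should avoid the extra copy because the consumed input is already one contiguous block, though the actual saving is best confirmed by profiling.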

ttuegel · 2020-12-09T02:33:19Z

kore/src/Kore/Parser/Lexer.hs

-    then do
-        skipChar '\\'
-        (c :) <$> parseIdRaw KeywordsPermitted
+    then fst <$> Parser.match


Using match here makes the look-ahead (above) redundant.

* Inline some primitive parsers
* Kore.Parser.Lexer: Remove unused primitive parsers
* Run Parser over Text, not String

Using Text instead of String reduces peak memory use by 43% and total allocation by 24%. Never use String.

Co-authored-by: Thomas Tuegel

inlined space and stringParserToIdParser Lexer functions

69c9474

MirceaS changed the title ~~inlined space and stringParserToIdParser Lexer functions~~ Improve parser efficiency Dec 2, 2020

ttuegel changed the title ~~Improve parser efficiency~~ Inline some low-level parsers Dec 2, 2020

ttuegel added 3 commits December 2, 2020 14:03

Merge branch 'master' into 2189

69ad76e

Kore.Parser.Lexer: Remove unused primitive parsers

663cfc5

Run Parser over Text, not String

3047aa1

Using Text instead of String reduces peak memory use by 43% and total allocation by 24%. Never use String.

MirceaS commented Dec 2, 2020

MirceaS added 2 commits December 4, 2020 16:41

fixed unit tests

900d983

addressed comments

bc05ca9

ttuegel requested review from ttuegel and andreiburdusa December 8, 2020 15:09

ttuegel changed the title ~~Inline some low-level parsers~~ Improve the performance of the Kore parser Dec 8, 2020

Merge branch 'master' into 2189

54d255f

ttuegel suggested changes Dec 9, 2020

addressed PR comments

1ba70f8

andreiburdusa approved these changes Dec 9, 2020

ttuegel self-requested a review December 9, 2020 15:10

ttuegel approved these changes Dec 9, 2020

ttuegel merged commit 08b378f into master Dec 9, 2020

ttuegel deleted the 2189 branch December 9, 2020 17:33
