Skip to content

[SUGGESTION][FIX] Add support for raw string literals in mixed mode #69

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

filipsajdak
Copy link
Contributor

@filipsajdak filipsajdak commented Oct 11, 2022

Current implementation in mixed mode is not parsing raw string literals well.

Cpp2 source code

auto r = R"(
    this is raw string literal
    \n
    /*   */
    //
i: int = 42;
)";

/*
 *
 *   this is a comment
 *
 */

main: () -> int = {
    std::cout << "Hello world\n\n" << std::quoted(r) << std::endl;
}

Compiles to

// ----- Cpp2 support -----
#include "cpp2util.h"

#line 1 "/Users/filipsajdak/dev/execspec/external/tests/raw_string.cpp2"

auto r = R"(
    this is raw string literal
    \n
    /*   */
    //

#line 7 "/Users/filipsajdak/dev/execspec/external/tests/raw_string.cpp2"
#line 8 "/Users/filipsajdak/dev/execspec/external/tests/raw_string.cpp2"
)";

#line 16 "/Users/filipsajdak/dev/execspec/external/tests/raw_string.cpp2"
[[nodiscard]] auto main() -> int;

//=== Cpp2 definitions ==========================================================

#line 7 "/Users/filipsajdak/dev/execspec/external/tests/raw_string.cpp2"
int i { 42 }; 
#line 9 "/Users/filipsajdak/dev/execspec/external/tests/raw_string.cpp2"

/*
 
    this is a comment
 
 */

[[nodiscard]] auto main() -> int{
    std::cout << "Hello world\n\n" << std::quoted(r) << std::endl;
}

The proposed change makes cppfront parse raw string literals in a similar way as it parses /* */ comments. After this change cppfront produces:

// ----- Cpp2 support -----
#include "cpp2util.h"

#line 1 "/Users/filipsajdak/dev/execspec/external/tests/raw_string.cpp2"

auto r = R"(
    this is raw string literal
    \n
    /*   */
    //
i: int = 42;
)";

#line 16 "/Users/filipsajdak/dev/execspec/external/tests/raw_string.cpp2"
[[nodiscard]] auto main() -> int;

//=== Cpp2 definitions ==========================================================

#line 9 "/Users/filipsajdak/dev/execspec/external/tests/raw_string.cpp2"

/*
 
    this is a comment
 
 */

[[nodiscard]] auto main() -> int{
    std::cout << "Hello world\n\n" << std::quoted(r) << std::endl;
}

Debug info shows that the lines are recognized as /* R */ which was added symbol to mark raw string literals lines.

/*   */ 
/* 1 */ auto r = R"(
/* R */     this is raw string literal
/* R */     \n
/* R */     /*   */
/* R */     //
/* R */ i: int = 42;
/* 1 */ )";
/* 2 */ 
/* 2 */ /*
/* 2 */  *
/* 2 */  *   this is a comment
/* 2 */  *
/* 2 */  */
/* 2 */ 
/* 2 */ main: () -> int = {
/* 2 */     std::cout << "Hello world\n\n" << std::quoted(r) << std::endl;
/* 2 */ }

Close #62

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Oct 11, 2022

In the last push, I have added also support for d-char-sequence from spec:

prefix(optional) R"d-char-sequence(optional)
(r-char-sequence(optional))d-char-sequence(optional)"

The only thing I did not check yet is the prefix.

The current implementation can handle:

auto r = R"abc(
    this is raw string literal
    \n
    /*   */
    R"(this should be ignored)"
    R"ab(this should be also ignored)ab"
    //
i: int = 42;
)abc";

@filipsajdak filipsajdak changed the title [Suggestion] Add support for raw string literals in mixed mode [SUGGESTION][FIX] Add support for raw string literals in mixed mode Oct 11, 2022
@filipsajdak
Copy link
Contributor Author

I have tested the prefixes and it seems to work, too.

@hsutter
Copy link
Owner

hsutter commented Oct 18, 2022

I actually completely forgot about raw string literals... thanks for doing this. At first blush this looks good, before merging let me take a deeper look with a fresh brain and a fresh cup of coffee...

Did you check the three /passthrough-tests files to make sure they still are diff-identical when passed through cppfront?

@hsutter hsutter self-assigned this Oct 18, 2022
@filipsajdak
Copy link
Contributor Author

No, unfortunately I did not do this.

Will check that.

@JohelEGP
Copy link
Contributor

I think https://github.com/modern-cmake/cppfront does that.

@filipsajdak filipsajdak force-pushed the fsajdak-add-handling-of-cpp1-rawstringliterals branch from d942cbf to 6feefab Compare October 18, 2022 12:10
@filipsajdak
Copy link
Contributor Author

I have rebased my changes to the main and done the passthrough tests.

clang & GCC files are the same.

MSVC ends with the error:

cppfront % ../../build/external/cppfront passthrough-tests/msvc-msstl-e.cpp2      
passthrough-tests/msvc-msstl-e.cpp2...
msvc-msstl-e.cpp2(174324,0): error: closing } does not match a prior {

Need to check what caused the error.

@filipsajdak filipsajdak force-pushed the fsajdak-add-handling-of-cpp1-rawstringliterals branch 2 times, most recently from c4e1a86 to 4c91b78 Compare October 18, 2022 17:14
@filipsajdak
Copy link
Contributor Author

OK, found a bug and fix it.

The issue was here

diff --git a/source/load.h b/source/load.h
index 71fd857..9954a02 100644
--- a/source/load.h
+++ b/source/load.h
@@ -248,7 +248,7 @@ auto process_cpp_line(
                 return r;
             }
             in_raw_string_literal = false;
-            i += end_pos+d_char_sequence.size()-1;
+            i = end_pos+d_char_sequence.size()-1;
         }
         else {
             r.all_comment_line = false;

Small but nasty mistake. Now all files compiles and generate identical output.

@filipsajdak filipsajdak force-pushed the fsajdak-add-handling-of-cpp1-rawstringliterals branch from 4c91b78 to b628e33 Compare October 30, 2022 19:51
@filipsajdak
Copy link
Contributor Author

Rebased to main - all passthrough-tests and regression-tests passed.

@filipsajdak filipsajdak force-pushed the fsajdak-add-handling-of-cpp1-rawstringliterals branch 3 times, most recently from 96d319c to 4945177 Compare December 5, 2022 08:24
@filipsajdak filipsajdak force-pushed the fsajdak-add-handling-of-cpp1-rawstringliterals branch from 4945177 to 14b8cf7 Compare December 6, 2022 21:51
@hsutter
Copy link
Owner

hsutter commented Dec 11, 2022

I tried this out with the passthrough-tests, starting with running cppfront msvc-msstl-e.cpp2, but it crashed.

Running it under the debugger, I see that the crash is an out-of-bounds access attempt in the second-last line in process_cpp_line:

prev = line[i];

where i was about 20-30 higher than line.size().

I didn't investigate further, but this should give you a repro and a roadmap to find the bug... can you take a look?

@filipsajdak
Copy link
Contributor Author

Hmm... I am surprised that it crashed. I will investigate

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Dec 11, 2022

I have trouble reproducing the crash. I even have added an assert before the line that you have spotted (lines 304-306):

    assert(i < line.size());
    prev = line[i];
}

There was a bug that I fixed (#69 (comment)), and it is already pushed.

Can you check if line 251 uses = instead of +=? The correct line should look like this:

i = end_pos+raw_string_closing_seq.size()-1;

I have compiled my cppfront in clang

Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: arm64-apple-darwin22.1.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

@hsutter Can you tell me which compiler you have used?

@filipsajdak
Copy link
Contributor Author

The interesting fact is that only MSVC is using raw string literals in its lib. It is always inline and finalised with the string_view suffix. E.g.:

_Result += R"(: ")"sv;

so d_char_sequence will be empty.

@filipsajdak
Copy link
Contributor Author

@hsutter I have rebased my changes on the newest main and still cannot reproduce. I will probably need to check on other compilers.

@filipsajdak filipsajdak force-pushed the fsajdak-add-handling-of-cpp1-rawstringliterals branch from 14b8cf7 to 5bece77 Compare December 12, 2022 00:11
@filipsajdak
Copy link
Contributor Author

@hsutter I am confused. I have tried it on MSVC and it works as well. I will need more information To be able to reproduce it.

My very first guess is that it looks like the previous bug that I have fixed (check above). Some merge issues might also cause it.

@hsutter
Copy link
Owner

hsutter commented Dec 12, 2022

OK, I'll take another look. Thanks!

@filipsajdak
Copy link
Contributor Author

@hsutter if it will not work, please send me the code you have via email I will verify what is wrong.

@filipsajdak filipsajdak force-pushed the fsajdak-add-handling-of-cpp1-rawstringliterals branch from 5bece77 to cd3160a Compare December 13, 2022 22:30
@hsutter
Copy link
Owner

hsutter commented Dec 15, 2022

Good news: It works fine!

Sorry for the noise: The test failure was almost certainly my error when I opened the branch in GitHub Desktop and must just not have had all the latest commits somehow. (I did try pulling the later commits, but there was a conflict and I'm just not familiar enough with switching to PR branches in GitHub Desktop, and I'm sadly not a git command-line person.)

So I have verified the PR by the time-honored method of manually applying it to my main branch and running the passthrough tests, and they work fine. So now I'll revert that and merge this PR. :) Thanks again.

@hsutter hsutter merged commit 24c4bd5 into hsutter:main Dec 15, 2022
@filipsajdak
Copy link
Contributor Author

Happy to hear that it's working.

The good thing is that this situation motivated me to run sanitizers which helped in discovering another issue.

@filipsajdak filipsajdak deleted the fsajdak-add-handling-of-cpp1-rawstringliterals branch December 22, 2022 10:43
Azmah-Bad pushed a commit to Azmah-Bad/cppfront that referenced this pull request Feb 24, 2023
…f-cpp1-rawstringliterals

[SUGGESTION][FIX] Add support for raw string literals in mixed mode
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] In mixed mode cppfront can interfere with string done with raw string literal
3 participants