Commit f07f15f

feat(devin-lang): improve code fence parsing to support embedded languages #101
This commit updates `DevInLexer.flex` so that content inside a code fence is lexed as code even when it contains `@`, `/`, or `$`, the characters that otherwise start DevInLang agent, command, and variable blocks. In the `YYINITIAL` state, each of these characters now transitions to its block type (`AGENT_BLOCK`, `COMMAND_BLOCK`, or `VARIABLE_BLOCK`) only when the lexer is not inside a code fence, that is, when `isCodeStart` is false. Inside a fence, the character is pushed back and the lexer enters the `CODE_BLOCK` state instead. `CODE_CONTENT` is widened from `([^$/@\n]+ )` to `[^\n]+` so that it can consume the rest of such a line. A new test case, `testJavaAnnotation` in `DevInParsingTest.kt`, checks that Java annotations inside a `java` code fence parse as code content. The expected trees for the existing `EmptyCodeFence` and `JavaHelloWorld` tests now use `ASTWrapperPsiElement(CODE_CONTENTS)` in place of `DevInCodeContentsImpl(CODE_CONTENTS)`.
1 parent 31fe212 commit f07f15f

6 files changed: +36 -6 lines changed

exts/devin-lang/src/grammar/DevInLexer.flex

Lines changed: 5 additions & 4 deletions
@@ -36,7 +36,7 @@ COMMAND_ID=[a-zA-Z0-9][_\-a-zA-Z0-9]*
 LANGUAGE_ID=[a-zA-Z0-9][_\-a-zA-Z0-9]*

 TEXT_SEGMENT=[^$/@\n]+
-CODE_CONTENT=([^$/@\n]+ )
+CODE_CONTENT=[^\n]+
 NEWLINE= \n | \r | \r\n

 %{
@@ -69,9 +69,10 @@ NEWLINE= \n | \r | \r\n

 %%
 <YYINITIAL> {
-"@" { yybegin(AGENT_BLOCK); return AGENT_START; }
-"/" { yybegin(COMMAND_BLOCK); return COMMAND_START; }
-"$" { yybegin(VARIABLE_BLOCK); return VARIABLE_START; }
+"@" { if(!isCodeStart) { yybegin(AGENT_BLOCK); return AGENT_START; } else { yypushback(1); yybegin(CODE_BLOCK); }}
+"/" { if(!isCodeStart) { yybegin(COMMAND_BLOCK); return COMMAND_START; } else { yypushback(1); yybegin(CODE_BLOCK); }}
+"$" { if(!isCodeStart) { yybegin(VARIABLE_BLOCK); return VARIABLE_START; } else { yypushback(1); yybegin(CODE_BLOCK); }}
+
 "```" {IDENTIFIER}? { yybegin(LANG_ID); if (isCodeStart == true) { isCodeStart = false; return CODE_BLOCK_END; } else { isCodeStart = true; }; yypushback(yylength()); }

 {TEXT_SEGMENT} { if(isCodeStart) { return codeContent(); } else { return TEXT_SEGMENT; } }
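
The effect of widening `CODE_CONTENT` can be checked outside the lexer. The following is a minimal sketch, not part of the commit: it copies the two character classes from the diff into `java.util.regex` patterns and assumes the space inside the old parenthesized macro is insignificant JFlex whitespace. The class and method names (`CodeContentPatternSketch`, `matchesWholeLine`) are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; the real patterns live in DevInLexer.flex.
class CodeContentPatternSketch {

    // Old macro body: any run of characters except '$', '/', '@' and newline.
    static final Pattern OLD_CODE_CONTENT = Pattern.compile("[^$/@\\n]+");

    // New macro body: any run of characters except newline.
    static final Pattern NEW_CODE_CONTENT = Pattern.compile("[^\\n]+");

    // True when the pattern can consume the entire line as one token.
    static boolean matchesWholeLine(Pattern pattern, String line) {
        return pattern.matcher(line).matches();
    }

    public static void main(String[] args) {
        String annotation = "@Target({ElementType.TYPE})";
        System.out.println("old: " + matchesWholeLine(OLD_CODE_CONTENT, annotation));
        System.out.println("new: " + matchesWholeLine(NEW_CODE_CONTENT, annotation));
    }
}
```

Under these assumptions the old pattern cannot cover a line that contains `@`, while the new one can, which is what lets the `CODE_BLOCK` state take an annotation line as a single `CODE_CONTENT` token.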

exts/devin-lang/src/test/kotlin/cc/unitmesh/language/DevInParsingTest.kt

Lines changed: 4 additions & 0 deletions
@@ -19,4 +19,8 @@ class DevInParsingTest : ParsingTestCase("parser", "devin", DevInParserDefinitio
     fun testEmptyCodeFence() {
         doTest(true)
     }
+
+    fun testJavaAnnotation() {
+        doTest(true)
+    }
 }

exts/devin-lang/src/test/testData/parser/EmptyCodeFence.txt

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ DevInFile
 CodeBlockElement(CODE)
 PsiElement(DevInTokenType.CODE_BLOCK_START)('```')
 PsiElement(DevInTokenType.NEWLINE)('\n')
-DevInCodeContentsImpl(CODE_CONTENTS)
+ASTWrapperPsiElement(CODE_CONTENTS)
 PsiElement(DevInTokenType.CODE_CONTENT)('print("Hello, world!")')
 PsiElement(DevInTokenType.NEWLINE)('\n')
 PsiElement(DevInTokenType.CODE_BLOCK_END)('```')
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+```java
+@Target({ElementType.TYPE})
+@Retention(RetentionPolicy.RUNTIME)
+public @interface ExampleAnnotation {
+String value() default "";
+}
+```
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+DevInFile
+CodeBlockElement(CODE)
+PsiElement(DevInTokenType.CODE_BLOCK_START)('```')
+PsiElement(DevInTokenType.LANGUAGE_ID)('java')
+PsiElement(DevInTokenType.NEWLINE)('\n')
+ASTWrapperPsiElement(CODE_CONTENTS)
+PsiElement(DevInTokenType.CODE_CONTENT)('@Target({ElementType.TYPE})')
+PsiElement(DevInTokenType.NEWLINE)('\n')
+PsiElement(DevInTokenType.CODE_CONTENT)('@Retention(RetentionPolicy.RUNTIME)')
+PsiElement(DevInTokenType.NEWLINE)('\n')
+PsiElement(DevInTokenType.CODE_CONTENT)('public ')
+PsiElement(DevInTokenType.CODE_CONTENT)('@interface ExampleAnnotation {')
+PsiElement(DevInTokenType.NEWLINE)('\n')
+PsiElement(DevInTokenType.CODE_CONTENT)(' String value() default "";')
+PsiElement(DevInTokenType.NEWLINE)('\n')
+PsiElement(DevInTokenType.CODE_CONTENT)('}')
+PsiElement(DevInTokenType.NEWLINE)('\n')
+PsiElement(DevInTokenType.CODE_BLOCK_END)('```')
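
The expected tree above splits `public @interface ExampleAnnotation {` into two `CODE_CONTENT` tokens. One plausible reading, sketched below, is that `TEXT_SEGMENT` first consumes everything up to the marker character, after which the `@` rule pushes the character back and `CODE_CONTENT` takes the rest of the line. This is a simplified hand-written model, not the generated lexer: the class `FenceAwareLexerSketch`, the method `tokenizeLine`, and the `inCodeFence` parameter (standing in for `isCodeStart`) are hypothetical, and the rules of the `CODE_BLOCK` state, which the diff does not show, are assumed to return a single `CODE_CONTENT` token per line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the YYINITIAL rules in DevInLexer.flex.
class FenceAwareLexerSketch {

    // Mirrors TEXT_SEGMENT=[^$/@\n]+ : these characters end a plain text run.
    private static final String MARKERS = "@/$";

    // Tokenizes one line (without its trailing newline).
    // inCodeFence plays the role of the lexer's isCodeStart flag.
    static List<String> tokenizeLine(String line, boolean inCodeFence) {
        List<String> tokens = new ArrayList<>();

        // Step 1: consume the text run before the first marker character.
        int i = 0;
        while (i < line.length() && MARKERS.indexOf(line.charAt(i)) < 0) {
            i++;
        }
        if (i > 0) {
            String kind = inCodeFence ? "CODE_CONTENT" : "TEXT_SEGMENT";
            tokens.add(kind + "('" + line.substring(0, i) + "')");
        }
        if (i == line.length()) {
            return tokens; // no marker on this line
        }

        if (inCodeFence) {
            // Step 2a: inside a fence the marker is pushed back and the
            // rest of the line is taken as code (assumed CODE_BLOCK behavior).
            tokens.add("CODE_CONTENT('" + line.substring(i) + "')");
        } else {
            // Step 2b: outside a fence the marker starts a block; the block
            // states themselves are not modeled in this sketch.
            char marker = line.charAt(i);
            tokens.add(marker == '@' ? "AGENT_START"
                    : marker == '/' ? "COMMAND_START"
                    : "VARIABLE_START");
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenizeLine("public @interface ExampleAnnotation {", true));
        System.out.println(tokenizeLine("@Target({ElementType.TYPE})", true));
        System.out.println(tokenizeLine("@Target", false));
    }
}
```

For the line `public @interface ExampleAnnotation {` with `inCodeFence` set to true, the sketch yields `CODE_CONTENT('public ')` followed by `CODE_CONTENT('@interface ExampleAnnotation {')`, matching the split in the expected tree. A line that starts with `@` produces a single token because the text run before the marker is empty.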

exts/devin-lang/src/test/testData/parser/JavaHelloWorld.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ DevInFile
 PsiElement(DevInTokenType.CODE_BLOCK_START)('```')
 PsiElement(DevInTokenType.LANGUAGE_ID)('java')
 PsiElement(DevInTokenType.NEWLINE)('\n')
-DevInCodeContentsImpl(CODE_CONTENTS)
+ASTWrapperPsiElement(CODE_CONTENTS)
 PsiElement(DevInTokenType.CODE_CONTENT)('public class Main {')
 PsiElement(DevInTokenType.NEWLINE)('\n')
 PsiElement(DevInTokenType.CODE_CONTENT)(' public static void main(String[] args) {')
