preprocessor_proposal

Matěj Štágl edited this page Apr 22, 2022 · 10 revisions

Preprocessor proposal

Symbols

Symbols are essentially global variables. The same naming rules apply to symbols as to global variables. A value is either a bool, a string or a double.

Defining symbols

  • #define - defines a symbol
  • #undef - undefines a symbol

The basic way to define a symbol is #define, followed by one or more whitespace characters, then a symbol name consisting of a single literal, then one or more whitespace characters, optionally an =, one or more whitespace characters, and finally a value. A value has to be a single literal: true, false, a double such as 1 or 3.14, or a string enclosed in either single or double quotes. The opening and closing quotes have to be of the same type: 'my string' and "my string" are valid, whereas a string like 'my string" is invalid and results in a preprocessor error.

#define MY_SYMBOL = 5
#define MY_SYMBOL 5

When a symbol is defined and the right-hand side of the definition is omitted, it is treated as if the right-hand side were = true.

#define MY_SYMBOL

Multiple definitions can be written on one line, separated by ,. This works with both #define and #undef.

#define MY_SYMBOL = false, LOG_LEVEL = "warning", MY_VERSION = '4.0.1'
#undef MY_SYMBOL, LOG_LEVEL
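The definition rules above can be sketched as a small parser. This is only an illustration in Python, not the WattleScript implementation: the names tokenize, to_value and parse_definitions are hypothetical, the function receives the text that follows #define, and it assumes that whitespace around = is optional, that strings have no escape sequences, and that an = without a value is an error.

```python
import re

# Hypothetical sketch: one token is a quoted string, a number, a word, or '=' / ','.
TOKEN = re.compile(r"""
    \s*(?:
        (?P<string>'[^']*'|"[^"]*")
      | (?P<number>-?\d+(?:\.\d+)?)
      | (?P<word>[A-Za-z_]\w*)
      | (?P<punct>[=,])
    )
""", re.VERBOSE)


def tokenize(text):
    """Split the text after '#define' into (kind, text) tuples."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            # Covers mismatched quotes such as 'my string"
            raise SyntaxError(f"unexpected input at column {pos + 1}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def to_value(kind, text):
    """Convert a literal token into a bool, float or str."""
    if kind == "string":
        return text[1:-1]
    if kind == "number":
        return float(text)
    if text in ("true", "false"):
        return text == "true"
    raise SyntaxError("a value must be a single literal")


def parse_definitions(text):
    """Parse 'NAME [=] [VALUE], NAME ...' into a {name: value} dict."""
    tokens, result, i = tokenize(text), {}, 0
    while True:
        if i >= len(tokens) or tokens[i][0] != "word" or tokens[i][1] in ("true", "false"):
            raise SyntaxError("expected a symbol name")
        name = tokens[i][1]
        i += 1
        saw_equals = i < len(tokens) and tokens[i] == ("punct", "=")
        if saw_equals:
            i += 1
        if i < len(tokens) and tokens[i][0] != "punct":
            result[name] = to_value(*tokens[i])
            i += 1
        elif saw_equals:
            raise SyntaxError("expected a value after '='")  # assumption
        else:
            result[name] = True  # omitted right-hand side means '= true'
        if i == len(tokens):
            return result
        if tokens[i] != ("punct", ","):
            raise SyntaxError("expected ',' between definitions")
        i += 1
```

Under these assumptions, parse_definitions("MY_SYMBOL 5") and parse_definitions("MY_SYMBOL = 5") produce the same table, and a bare parse_definitions("MY_SYMBOL") maps the symbol to True.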

Comments

All # directives can contain single-line comments starting with //, which are ignored by the preprocessor interpreter. When a // sequence is encountered, the rest of the line is ignored. The only exception is while a string literal is being parsed: #define MY_STRING = '//' is a valid usage of #define.

#define LOG_LEVEL = "warning" // this setting means something and has some known values blah blah blah... I can use #whatever here and it is ignored

Multiline comments are not supported.
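A minimal sketch of the comment rule, in Python for illustration only. The function name strip_comment is hypothetical, and the sketch assumes string literals have no escape sequences, since the proposal does not mention any.

```python
def strip_comment(line):
    """Remove a trailing // comment unless the // sits inside a string literal."""
    quote = None  # the quote character of the string we are inside, if any
    for index, char in enumerate(line):
        if quote is not None:
            if char == quote:
                quote = None  # closing quote of the same type
        elif char in ("'", '"'):
            quote = char
        elif line.startswith("//", index):
            return line[:index].rstrip()
    return line.rstrip()
```

For example, strip_comment("#define MY_STRING = '//'") leaves the line unchanged, while a // that follows the closing quote cuts the line at that point.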

Conditional compilation

  • #if
  • #elseif, #else if, #elif (all of these have the same meaning)
  • #else
  • #endif

A block (zero or more lines of code) between #if and #endif is compiled only if the #if expression is truthy. The same rules apply as with a normal if statement. A value is truthy if it is true, a double greater than 0, or a string with length greater than 0.

#if "YES" // any string other than "" or '' is truthy

#endif
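The truthiness rule can be written down directly. The following Python sketch uses a hypothetical is_truthy helper and assumes preprocessor values are represented as Python bool, float (or int) and str.

```python
def is_truthy(value):
    """Apply the proposal's rule: true, a double > 0, or a non-empty string."""
    if isinstance(value, bool):  # checked first, because bool is a subclass of int
        return value
    if isinstance(value, (int, float)):
        return value > 0
    if isinstance(value, str):
        return len(value) > 0
    raise TypeError("symbol values are bool, string or double")
```

Note that under this rule 0 and negative doubles are both falsy, which differs from C, where any non-zero value is truthy.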

The #if expression supports the comparison operators >, >=, <, <=, the equality and inequality operators == and !=, and the logical operators !, || and &&.

#if 2 + 2 > 5 // this is illegal, will throw. We don't support arithmetical operators such as + in the preprocessor

#endif

Parentheses are supported in expressions.

#if DEBUG == (((true))) // this is a legit rhs

Parentheses around the expression in #if, #elif, #elseif and #else if can be omitted but are supported.

// these two are the same
#if (DEBUG == true)
#if DEBUG

Multiple branches can be added to the #if directive via #elif, #elseif, #else if and #else. The alias #elif is introduced because of its use in C# and C. An #if directive has to end with an #endif directive. If no such directive is found, the preprocessor throws.
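Branch selection for a chain of conditional directives can be sketched as follows. This is an illustrative Python outline, not the actual preprocessor: select_lines is a hypothetical name, expression evaluation is delegated to a caller-supplied evaluate callable, comments are assumed to have been stripped already, and nesting of #if blocks is assumed to be allowed.

```python
import re

# Matches #if, #elseif, #else if, #elif, #else and #endif at the start of a line.
_BRANCH = re.compile(r"\s*#(if|elseif|else\s+if|elif|else|endif)\b(.*)")


def select_lines(lines, evaluate):
    """Return the source lines that survive conditional compilation."""
    output = []
    stack = []  # one frame per open #if: [parent_active, branch_taken, active]
    for number, line in enumerate(lines, start=1):
        active = stack[-1][2] if stack else True
        match = _BRANCH.match(line)
        if not match:
            if active:
                output.append(line)
            continue
        keyword = " ".join(match.group(1).split())  # normalise '#else   if'
        expression = match.group(2).strip()
        if keyword == "if":
            taken = bool(active and evaluate(expression))
            stack.append([active, taken, taken])
        elif not stack:
            raise SyntaxError(f"#{keyword} without #if at line {number}")
        elif keyword == "endif":
            stack.pop()
        else:
            frame = stack[-1]
            candidate = frame[0] and not frame[1]  # no earlier branch was taken
            if keyword != "else":
                # #elseif, #else if and #elif all mean the same thing
                candidate = candidate and evaluate(expression)
            frame[1] = frame[1] or bool(candidate)
            frame[2] = bool(candidate)
    if stack:
        raise SyntaxError("#if without a matching #endif")
    return output
```

Only the first branch whose condition holds is kept, and a missing #endif raises an error, mirroring the rule stated above.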

Errors and warnings

  • #error
  • #warning

The #error directive throws an error. An optional message can be included in #error as a single string literal, enclosed in single or double quotes.

#if LANGVER < 2.5
    #error "This snippet requires WattleScript at least 2.5"
#endif

TBD: do we allow omitting "", '' around string literals here?

If no right-hand side is provided, the interpreter throws a default message that includes the line on which the error was raised.

#warning works similarly, except that a warning is produced instead of an error being thrown. A mechanism for warnings is not yet well specified in WattleScript, but when ScriptOptions.ParserErrorMode is set to ParserErrorModes.Report, messages of type warning can be produced.
It is yet to be decided how to handle this when parsing is aborted after the first error.

Parser directives

  • #line

#line directive is used by code generators for WattleScript to translate line numbers and enable seamless debugging. There are two ways to write a #line directive:

1. #line LINE_NUMBER "FILENAME"?

LINE_NUMBER has to be an integer greater than 0 or a special literal. Known special literals are:

  • default - line numbering mode is set to native

Any other literal is invalid and throws.

After LINE_NUMBER a FILENAME string enclosed in double or single quotes can be included. This is purely for debugging purposes.

1. #line 200
2. var a = = 
3. local b = =
4. #line default
5. var c = =

// Unexpected symbol at line 200 after "var a ="
// Unexpected symbol at line 201 after "local b ="
// Unexpected symbol at line 5 after "var c ="
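The line translation shown in this example can be sketched in Python. The helper reported_line_numbers is hypothetical and covers only the first form of the directive (#line LINE_NUMBER "FILENAME"? and #line default); the filename is matched but ignored, and any line that does not match this form is treated as ordinary source.

```python
import re

# First form only: '#line N' or '#line default', with an optional quoted filename.
_LINE = re.compile(r"""\s*#line\s+(default|\d+)(?:\s+(?:"[^"]*"|'[^']*'))?\s*$""")


def reported_line_numbers(lines):
    """Map each physical source line (1-based) to the line used in diagnostics."""
    mapping = {}
    next_reported = None  # None means native numbering
    for physical, text in enumerate(lines, start=1):
        match = _LINE.match(text)
        if match:
            if match.group(1) == "default":
                next_reported = None
            else:
                line_number = int(match.group(1))
                if line_number <= 0:
                    raise ValueError("LINE_NUMBER has to be greater than 0")
                next_reported = line_number
            continue  # the directive itself gets no reported line
        if next_reported is None:
            mapping[physical] = physical
        else:
            mapping[physical] = next_reported
            next_reported += 1
    return mapping
```

Applied to the five-line example above, physical lines 2, 3 and 5 are reported as lines 200, 201 and 5.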

2. #line (START_LINE, START_COL, END_LINE, END_COL, COL_OFFSET) "FILENAME"?

  • START_LINE, START_COL - the starting line and the first character index on that line (zero based) that follows the directive
  • END_LINE, END_COL - the end line and character index of the marked region
  • COL_OFFSET - the column offset for the #line directive to take place. The next character after COL_OFFSET characters is treated as column 1.

1. #line (5, 1, 10, 1, 20)
2. /*20 chars comment*/var a = =

// "var" in this example starts at line 5, character 1

Regions

  • #region
  • #endregion

#region directives are intended for LSP implementations, which can expose them as "folding ranges". The preprocessor ignores them apart from checking that each #region is closed by a matching #endregion.
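The balance check can be sketched with a stack. This Python outline uses a hypothetical folding_ranges helper and assumes regions may be nested, which the proposal does not state explicitly; as a side effect it returns the (start, end) line pairs an LSP server could report.

```python
def folding_ranges(lines):
    """Check #region/#endregion balance and return (start_line, end_line) pairs."""
    open_regions = []  # physical lines of #region directives still open
    ranges = []
    for number, text in enumerate(lines, start=1):
        words = text.split(None, 1)
        directive = words[0] if words else ""
        if directive == "#region":
            open_regions.append(number)
        elif directive == "#endregion":
            if not open_regions:
                raise SyntaxError(f"#endregion without #region at line {number}")
            ranges.append((open_regions.pop(), number))
    if open_regions:
        raise SyntaxError(f"#region at line {open_regions[-1]} is never closed")
    return ranges
```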

Pragmas

  • #pragma

Pragmas serve as a way to set a subset of ScriptOptions from the script itself. This area is yet to be specced but one possible usage would be for example to force table indexing from 0 or 1. In all cases the system running the script should have the last word about whether we honor or ignore the pragma. This could be done with a hook mechanism from which the executing application returns true/false. The syntax of pragmas is:

#pragma name(arg1, arg2)

Only the name part is mandatory.

#pragma name() // no arguments are provided
#pragma name // parentheses after the name can be omitted when passing no arguments

All arguments have to be simple literals: true, false, a double such as -1 or 3.14, or a string enclosed in single or double quotes such as 'yes' or "no". Pragmas unknown to the interpreter are still validated via the hook mechanism, so the executing application can use them for side effects. If the executing application provides no pragma validation implementation, all pragmas are treated as allowed by default.

Pragma handlers are set from ScriptOptions, and both sync and async versions are supported. If both are supplied, the sync handler is ignored and the async handler is used to resolve pragmas.

ScriptOptions.PragmaHandler // bool f(string name, params object[] args)
ScriptOptions.PragmaHandlerAsync // async Task<bool> f(string name, params object[] args)
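The handler precedence described above can be illustrated with a short Python sketch. It is not the C# API: resolve_pragma and its handler and handler_async parameters are hypothetical stand-ins for ScriptOptions.PragmaHandler and ScriptOptions.PragmaHandlerAsync.

```python
import asyncio


async def resolve_pragma(name, args, handler=None, handler_async=None):
    """Decide whether a pragma is honored, following the proposed precedence."""
    if handler_async is not None:
        # The async handler wins when both handlers are supplied.
        return bool(await handler_async(name, *args))
    if handler is not None:
        return bool(handler(name, *args))
    # No validation provided by the host: all pragmas are allowed by default.
    return True
```

A host that supplies no handler gets True for every pragma, while a host that supplies both has only the async handler consulted; the call can be driven with asyncio.run(resolve_pragma(name, args)).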

Exit

  • #exit

This directive can be used to immediately abort the precompilation and subsequently the compilation as well. Optionally, a message can be included, enclosed in single or double quotes.

#if LANGVER < 5
    #exit "To run this script WattleScript 5 or newer is required"
#endif

Predefined symbols

  • LANGVER - a double corresponding to the WattleScript NuGet version installed on the executing system. Example values: 1.2 or 5.
  • DEBUG - a bool (true/false) indicating whether the script is to be compiled in debug or release mode.

Redefining these symbols is allowed. #define LANGVER 10 will force the symbol to hold the value 10.
