Performance, Caching and Concurrency
THIS DOCUMENT IS UNDER DEVELOPMENT AND HAS NOT BEEN RELEASED YET
The runtime performance of the C# scripting solution is identical to that of a compiled application. The impressive execution speed is attributed to the fact that at runtime the script is just an ordinary CLR assembly, which happens to be compiled on the fly.
However, the compilation itself is a potential performance hit. Compilation is a heavy task, and initialization of the compiler engine can bring an enormous startup overhead (e.g. Roslyn). And this is when caching comes to the rescue.
With caching, a given script file is never recompiled unless it has changed since the last execution. This concept is very similar to the Python caching model.
The sophisticated CS-Script caching algorithm even partially supports execution of file-less scripts (aka eval). Thus in the code below two delegates are created but the code is actually compiled only once (something that is possible with neither Mono nor Roslyn).
var code = @"int Sqr(int a) { return a * a; }";
var sqr1 = CSScript.CodeDomEvaluator.CreateDelegate(code);
var sqr2 = CSScript.CodeDomEvaluator.CreateDelegate(code); // served from cache; no recompilation
Note: caching is only compatible with the CodeDOM compiler engine (CSScript.Load*, CSScript.Compile* and CSScript.CodeDomEvaluator.* calls). Thus caching is effectively disabled for both CSScript.MonoEvaluator.* and CSScript.RoslynEvaluator.*.
But of course the major benefit of caching comes for script file execution.
Caching can be disabled:
- With the -c:0 switch from the command line
- With CSScript.CacheEnabled = false from code
However, sometimes instead of disabling caching completely it may be more beneficial to exercise fine-grained caching control. These points will help you understand some caching internals:
-
When a script file is compiled, CS-Script always creates an assembly file, which is placed in a dedicated directory (the cache) for further loading/execution. This is how CodeDOM compilation works. The benefit of a file-based assembly is that it can be cached and debugged, something that neither the Mono nor the Roslyn compiler-as-service solution can do.
-
The compiled assembly file is always created, regardless of whether caching is enabled or not. Enabling caching simply allows reusing the previous compilation result (the assembly) if it is still up to date.
-
The location of the compiled script is deterministic and can be discovered by right-clicking the script file in Explorer and selecting the corresponding option from the context menu. Alternatively, the cached script location can be deduced from the script file location:
CSScript.GetCachedScriptPath("script full path");
-
The cache directories also contain some extra temporary files that are needed for injecting script-specific metadata into the script assembly. This metadata allows the script to reflect on itself: Script Reflection.
-
The cache directory doesn't grow endlessly; it is of a fixed size. Any temp files that are no longer needed are removed on script host exit. Cached compiled scripts whose source scripts no longer exist can be purged by executing the
cscs cache -trim
command. -
Caching is not an obstruction but a help.
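The "up to date" check described above boils down to comparing timestamps. Below is a minimal self-contained sketch of that idea; it is an illustration only, not CS-Script's actual code, and the file names and cache layout are made up (the real engine also validates script dependencies the same way):

```csharp
using System;
using System.IO;

class CacheCheckDemo
{
    // A cached assembly is usable only if it exists and is not older than the script.
    static bool IsCacheUpToDate(string scriptFile, string cachedAssembly)
        => File.Exists(cachedAssembly)
           && File.GetLastWriteTimeUtc(cachedAssembly) >= File.GetLastWriteTimeUtc(scriptFile);

    static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string script = Path.Combine(dir, "script.cs");
        string asm = Path.Combine(dir, "script.cs.compiled"); // hypothetical cache file

        File.WriteAllText(script, "// v1");
        Console.WriteLine(IsCacheUpToDate(script, asm)); // False: never compiled

        File.WriteAllText(asm, "fake assembly");         // simulate a compilation
        Console.WriteLine(IsCacheUpToDate(script, asm)); // True: cache is fresh

        File.SetLastWriteTimeUtc(script, DateTime.UtcNow.AddMinutes(1)); // simulate an edit
        Console.WriteLine(IsCacheUpToDate(script, asm)); // False: script changed
    }
}
```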
There have been reports about cached files being locked by the executing process, leading to the compiler error "Access to ...cs.compiled file is denied". This sort of problem is always caused by another process changing the script file and trying to compile it while the script assembly is still loaded for execution. This scenario is rare but not entirely unusual. It's important to understand that it is a logical problem, not a technical one. And while disabling caching will prevent locking, it is a very heavy price to pay and it doesn't address the problem directly.
The more practical and very reliable approach is to keep caching enabled but allow loading the assembly as an in-memory image, leaving the compiled file completely unlocked (details are in the next section).
Any script execution may be subject to some sort of synchronization in concurrent execution scenarios.
Note: synchronization (concurrency control) may only be required for execution of a given script by two or more competing processes. If one process executes script_a.cs and another one executes script_b.cs then there is no need for any synchronization as the script files are different and their executions do not collide with each other.
Hosted script code execution. In the case of hosted execution it is very often script code (an in-memory string) that is executed, so there is no competition for the same resources (e.g. a script file) between concurrent executions. Thus there is no need for any concurrency control.
Hosted script file execution. In this case there is a common resource (the script file). Thus the script engine needs to synchronize access to this resource with other concurrent executions, if any.
Standalone script file execution. In this case the execution is also based on a shared resource (the script file) and concurrency control is applicable.
This document describes the concurrency model implemented by the CS-Script engine.
The most critical stage of script execution is compilation, and it typically needs to be atomic and synchronized system-wide. During this stage the script engine compiles the script into an assembly, and any attempt to compile the same assembly at the same time will lead to file locking and eventually an error from the underlying compiler engine.
In order to avoid this, CS-Script uses global synchronization objects, which are used by the competing engine instances to detect when it is OK to do the compilation. Simply put, the script engine says "I am busy compiling this script. If you want to compile it too, wait until I am done." Using caching (described later) dramatically decreases the possibility of an access collision.
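The "wait until I am done" behaviour can be sketched with a named (system-wide) Mutex. This is an illustration of the idea only, not the engine's actual implementation, and the mutex naming scheme below is made up:

```csharp
using System;
using System.Threading;

class CompileLockDemo
{
    // Serialize compilation of a given script across processes by taking a
    // system-wide Mutex keyed by the script path (hypothetical naming scheme).
    static void CompileExclusively(string scriptPath, Action compile)
    {
        string mutexName = "cs-script.compile." + scriptPath.GetHashCode();
        using (var mutex = new Mutex(false, mutexName))
        {
            mutex.WaitOne(); // blocks while another process compiles the same script
            try { compile(); }
            finally { mutex.ReleaseMutex(); }
        }
    }

    static void Main()
    {
        CompileExclusively(@"C:\scripts\script.cs",
                           () => Console.WriteLine("compiling"));
    }
}
```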
The concurrency model is controlled by the Settings.ConcurrencyControl configuration object, which is set to Standard by default.
-
Standard: Simple model. The script engine doesn't start the timestamp validation and the script compilation until another engine instance finishes its validation. Note: the compilation may be skipped if caching is enabled and the validation reveals that the previous compilation (cache) is still up to date. Due to the limited choice of system-wide named synchronization objects on Linux, Standard is the only available synchronization model on Linux, though it just happens to be a good default choice for Windows as well. -
HighResolution: A legacy synchronization model available on Windows only. While it can be beneficial in intense concurrent "borderline" scenarios, its practical value is very limited.
-
None: No concurrency control is done by the script engine. All synchronization is the responsibility of the hosting environment.
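Selecting a model from code might look like the sketch below. It assumes the Settings object is exposed via CSScript.GlobalSettings, in line with the other hosting examples in this document:

```csharp
// Delegate all synchronization to the host application:
CSScript.GlobalSettings.ConcurrencyControl = ConcurrencyControl.None;

// ...or restore the default system-wide synchronization:
CSScript.GlobalSettings.ConcurrencyControl = ConcurrencyControl.Standard;
```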
InMemoryAssembly loading can be enabled:
- With the -inmem command line switch
- With CSScript.GlobalSettings.InMemoryAssembly = true from code
- With the InMemoryAssembly setting in ConfigConsole
InMemoryAssembly is an enormously convenient mode that solves many concurrency problems in a very elegant way. The reason InMemoryAssembly is set to false by default is rather historical. For years it has been the default loading model of .NET. But in the releases after v3.16 it will be set to true, in recognition of the new trend started by Roslyn, which unconditionally loads assemblies as in-memory images.
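For example, to combine caching with unlocked cache files in a hosted scenario (a sketch based on the settings above; the exact hosting call used to load the script file may differ in your CS-Script version):

```csharp
// Keep caching enabled but load the compiled script as an in-memory image,
// so the cached .compiled file on disk is never locked by the CLR.
CSScript.GlobalSettings.InMemoryAssembly = true;

var script = CSScript.Load("script.cs"); // the cached assembly file stays unlocked
```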
Strictly speaking, synchronization of the compilation isn't needed if the script is compiled into a randomly named/placed assembly, which is executed afterwards. But if efficient execution is a must, then optimization of the compilation is required (e.g. caching). And this in turn requires synchronization (see the caching discussion earlier in this document).
The next stage is the actual execution. During this stage the compiled script is loaded into the AppDomain and its entry points are exercised. This stage is not subject to any synchronization. During this stage the compiled script file is locked by the CLR, and there is very little sense in waiting for the file to become unlocked, as the script business logic can require the script to stay loaded indefinitely. Thus if the script engine tries to compile the script into an assembly which is already loaded (being executed), it doesn't wait and lets the compilation engine return the error (e.g. "Access to ...cs.compiled file is denied.").
While this is a relatively rare situation, it can be completely eliminated by allowing loading the assembly as an in-memory image.
Script execution consists of multiple stages, some of which need to be atomic and synchronized system-wide:
- Validation. First, the script is assessed for having a valid, already compiled, up-to-date assembly. Validation is done by checking whether the compiled assembly is available at all and then comparing the timestamps of the assembly and the script file. After these checks all script dependencies (imports and referenced assemblies) are also validated; dependency validation is also timestamp based. For the script the dependencies are identified by parsing the script, and for the assembly by extracting the dependency metadata injected into the assembly during the last compilation by the script engine. The whole validation stage is atomic and synchronized system-wide via a SystemWideLock (a decorated Mutex-like synchronization object: a Mutex on Windows and a file lock on Linux). This stage is very fast as there is no heavy lifting to be done, just comparing timestamps, so the wait timeout is infinite as there is very little chance for this stage to hang.
- Compilation. Next, if the assembly is valid (the script hasn't changed since the last compilation) it is loaded for further execution without recompilation; otherwise it is compiled again. The compilation stage is also atomic and synchronized system-wide via another SystemWideLock, so concurrent compilations (if they happen) do not try to build the assembly in the same location with the same name (e.g. the cache). This stage is potentially heavy: some compilers, while relatively fast, may introduce significant startup overhead (like Roslyn), which is why caching is the preferred execution approach. The wait timeout is fixed, as there is a chance that the third-party compiler can hang.
- Execution. Finally, the script assembly is loaded and executed. This stage can be extremely heavy as the execution may take an infinite amount of time depending on the business logic. When the assembly file is loaded it is locked by the CLR and no one can delete or recreate it, meaning that the assembly cannot be recompiled until the execution is completed and the CLR releases the file. System-wide synchronization doesn't make much sense in this case, as open-ended waiting is not practical; it's more practical to let the compiler throw an informative locking (access denied) exception.