Output encoding #97

Merged
merged 5 commits into from
Sep 18, 2023
nblumhardt
Member

For ExpressionTemplate to be useful in scenarios like HTML email, webhook URL, or URL-encoded POST body construction, a safer mechanism is needed for output encoding.

For example, if we rewrote Serilog.Sinks.Email to use ExpressionTemplate, a message body might look like:

<p class="error">Exception: {@x}</p>

Since the email is being fed exceptions from a running application, a malicious user might cause an error to be generated with HTML in the message:

System.InvalidOperationException: <a href="my-bad-site">Click to see more info</a><br><br><br>is not a valid username.
    at ...

Today, to defend against this, an htmlencode user-defined function might be used:

<p class="error">Exception: {htmlencode(@x)}</p>

But we all know how easily opt-in security measures can be overlooked.

This PR proposes to introduce a new type, TemplateOutputEncoder, that users (i.e. the Serilog.Sinks.Email assembly) can implement in order to automatically escape all output that's substituted into template holes. For example:

class TemplateOutputHtmlEncoder: TemplateOutputEncoder
{
    /// <summary>
    /// Replaces <c>&amp;</c>, <c>&lt;</c>, <c>&gt;</c>, <c>&quot;</c>, and
    /// <c>&apos;</c> with their equivalent escape sequences. This renders the result safe for
    /// insertion into HTML attributes and element bodies apart from <c>script</c> and <c>style</c>.
    /// </summary>
    /// <param name="value">The string to encode.</param>
    /// <returns>The encoded string.</returns>
    public override string Encode(string value)
    {
        return System.Text.Encodings.Web.HtmlEncoder.Default.Encode(value);
    }
}

The encoder is provided when parsing/compiling the template:

var template = new ExpressionTemplate(
    "<p class=\"error\">Exception: {@x}</p>",
    encoder: new TemplateOutputHtmlEncoder());
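
To make the effect concrete, here is a minimal sketch of formatting a single event through an encoded template. It assumes the Serilog and Serilog.Expressions packages are referenced and that TemplateOutputEncoder lives in the Serilog.Templates.Encoding namespace. The event contents and the Program class are illustrative only.

```csharp
using System;
using System.IO;
using System.Linq;
using Serilog.Events;
using Serilog.Parsing;
using Serilog.Templates;
using Serilog.Templates.Encoding; // assumed namespace for TemplateOutputEncoder

// Same encoder as shown in the description.
class TemplateOutputHtmlEncoder : TemplateOutputEncoder
{
    public override string Encode(string value) =>
        System.Text.Encodings.Web.HtmlEncoder.Default.Encode(value);
}

static class Program
{
    static void Main()
    {
        // Quotes inside the C# string literal are escaped with backslashes.
        var template = new ExpressionTemplate(
            "<p class=\"error\">Exception: {@x}</p>",
            encoder: new TemplateOutputHtmlEncoder());

        // Hypothetical exception carrying attacker-controlled HTML in its message.
        var exception = new InvalidOperationException(
            "<a href=\"my-bad-site\">Click to see more info</a> is not a valid username.");

        var logEvent = new LogEvent(
            DateTimeOffset.Now,
            LogEventLevel.Error,
            exception,
            new MessageTemplateParser().Parse("Login failed"),
            Enumerable.Empty<LogEventProperty>());

        // ExpressionTemplate is an ITextFormatter, so it writes to a TextWriter.
        var output = new StringWriter();
        template.Format(logEvent, output);

        // Literal template text is written as-is; the substituted {@x} value
        // should come out HTML-encoded, so the injected <a> tag is inert.
        Console.WriteLine(output.ToString());
    }
}
```

Only the value substituted into the hole passes through the encoder; the surrounding `<p class="error">` markup comes from the template author and is emitted unchanged.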

Opting out of encoding

The proposal introduces a new function in templates called unsafe, which can be used to opt out of escaping:

<p{unsafe(if @l = 'Error' then ' class="error"' else '')}>Exception: {@x}</p>

Caveats

Note that basic HTML escaping, as used in the example, cannot safely encode values that appear in style or script contexts. HTML is used here because it is a familiar use case; its encoding requirements are not covered in full.

Related work

The feature is based on the fork we use in Seq's webhook plug-in, which uses it for URI encoding within webhook URLs: https://github.com/datalust/seq-app-httprequest#configuration (see the URL row in the linked table).
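
As a rough illustration of the URI-encoding case, an encoder along the following lines could be paired with a URL template. This is a sketch, not the code used in the Seq plug-in: the TemplateOutputUriEncoder name, the example.com URL, and the query parameter names are hypothetical, and the Serilog.Templates.Encoding namespace is assumed.

```csharp
using System;
using Serilog.Templates;
using Serilog.Templates.Encoding; // assumed namespace for TemplateOutputEncoder

// Hypothetical encoder: percent-encodes every substituted value so that it
// can't introduce extra path segments or query parameters.
class TemplateOutputUriEncoder : TemplateOutputEncoder
{
    public override string Encode(string value) => Uri.EscapeDataString(value);
}

static class WebhookTemplates
{
    // Illustrative URL only; @l and @m are the event's level and rendered message.
    public static ExpressionTemplate CreateUrlTemplate() =>
        new ExpressionTemplate(
            "https://example.com/hooks?level={@l}&message={@m}",
            encoder: new TemplateOutputUriEncoder());
}
```

Uri.EscapeDataString is chosen here because it escapes reserved characters such as `&`, `=`, `/`, and `?`, which suits values placed into individual query parameters or path segments.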

@nblumhardt nblumhardt merged commit 938b02d into serilog:dev Sep 18, 2023
@nblumhardt nblumhardt mentioned this pull request Nov 9, 2023