Skip to content

[SUGGESTION][FIX] Add narrowing 'as' cast and static assert failure instead of nonesuch… #106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

filipsajdak
Copy link
Contributor

@filipsajdak filipsajdak commented Nov 1, 2022

The current implementation allows for writing below code

args : std::span<Arg> = (argv, argc as std::size_t);
c : std::ifstream = args.back(); // <= here it crashes

It fails as as<std::size_t>(argc) matches as(...) that returns nonesuch.

This change introduces as casts:

  1. as - static asserted casts,
  2. as std::optional<T> - optional cast,
  3. cpp2::unsafe_narrow<T>() - narrow cast,

If the safe cast is not possible, the as cast will fail to compile with information to use optional cast as std::optional<T>.

error: static_assert failed due to requirement 'program_violates_type_safety_guarantee<long, double>' "No safe 'as' cast available please use optional cast `as std::optional<T>`"

The specialization of the as() functions try to match arithmetic types (algorithm based on the gsl::narrow() function). It also removes the version of the as(...) function that silently returns nullptr - now it fails at compile time thanks to static_assert.

After this change, the below code does not compile all narrowing cases, and will trigger static assertion with the information to use two other possible casts:

main: () -> int = {
    max_uint8  := std::numeric_limits<std::uint8_t>::max();
    max_int8   := std::numeric_limits<std::int8_t>::max();
    max_double := std::numeric_limits<double>::max();
    max_float  := std::numeric_limits<float>::max();

    v1  := max_uint8  as std::int8_t;  // static_assert failure
    v2  := -1         as std::uint8_t; // static_assert failure
    v3  := -1.0       as std::uint8_t; // static_assert failure
    v4  := max_double as float;        // static_assert failure

    v5  := -1.0      as long;          // static_assert failure
    v6  := -1        as double;        // OK
    v7  := max_float as double;        // OK
    v8  := max_uint8 as std::uint16_t; // OK
    v9  := max_int8  as std::uint16_t; // static_assert failure
    v10 := 10        as std::size_t;   // OK
    v11 := "string"  as std::string;   // OK
    v12 := 'a'       as int;           // OK
    v13 := 36        as char;          // OK

    std::cout << " v6 = (v6)$" << std::endl;  //  v6 = -1.000000
    std::cout << " v7 = (v7)$" << std::endl;  //  v7 = 340282346638528859811704183484516925440.000000
    std::cout << " v8 = (v8)$" << std::endl;  //  v8 = 255
    std::cout << "v10 = (v10)$" << std::endl; // v10 = 10
    std::cout << "v11 = (v11)$" << std::endl; // v11 = string
    std::cout << "v12 = (v12)$" << std::endl; // v12 = 97
    std::cout << "v13 = (v13)$" << std::endl; // v13 = 36
}

Thanks to optional as as std::optional<T> we can rewrite it to:

main: () -> int = {
    max_uint8  := std::numeric_limits<std::uint8_t>::max();
    max_int8   := std::numeric_limits<std::int8_t>::max();
    max_double := std::numeric_limits<double>::max();
    max_float  := std::numeric_limits<float>::max();

    v1  := max_uint8  as std::optional<std::int8_t>;  // OK
    v2  := -1         as std::optional<std::uint8_t>; // OK
    v3  := -1.0       as std::optional<std::uint8_t>; // OK
    v4  := max_double as std::optional<float>;        // OK

    v5  := -1.0      as std::optional<long>;          // OK
    v6  := -1        as std::optional<double>;        // OK
    v7  := max_float as std::optional<double>;        // OK
    v8  := max_uint8 as std::optional<std::uint16_t>; // OK
    v9  := max_int8  as std::optional<std::uint16_t>; // OK
    v10 := 10        as std::optional<std::size_t>;   // OK
    v11 := "string"  as std::optional<std::string>;   // OK
    v12 := 'a'       as std::optional<int>;           // OK
    v13 := 36        as std::optional<char>;          // OK

    std::cout << " v1 = (v1)$" << std::endl;  //  v1 = (empty) 
    std::cout << " v2 = (v2)$" << std::endl;  //  v2 = (empty)
    std::cout << " v3 = (v3)$" << std::endl;  //  v3 = (empty)
    std::cout << " v4 = (v4)$" << std::endl;  //  v4 = (empty)
    std::cout << " v5 = (v5)$" << std::endl;  //  v5 = (empty)
    std::cout << " v6 = (v6)$" << std::endl;  //  v6 = -1.000000
    std::cout << " v7 = (v7)$" << std::endl;  //  v7 = 340282346638528859811704183484516925440.000000
    std::cout << " v8 = (v8)$" << std::endl;  //  v8 = 255
    std::cout << " v9 = (v9)$" << std::endl;  //  v9 = 127
    std::cout << "v10 = (v10)$" << std::endl; // v10 = 10 
    std::cout << "v11 = (v11)$" << std::endl; // v11 = string
    std::cout << "v12 = (v12)$" << std::endl; // v12 = 97 
    std::cout << "v13 = (v13)$" << std::endl; // v13 = 36 
}

Thanks to unsafe_narrow():

main: () -> int = {
    max_uint8  := std::numeric_limits<std::uint8_t>::max();
    max_int8   := std::numeric_limits<std::int8_t>::max();
    max_double := std::numeric_limits<double>::max();
    max_float  := std::numeric_limits<float>::max();

    v1  := cpp2::unsafe_narrow<std::int8_t>(max_uint8);   // OK
    v2  := cpp2::unsafe_narrow<std::uint8_t>(-1);         // OK
    v3  := cpp2::unsafe_narrow<std::uint8_t>(-1.0);       // OK
    v4  := cpp2::unsafe_narrow<float>(max_double);        // OK

    v5  := cpp2::unsafe_narrow<long>(-1.0);               // OK
    v6  := cpp2::unsafe_narrow<double>(-1);               // OK
    v7  := cpp2::unsafe_narrow<double>(max_float);        // OK
    v8  := cpp2::unsafe_narrow<std::uint16_t>(max_uint8); // OK
    v9  := cpp2::unsafe_narrow<std::uint16_t>(max_int8);  // OK
    v10 := cpp2::unsafe_narrow<std::size_t>(10);          // OK
    v11 := cpp2::unsafe_narrow<std::string>("string");    // OK
    v12 := cpp2::unsafe_narrow<int>('a');                 // OK
    v13 := cpp2::unsafe_narrow<char>(36);                 // OK

    std::cout << " v1 = (v1)$" << std::endl;  //  v1 = -1
    std::cout << " v2 = (v2)$" << std::endl;  //  v2 = 255
    std::cout << " v3 = (v3)$" << std::endl;  //  v3 = 0
    std::cout << " v4 = (v4)$" << std::endl;  //  v4 = inf
    std::cout << " v5 = (v5)$" << std::endl;  //  v5 = -1
    std::cout << " v6 = (v6)$" << std::endl;  //  v6 = -1.000000
    std::cout << " v7 = (v7)$" << std::endl;  //  v7 = 340282346638528859811704183484516925440.000000
    std::cout << " v8 = (v8)$" << std::endl;  //  v8 = 255
    std::cout << "v10 = (v10)$" << std::endl; // v10 = 10
    std::cout << "v11 = (v11)$" << std::endl; // v11 = string
    std::cout << "v12 = (v12)$" << std::endl; // v12 = 97
    std::cout << "v13 = (v13)$" << std::endl; // v13 = 36
}

That will also fix below code:

args : std::span<Arg> = (argv, (argc as std::optional<std::size_t>).value()); // using `argc as std::size_t` will fail with static_assert
c : std::ifstream = args.back(); // no crash, works as expected

Or with unsafe_narrow:

args : std::span<Arg> = (argv, cpp2::unsafe_narrow<std::size_t>(argc));
c : std::ifstream = args.back(); // no crash, works as expected

Closes #104

@JohelEGP
Copy link
Contributor

JohelEGP commented Nov 1, 2022

    v1  := max_uint8  as std::int8_t;   // throws narrowing_error

That seems like a departure from https://wg21.link/P2392, which matches

Otherwise, if x can be bound to a refto(T,x), then x as T means a refto(T,x) bound to x.

I think you can confirm this with Circle.

@hsutter
Copy link
Owner

hsutter commented Nov 1, 2022

Thanks! General timing note (maybe I should paste/link this in a few other PRs/issues if this isn't visible enough):

My responses will be delayed because of the Kona meeting next week, which will be the first face-to-face ISO C++ meeting in nearly three years, and the week before and after the meeting are generally also consumed with travel and committee-related work. I plan to be back to a normal schedule after that and will be catching up with PRs and issues then. Thanks!

@hsutter
Copy link
Owner

hsutter commented Nov 1, 2022

But quick ack on this thread:

My current thought is to require narrow or narrow_cast (semantically or syntactically) for narrowing conversions, and is one of the things still to be added to the initial partial as implementation. Major design choices:

  1. Require the explicit "narrow" syntax because potentially-lossy conversions should stand out, and make as not support narrowing conversions.
  2. The one being suggested here, which is to have as do a narrow which throws on data loss, but this adds a branch and makes the operation throwing.
  3. Have as do a narrow_cast which is a "trust me" operation.

Some considerations:

  • (1) and (2) are safe by default.
  • (2) and (3) may be friendlier for generic code, if experience shows that allowing as T is generally useful even when potentially narrowing.

That currently tilts things in favor of (2) in my mind, which is the direction of this PR, but I'd be happy to see usability evidence/examples arguing for any these options as we consider this PR.

@JohelEGP P2392 is a living document that will be updated based on this and other experience (and committee feedback) and can be expected to be a little out of sync sometimes. But for max_uint8 as std::int8_t I don't think the bullet you quoted relates because you can't bind a uint8_t to an int8_t& (in Cpp1 syntax)? But I'm writing this quickly and may be missing something.

@filipsajdak
Copy link
Contributor Author

Thanks for the feedback. I assumed that you are preparing for the Kona meeting and I am not expecting that you will process these PRs at the moment. I hope I will be able to join one of the committee meetings in the future.

Regarding this PR. I like the cleanliness of the x as T syntax (easy to learn and easy to read). I agree that narrowing operations shall be visible and shall stand out. I was thinking about making it

x narrowed_to T; // or something that you can easily read

The good thing is that we can have static_assert that can inform not to use the as keyword for narrowing cases and to use narrowed_to (similar case as with pointer arithmetic where we recommend using std::span or gsl::span). That approach will make writing generic code harder but will guide the programmer and will stand out like this:

args : std::span<Arg> = (argv, argc narrowed_to std::size_t);
c : std::ifstream = args.back();

@feature-engineer
Copy link

x narrowed_to T;

Maybe x narrowed_as T; is better? or x lossy_as T;?

@filipsajdak
Copy link
Contributor Author

We can brainstorm on the keyword name to make it clear. Let's first agree on the general idea of how to tackle that cases.

@feature-engineer
Copy link

feature-engineer commented Nov 2, 2022

  1. Require the explicit "narrow" syntax because potentially-lossy conversions should stand out, and make as not support narrowing conversions.

    1. The one being suggested here, which is to have as do a narrow which throws on data loss, but this adds a branch and makes the operation throwing.

    2. Have as do a narrow_cast which is a "trust me" operation.

I have a broader language suggestion that may be useful in more cases than just narrow casts:

Maybe we could indicate whether we want to throw on mismatch, static assert, or return an optional instead by adding ? for optional and ! for static assert (leaving throwing as default)?
e.g. x as? T; would return optional which is empty if no match exists, x as! T; would fail at compile time if no match exists (or if x's value is not defined at compile time), and x as T; would throw an exception if no match exists.

The motivation for having built-in defaults for matching when no pattern applies is easily delegating control over what happens to the caller.

In the context of this problem, this would mean we just match on cases where there's no data loss, and have the caller decide if they want an optional, a thrown exception, or a static assertion in case there's data loss by selecting the type of match in the calling site.

@JohelEGP
Copy link
Contributor

JohelEGP commented Nov 2, 2022

I think it makes sense for

    v1  := max_uint8  as std::int8_t;   // throws narrowing_error

to use narrow instead. After all, it's safe, just like when casting std::variants and class hierarchies with as. Perhaps a difference is that the latter two the compiler can optimize when the as is preceded by a corresponding is check. Would the same work with v1? Looking at the paper, in a generic context, is would do something different, as it matches

Otherwise, if C(x) is valid and convertible to bool, then x is C means C(x).

@JohelEGP P2392 is a living document that will be updated based on this and other experience (and committee feedback) and can be expected to be a little out of sync sometimes. But for max_uint8 as std::int8_t I don't think the bullet you quoted relates because you can't bind a uint8_t to an int8_t& (in Cpp1 syntax)? But I'm writing this quickly and may be missing something.

You're right. I think it actually matches

Otherwise, if typeof(x) is implicitly convertible to T, then x as T means to convert x to an rvalue of type
T (e.g., including the case where both are Pointer types and this is a static upcast).

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Nov 3, 2022

Thanks to @feature-engineer I decided to play with exclamation marks and question marks. I don't like as! or as? - what I like with the pure as is that you can read it and it will make sense:

pass(x as std::uint16_t);

I decided to try ! and ? at the end.

So if you want to be strict about casting you can write code adding exclamation marks after the type:

main: () -> int = {
    max_uint8  := std::numeric_limits<std::uint8_t>::max();
    max_int8   := std::numeric_limits<std::int8_t>::max();
    max_double := std::numeric_limits<double>::max();
    max_float  := std::numeric_limits<float>::max();

    v   := max_uint8  as std::uint16_t!; // OK

    v1  := max_uint8  as std::int8_t!;   // compile-time error
    v2  := -1         as std::uint8_t!;  // compile-time error
    v3  := -1.0       as std::uint8_t!;  // compile-time error
    v4  := max_double as float!;         // compile-time error
 
    v5  := -1.0      as long!;           // compile-time error
    v6  := -1        as double!;         // compile-time error
    v7  := max_float as double!;         // OK
    v8  := max_uint8 as std::uint16_t!;  // OK
    v9  := max_int8  as std::uint16_t!;  // compile-time error
    v10 := 10        as std::size_t!;    // compile-time error
    v11 := max_uint8 as std::int16_t!;   // OK
}

If you don't want static asserts but you don't want to have exceptions you can use question marks to get std::optional:

main: () -> int = {
    max_uint8  := std::numeric_limits<std::uint8_t>::max();
    max_int8   := std::numeric_limits<std::int8_t>::max();
    max_double := std::numeric_limits<double>::max();
    max_float  := std::numeric_limits<float>::max();

    v   := max_uint8  as std::uint16_t?; // OK, 255

    v1  := max_uint8  as std::int8_t?;   // OK, (empty)
    v2  := -1         as std::uint8_t?;  // OK, (empty)
    v3  := -1.0       as std::uint8_t?;  // OK, (empty)
    v4  := max_double as float?;         // OK, (empty)
 
    v5  := -1.0      as long?;           // OK, -1
    v6  := -1        as double?;         // OK, -1.000000
    v7  := max_float as double?;         // OK, 340282346638528859811704183484516925440.000000
    v8  := max_uint8 as std::uint16_t?;  // OK, 255
    v9  := max_int8  as std::uint16_t?;  // OK, 127
    v10 := 10        as std::size_t?;    // OK, 10
    v11 := max_uint8 as std::int16_t?;   // OK, 255

    std::cout << "v = (v)$" << std::endl; 
    std::cout << "v1 = (v1)$" << std::endl; 
    std::cout << "v2 = (v2)$" << std::endl; 
    std::cout << "v3 = (v3)$" << std::endl; 
    std::cout << "v4 = (v4)$" << std::endl; 
    std::cout << "v5 = (v5)$" << std::endl; 
    std::cout << "v6 = (v6)$" << std::endl; 
    std::cout << "v7 = (v7)$" << std::endl; 
    std::cout << "v8 = (v8)$" << std::endl; 
    std::cout << "v9 = (v9)$" << std::endl; 
    std::cout << "v10 = (v10)$" << std::endl; 
    std::cout << "v11 = (v11)$" << std::endl; 
}

If you don't want to have static_assert checks nor use std::optional you can stay with just plain as and you will get a runtime check with an exception on narrowing error.

Currently, optional_as is using exceptions (captures them) - it is only a prototype to play with it. We shall use std::expected or something similar to handle narrowing error cases and throw only in specific cases.

My only doubt is that there is a very subtle difference between these cases:

pass(x as std::uint8_t?);
pass(x as std::uint8_t!);
pass(x as std::uint8_t);

probably a better option would be to have something like

pass(x as_optional std::uint8_t);
pass(x as_strict std::uint8_t);
pass(x as std::uint8_t);

or

pass(x as optional<std::uint8_t>);
pass(x as strict<std::uint8_t>);
pass(x as std::uint8_t);

PS: have you noticed how easy it is to print std::optional - it is awesome!

@gregmarr
Copy link
Contributor

gregmarr commented Nov 3, 2022

I don't like the ! and ? after the type, too easy to miss. I was thinking along the lines of x as optional int gives std::optional<int>, x as expected int gives std::expected<int>, x as narrow int allows narrowing conversions.

@jcanizales
Copy link

My understanding of the future of exceptions (per https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r4.pdf ) is that std::expected is not going to be a recommended tool. So enshrining it in the language itself might be counterproductive.

@filipsajdak
Copy link
Contributor Author

Ok. Will read that through.

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Nov 4, 2022

Continuing my experiments with the syntax I have pushed the new update that allows using two syntaxes:

pass(x as std::uint8_t?);
pass(x as std::uint8_t!);
pass(x as std::uint8_t);

or

pass(x as std::optional<std::uint8_t>);
pass(x as cpp2::strict<std::uint8_t>);
pass(x as std::uint8_t);

Taking a look at the possible options and current experiments probably strict mode shall be the default one as it is not narrowing.

Possibly narrowing cast shall be emphasized instead of a strict case so maybe we should change it to something similar to:

pass(x as std::optional<std::uint8_t>);
pass(x as std::uint8_t); // static_assert failure if possibly dangerous
pass(x as cpp2::narrowing<std::uint8_t>);

@feature-engineer
Copy link

My understanding of the future of exceptions (per https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r4.pdf ) is that std::expected is not going to be a recommended tool. So enshrining it in the language itself might be counterproductive.

I don't think your assessment is correct - according to https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/n4907.pdf, P0323R12 (std::expected) is going to be included in the language for C++23.

It seems that it'd be quite some time before p0709, or P2232, or some other yet to be proposed proposal are going to deprecate std::expected.

@jcanizales
Copy link

I guess it's just a matter of how forward you look. The paper I linked is clear about what you said:

In the committee, we are advancing the proposed std::experimental::expected<SuccessResult,ErrorResult> [P0323R3]. As noted in the paper: “C++ already supports exceptions and error codes, expected would be a third kind of error handling.”

But that paper is Herb writing 60 pages about how he dislikes adding (with std::expected) yet another way of handling errors, and would like instead a unification that covers all requirements.

@filipsajdak filipsajdak force-pushed the fsajdak-add-narrowing-as-cast branch from 3fb4d2f to 145bc7e Compare November 8, 2022 08:10
@maddanio
Copy link

maddanio commented Nov 14, 2022

Thanks to @feature-engineer I decided to play with exclamation marks and question marks. I don't like as! or as? - what I like with the pure as is that you can read it and it will make sense:

Hey, very interesting discussion. Wondering why you don't like them? They are directly at the operator that does the action so they seem very clear and upfront to me. to me putting them at the end feels like trying to form normal language sentences which feels like a bit of an illusion: they are operators in a mathematical sense. I kinda find them very beautiful as they are very short and to the point yet very expressive. a bit in contrast to the second syntax proposal you have that seems to reintroduce the noise that these operators reduce.

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Nov 15, 2022

Yes, you are right. I did the experiments and came to that conclusion as well.

To be honest I am waiting for a feedback from Herb as I don't know if he has something in mind regarding this feature.

Probably I can make all these syntaxes working. I am currently waiting to clarify the defaults and how to distinguish other cases from the default behavior.

@filipsajdak
Copy link
Contributor Author

One thing I have learned is that we probably want to have default version that is statically checked on compile time. If that is not working than we shall be able to choose if we want to have exception or e.g. optional.

@switch-blade-stuff
Copy link

I guess it's just a matter of how forward you look. The paper I linked is clear about what you said:

In the committee, we are advancing the proposed std::experimental::expected<SuccessResult,ErrorResult> [P0323R3]. As noted in the paper: “C++ already supports exceptions and error codes, expected would be a third kind of error handling.”

But that paper is Herb writing 60 pages about how he dislikes adding (with std::expected) yet another way of handling errors, and would like instead a unification that covers all requirements.

From my understanding, std::expected is created as the "modern" way of handling error codes, instead of using a C-style out parameter. While it is an alternative to exceptions, i personally see a case for both types of error handling: exceptions being used for "unexpected" error conditions, while error codes being used when the error condition is "expected".

Ex. when connecting to a web socket, you expect to be returned some kind of a handle object, however you also may expect an error code. IMO error codes fit this use-case more since a network error is an actually expected (although erroneous) result. Normally, you would use a nullable return value or std::optional and an out parameter for the error code, but now you can handle both with std::expected.

Although, granted, out std::expected prevents RVO if you need to further work with the stored value.

@jcanizales
Copy link

The way I read that proposal I linked, "unexpected errors" as in "programmer errors" should be asserting, rather than throwing exceptions that propagate all the way out and terminate the program there.

If you're programmaticaly handling errors, they are by definition errors that you anticipate could happen. Like your socket example. I can't do what I wanted to do (normal control flow), but I can do something about the error. The proposal argues in several places, and I agree, that normal and error control flow in C++ should be separate. I think separable is a better way to call it (as nothing ever stops you from handling errors immediately next to the call site). It has to be that way for constructors, and constructors are central for RAII. And it has to be that way to let you do compositions as simple as chaining methods, f(g(x)).

That said, the new exceptions mechanism proposed is very much like "using std::expected behind the scenes", just with language support to act like exceptions. The details on that behind-the-scenes are in section 4.1.1. Section 4.1.6 shows the new mechanism side-by-side as a replacement for std::expected, and section 5 says one way to measure the success of the new exceptions would be if they actually replace the use of std::expected for handling errors.

So maybe the best course for cppfront is to enshrine std::expected as the way to return errors in the generated C++ code. And then eventually to have the new syntax hide it.

@switch-blade-stuff
Copy link

So maybe the best course for cppfront is to enshrine std::expected as the way to return errors in the generated C++ code. And then eventually to have the new syntax hide it.

I would definitely not want to have some kind of 3d way of handling runtime errors in addition to classic exceptions and std::expected, so this approach seems to be the most optimal as of now.

Perhaps opening a new issue would be best? @jcanizales

@switch-blade-stuff
Copy link

"unexpected errors" as in "programmer errors" should be asserting, rather than throwing exceptions that propagate all the way out and terminate the program there

Personally, I think classic exceptions are better in this case, since they handle RAII and can be used to "catch all non-fatal errors".

Ex. in a game engine's editor you could catch all non-fatal exceptions and log the error message, then pause the game and allow the user to fix it without crashing.

Assertions are great for debugging, but do not allow you to preform this kind of error handling and may cause resource leaks because of unhandled RAII destructors.

@jcanizales
Copy link

Classic exceptions are already forbidden in many types of codebases, for the range of issues the paper describes. The point of the proposal is to fix those issues (like std::expected does) but without losing the advantages they have.

When I wrote "programmer error" I meant "the code is buggy" errors. The user can't recover from that, and trying to recover programmatically is problematic because there's no knowing how much of your program state is already corrupted by the time you detect the issue. Assertion libraries like glog usually do print an error message and a stack trace before terminating the program. Chrome survives assertions in tabs without crashing by isolating each in its own OS process. That said, like with all guidance I'm sure there's no shortage of cases in which it's not the best. New exceptions are also better for this than old exceptions in any case.

@filipsajdak
Copy link
Contributor Author

I was not thinking about runtime assertions but static assertions that will make your build to fail. No impact on runtime error handling. It should prevent developer from writing potentially buggy code. Additionally it should provide guidance what to do instead.

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Nov 26, 2022

I have done more experiments and tests around that topic. I like the idea of having default as as strict as possible that ends with a static assertion if it is a suspicious expression that may not work.

Currently, it works like that:

template< typename C, typename... Ts >
auto as(Ts... ts) -> auto {
  static_assert(program_violates_type_safety_guarantee<Ts...>, "safe 'as' cast is not defined");
}

I realized that this interfered with the inspect implementation as there are used constexpr ifs and requires expressions that are not working when used on expression that ends up with static assertions.

In regression-tests/pure2-inspect-expression-in-generic-function-multiple-types.cpp2 file there is a code:

inspect x -> std::string {
  is int = "integer " + std::to_string(x as int);
  is std::string = '"' + x as std::string + '"';
  is _   = "not an int or string";
}

is int = "integer " + std::to_string(x as int); line compiles to (the original code is oneliner I refortmated it for easier reading)

if (cpp2::is<int>(__expr)) {
  if constexpr( requires{ "integer " + std::to_string(cpp2::as<int>(x)); } )
    if constexpr( std::is_convertible_v<CPP2_TYPEOF( "integer " + std::to_string( cpp2::as<int>(x) ) ),
                                        std::string> )
      return "integer " + std::to_string(cpp2::as<int>(x)); 
    else
      return std::string{};
  else 
    return std::string{}; 
}

If cpp2::as<int>(x) will end up with static assert requires expression will not return false and compilation will fail. I have made an workaround for it - I am preparing the clean commit to update this PR.

TLDR: I have renamed all cpp2::as function to cpp2::internal_as and added additional template argument that specify failure policy (static assertions or throwing expceptions). internal_as is called by one as function with default policy:

enum class cast_failure_policy {
    static_assertions,
    throws
};

// ...

template< typename C, typename X >
auto as(X const& x) -> decltype(auto) {
    return internal_as<cast_failure_policy::static_assertions, C>(x);
}

cppfront emits statement to the string that is later printed using that string:

return_prefix = "{ if constexpr( requires{" + statement + ";} ) if constexpr( std::is_convertible_v<CPP2_TYPEOF((" + statement + "))," + result_type + "> ) return ";

In statement string I am replacing all cpp2::as to cpp2::internal_as<cpp2::cast_failure_policy::throws, - when expression throws inside requires call it will end up with false.

auto internal_statement = replace_all(statement, "cpp2::as<", "cpp2::internal_as<cpp2::cast_failure_policy::throws, ");

@filipsajdak filipsajdak force-pushed the fsajdak-add-narrowing-as-cast branch from 145bc7e to 11c9d47 Compare November 28, 2022 23:53
@hsutter
Copy link
Owner

hsutter commented Jan 5, 2023

Thanks Filip. I appreciate the as std::optional<T> and I was convinced it sounded good for a while, but I'd really like to avoid it for now. Also I hope we can avoid having a separate inspect_as... just is and as ought to be enough, right?

Let me take a browse through the latest tonight and either suggest changes or perhaps take some parts and start merging them in a separate commit... more soon... and thanks again, I know we've iterated a lot on this one and you've put a lot of work into it, thanks for your patience as I wrap my head around the various parts being suggested.

hsutter added a commit that referenced this pull request Jan 5, 2023
@hsutter
Copy link
Owner

hsutter commented Jan 5, 2023

OK, I've taken parts of this in a separate commit linked above... please check that I didn't make mistakes in merging that subset?

I should have the literal not-really-narrowing allowed cases (I think?), and this one now does the assertion:

v2  := -1         as std::uint8_t; // static_assert failure

I didn't take the as optional parts, and I'm still interested in the "nonesuch" removal. More soon...

@filipsajdak
Copy link
Contributor Author

OK, I will collect everything I have learned so far. The std::optional part is just a syntactic sugar. There is a deeper issue here. I will send it in a couple of hours (hopefully before you wake up).

@filipsajdak
Copy link
Contributor Author

I will explain more to you when I finish my other work. TLDR: it probably, will not be possible to use a static asserted version of as() functions as it will break the inspect expression (that is why I have introduced the inspect_as approach.

I have taken a few steps back, reverted some of the changes, and rechecked it. nonsuch approach is not bad - it has its issues, but it makes inspect working.

hsutter added a commit that referenced this pull request Jan 5, 2023
…ntract precondition

Signed/unsigned conversions to a not-smaller type are handled as a precondition, and trying to cast from a source value that is in the half of the value space that isn't representable in the target type is flagged as a `Type` safety contract violation
hsutter added a commit that referenced this pull request Jan 5, 2023
@hsutter
Copy link
Owner

hsutter commented Jan 5, 2023

Thanks. After more thought, I think your use cases are pointing out a useful distinction, including especially your argc as std::size_t which I think will be common and that (as in this case) generally won't fail in practice, but we want a safety guarantee... so my realization is that the right tool is probably a precondition contract. (Disclaimer: I'm not a fan of dynamic checks, hear me out...)

Here's how I'm thinking about it:

Narrowing case Intentional use in real code is Actual value loss is Should be handled as
Integer literals we can check at compile time Common (e.g., 1 as u16) Impossible (if prevented statically) Static check - already done in the first commit above
Float to int, int to float, int to smaller size int Rare? Likely unsafe_narrow - unsafe operation, static check where possible (certainly for int-to-smaller-int), require user to be explicit to self-document and prevent accidental use - added in first commit above
int to same or larger int, differs in signedness only Common (e.g., argc as size_t) Unlikely as with precondition - safe with dynamic check - now implemented in the second commit above, ede2df2)

In the last case I think a dynamic Type safety contract check is warranted. My aim for the contracts design is for the contract check to be cheap in the fast path (the check is branch predictor friendly), but in any case such casts should virtually never appear in a hot loop anyway (and if so can be moved out).

For convenience, here's my current implementation (please point out flaws if I missed anything):

//  Signed/unsigned conversions to a not-smaller type are handled as a precondition,
//  and trying to cast from a value that is in the half of the value space that isn't
//  representable in the target type C is flagged as a Type safety contract violation
template< typename C >
inline constexpr auto as(auto const& x) -> auto
    requires (
        std::is_integral_v<C> &&
        std::is_integral_v<CPP2_TYPEOF(x)> &&
        std::is_signed_v<CPP2_TYPEOF(x)> != std::is_signed_v<C> &&
        sizeof(CPP2_TYPEOF(x)) <= sizeof(C)
    )
{
    const C c = static_cast<C>(x);
    Type.expects(   // precondition check: must be round-trippable => not lossy
        static_cast<CPP2_TYPEOF(x)>(c) == x && (c < C{}) == (x < CPP2_TYPEOF(x){}),
        "dynamic lossy narrowing conversion attempt detected" CPP2_SOURCE_LOCATION_ARG
    );
    return c;
}

With this, your opening example now works:

args : std::span<Arg> = (argv, argc as std::size_t);  // now ok
c : std::ifstream = args.back(); // no longer crashes

And this example works:

x: std::int8_t = 127;
std::cout << x as std::uint8_t;    // now ok

And this example is flagged:

x: std::int8_t = -1;
std::cout << x as std::uint8_t;    // "as" fails

In this last case, the default Type contract violation handler prints the message:

Type safety violation: dynamic lossy narrowing conversion attempt detected

or, with the -a switch, the default message has the file, line number, and function name:

demo.cpp2(7) main: Type safety violation: dynamic lossy narrowing conversion attempt detected

and one nice thing about the Cpp2 contracts system is that this is easy to customize with a violation handler, to integrate with any contract/logging framework a given project is already using.

I think I like the above approach because each of the three cases now feels like it's handled in an appropriate way, and we are still safe by default in all cases -- in the first we guarantee safety statically, in the second the programmer has to explicitly opt into an unsafe_ operation, and in the third we guarantee safety dynamically while allowing the convenience of a nice syntax for common operation that people need often and know if always/nearly-always safe and would be annoyed if it didn't work.

Perhaps this treatment of the third case will take most of the sting out of unsafe_narrow/narrow_cast, and diminish the resistance to using them.

How does this look?


In terms of your longer list of cases:

max_uint8  := std::numeric_limits<std::uint8_t>::max();
max_int8   := std::numeric_limits<std::int8_t>::max();
max_double := std::numeric_limits<double>::max();
max_float  := std::numeric_limits<float>::max();

v1  := max_uint8  as std::int8_t;  // dynamic failure

For v1 we now get a contract violation, where the default message is demo.cpp2(7) main: Type safety violation: dynamic lossy narrowing conversion attempt detected.

v2  := -1         as std::uint8_t; // static_assert failure
v4  := max_double as float;        // static_assert failure
v6  := -1        as double;        // static_assert failure

Does that seem to cover the cases now?

I think the only ones that are not covered now are integer/floating-point narrowing, which still fail statically by yielding nonesuch that leads to template spew when trying to use the result, rather than by giving a nice static_assert message. Giving a nicer message will probably not be possible without something like the inspect_as you were doing. But I think we now have everything but this int/float case, right?

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Jan 5, 2023

Thank you for your patience. I tried to gather all cases I saw regarding arithmetic/literal types.

Case 1

x := 1000 as u8;

Case 2

out_of_scope := -1.0;
x := out_of_scope as u8;

Case 3

x := 1000 as std::optional<u8>; // currently buggy

Case 4

x := 1000;
std::cout << inspect (x) -> std::string {
    is u8 = cpp2::to_string(x as u8);
    is _ = "(no match)";
} << std::endl;

Case 5

x := 1000;
if x is int {
  print(x as u8);
}

Case 6

x := 128;
if x is (in(0,255)) {
  print(x as u8);
}
Case nonsuch static_cast static_cast + inspect_as static_cast + inspect_as + optional
1 ❌ false positive
2 ❌ false positive
3 ❌ false positive, ‼️ bug! (1) ❌ false positive, ‼️ bug! (1) ❌ false positive, ‼️ bug! (1) ✅ (accidentally fix a bug)
4 static_cast in requires expression fails the build
5 ‼️ prints std::any bug! (2)
6 as doesn't know that it is ok as doesn't know that it is ok as doesn't know that it is ok as doesn't know that it is ok

(1) 1000 as std::optional<u8> - matches requires (!std::is_same_v<C, X> && requires { C{x}; }) as it looks like std::optional<std::uint8_t>{x} succedes, see: https://godbolt.org/z/G3WG3s9rM
(2) it prints std::any as there is an overload of cpp2::to_string(std::any const&) that matches everything that does not match other overloads. It makes to_string(...) practically unreachable. It could be solved by changing the problematic overload to:

template<typename T>
    requires std::is_same_v<T, std::any>
inline auto to_string(T const&) -> std::string {
    return "std::any";
}

In case of the nonsuch case, it might be good to provide an additional overload:

inline auto to_string(nonesuch_ const&) -> std::string {
    return "(nonsuch - as cast failed returning nonsuch)";
}

@filipsajdak
Copy link
Contributor Author

Probably I need to add your current solution to the picture.

@filipsajdak
Copy link
Contributor Author

TLDR: everything is fine with the static_cast way until you will try to solve case 4.

Case 6 is something I have yet to find a solution to.

@filipsajdak
Copy link
Contributor Author

I have corrected case 3 as it should be marked as ❌ also for cases static_cast and static_cast+inspect_as.

@filipsajdak
Copy link
Contributor Author

Bug discovered with (1) probably also will give false positives for other types, e.g.

v := 10;
x := v as std::vector<double>;

@filipsajdak
Copy link
Contributor Author

I have run tests on the latest main. v2, v4, and v6 are captured at compile-time. v1 is captured by Type.expects() at runtime.

    // v1  := max_uint8  as std::int8_t;  // Type.expects() failure
    // v2  := -1         as std::uint8_t; // static_assert failure
    v3  := -1.0       as std::uint8_t;    // returns nonsuch, prints std::any
    // v4  := max_double as float;        // static_assert failure

    v5  := -1.0      as long;             // returns nonsuch, prints std::any
    // v6  := -1        as double;        // static_assert failure
    v7  := max_float as double;           // OK v7 = 340282346638528859811704183484516925440.000000
    v8  := max_uint8 as std::uint16_t;    // OK v8 = 255
    v9  := max_int8  as std::uint16_t;    // OK v9 = 127
    v10 := 10        as std::size_t;      // OK v10 = 10
    v11 := "string"  as std::string;      // OK v11 = string
    v12 := 'a'       as int;              // OK v12 = 97
    v13 := 36        as char;             // OK v13 = 36

@filipsajdak
Copy link
Contributor Author

To demonstrate a problem with Case 4 on the current main, you can run:

main: () -> int = {

    x := double(1000);
    std::cout << inspect (x) -> std::string {
        is float = cpp2::to_string(x as float);
        is u8 = cpp2::to_string(x as u8);
        is _ = "(no match)";
    } << std::endl;
}

Expected: print (no match)
Actual:

../tests/as_cast.cpp2... ok (all Cpp2, passes safety checks)

In file included from ../tests/as_cast.cpp:2:
include/cpp2util.h:750:5: error: static_assert failed due to requirement 'program_violates_type_safety_guarantee<float, double>' "No safe 'as' cast from larger to smaller floating point precision - if you're sure you want this unsafe conversion, consider using `unsafe_narrow<T>()` to force the conversion"
    static_assert(
    ^
../tests/as_cast.cpp2:5:85: note: in instantiation of function template specialization 'cpp2::as<float, double>' requested here
        if (cpp2::is<float>(__expr)) { if constexpr( requires{cpp2::to_string(cpp2::as<float>(x));} ) if constexpr( std::is_convertible_v<CPP2_TYPEOF((cpp2::to_string(cpp2::as<float>(x)))),std::string> ) return cpp2::to_string(cpp2::as<float>(x)); else return std::string{}; else return std::string{}; }
                                                                                    ^
../tests/as_cast.cpp2:5:79: error: cannot pass expression of type 'void' to variadic function
        if (cpp2::is<float>(__expr)) { if constexpr( requires{cpp2::to_string(cpp2::as<float>(x));} ) if constexpr( std::is_convertible_v<CPP2_TYPEOF((cpp2::to_string(cpp2::as<float>(x)))),std::string> ) return cpp2::to_string(cpp2::as<float>(x)); else return std::string{}; else return std::string{}; }
                                                                              ^
../tests/as_cast.cpp2:5:168: error: cannot pass expression of type 'void' to variadic function
        if (cpp2::is<float>(__expr)) { if constexpr( requires{cpp2::to_string(cpp2::as<float>(x));} ) if constexpr( std::is_convertible_v<CPP2_TYPEOF((cpp2::to_string(cpp2::as<float>(x)))),std::string> ) return cpp2::to_string(cpp2::as<float>(x)); else return std::string{}; else return std::string{}; }
                                                                                                                                                                       ^
../tests/as_cast.cpp2:5:228: error: cannot pass expression of type 'void' to variadic function
        if (cpp2::is<float>(__expr)) { if constexpr( requires{cpp2::to_string(cpp2::as<float>(x));} ) if constexpr( std::is_convertible_v<CPP2_TYPEOF((cpp2::to_string(cpp2::as<float>(x)))),std::string> ) return cpp2::to_string(cpp2::as<float>(x)); else return std::string{}; else return std::string{}; }
                                                                                                                                                                                                                                   ^
4 errors generated.

@filipsajdak
Copy link
Contributor Author

Bug (1) -> #199
Bug (2) -> #200

@hsutter
Copy link
Owner

hsutter commented Jan 6, 2023

OK, I think the next commit fixes all of these:

x := double(1000);
std::cout << inspect (x) -> std::string {
    is float = cpp2::to_string(x as float);
    is u8 = cpp2::to_string(x as u8);
    is _ = "(no match)";
} << std::endl;

Expected: print (no match)

This now prints (no match).

Case 1

x := 1000 as u8;

This now static_asserts with 'Literal cannot be narrowed using 'as' - if you're sure you want this, use unsafe_narrow() to force the conversion'.

And it doesn't static-assert if the number is, say, 100.

Case 2

out_of_scope := -1.0;
x := out_of_scope as u8;

This now static_asserts with ''as' does not allow unsafe narrowing conversions - if you're sure you want this, use unsafe_narrow() to force the conversion'.

Case 3

x := 1000 as std::optional<u8>; // currently buggy

This now static-asserts with 'No safe 'as' cast available - please check your cast'.

Case 4

x := 1000;
std::cout << inspect (x) -> std::string {
    is u8 = cpp2::to_string(x as u8);
    is _ = "(no match)";
} << std::endl;

This now prints "(no match)". The result is the same even if the value is changed to 100, which is correct because it's not a u8. (In the future the way to match 'can this be converted to u8' will be to write as u8 = ..., but I don't plan to implement that as constraints for inspect the near future.)

Case 5

x := 1000;
if x is int {
  print(x as u8);
}

This now static-asserts with ''as' does not allow unsafe narrowing conversions - if you're sure you want this, use 'unsafe_narrow<T>()' to force the conversion'.

Case 6

x := 128;
if x is (in(0,255)) {
  print(x as u8);
}

This correctly static-asserts with ''as' does not allow unsafe narrowing conversions - if you're sure you want this, use 'unsafe_narrow<T>()' to force the conversion'.

@hsutter hsutter closed this in 313444f Jan 6, 2023
@filipsajdak filipsajdak deleted the fsajdak-add-narrowing-as-cast branch January 12, 2023 21:22
Azmah-Bad pushed a commit to Azmah-Bad/cppfront that referenced this pull request Feb 24, 2023
…ng, pull `unsafe_narrow` into `cpp2::`
Azmah-Bad pushed a commit to Azmah-Bad/cppfront that referenced this pull request Feb 24, 2023
…" as contract precondition

Signed/unsigned conversions to a not-smaller type are handled as a precondition, and trying to cast from a source value that is in the half of the value space that isn't representable in the target type is flagged as a `Type` safety contract violation
Azmah-Bad pushed a commit to Azmah-Bad/cppfront that referenced this pull request Feb 24, 2023
Azmah-Bad pushed a commit to Azmah-Bad/cppfront that referenced this pull request Feb 24, 2023
Azmah-Bad pushed a commit to Azmah-Bad/cppfront that referenced this pull request Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[SUGGESTION] How to cast int to std::size_t
8 participants