Skip to content

[SUGGESTION][FIX] Add safety check support for cpp1 pointers and deduced from other cpp2 declarations #96

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

filipsajdak
Copy link
Contributor

@filipsajdak filipsajdak commented Oct 27, 2022

The current implementation has no safety checks for:

  • deduced pointers from cpp1 or functions that deduce return types,
  • deduced pointers from cpp2 variables or functions,

This change brings support for these cases and makes the below code fail to compile due to safety check errors:

#include <memory>
#include <array>

int* to_pointer(int& i) {
    return &i;
}

std::unique_ptr<int> to_up(int i) {
    return std::make_unique<int>(i);
}

auto to_array() {
    std::array<int, 64> out;
    for(int i = 0; i < std::ssize(out); ++i)
        out[i] = i;
    return out;
}

to_pointer2: (inout i: int) -> auto = {
    return i&;
}

to_value: (inout i: int) -> int = {
    return i;
}

fun_ptr: (p : *int) -> int = {
    return p*;
}

int   g = 44;
int* pg = &g;

main: () -> int = {
    i : int = 42;

    pi  := to_pointer(i);
    pi2 := to_pointer2(i);

    pi   = 0; // caught on cpp1 side
    pi2 += 1; // caught on cpp1 side

    l := :(inout x : _) = x&;

    lpg := pg;

    lpg = 0;  // caught on cpp1 side

    ri := i;
    ri2 := to_value(i);

    up := to_up(123);

    ar := to_array();

    v := fun_ptr(to_pointer(i));
}

cppfront will identify which initializations are suspicious and put them into the cpp2::safety_check() function. Suspicious initialization is initialization by a function or identifier that cppfront has no information about (i.e. don't know the return type or the variable is from cpp1 and we have no sema information about it).

The function checks if it deals with the pointer type. If it is then the pointer is packed into cpp2::safetychecked_pointer that has static_asserts that check if lifetime safety guarantee rules are followed by overloading most operators.

The above code will compile to:

// ----- Cpp2 support -----
#include "cpp2util.h"

#line 1 "../tests/pointer-from-cpp1.cpp2"
#include <memory>
#include <array>

int* to_pointer(int& i) {
    return &i;
}

std::unique_ptr<int> to_up(int i) {
    return std::make_unique<int>(i);
}

auto to_array() {
    std::array<int, 64> out;
    for(int i = 0; i < std::ssize(out); ++i)
        out[i] = i;
    return out;
}

[[nodiscard]] auto to_pointer2(int& i) -> auto;
#line 23 "../tests/pointer-from-cpp1.cpp2"
[[nodiscard]] auto to_value(int& i) -> int;
#line 27 "../tests/pointer-from-cpp1.cpp2"
[[nodiscard]] auto fun_ptr(cpp2::in<int *> p) -> int;
#line 30 "../tests/pointer-from-cpp1.cpp2"

int   g = 44;
int* pg = &g;

[[nodiscard]] auto main() -> int;

//=== Cpp2 definitions ==========================================================

#line 18 "../tests/pointer-from-cpp1.cpp2"

[[nodiscard]] auto to_pointer2(int& i) -> auto{
    return &i; 
}

[[nodiscard]] auto to_value(int& i) -> int{
    return i; 
}

[[nodiscard]] auto fun_ptr(cpp2::in<int *> p) -> int{
    return *p; 
}
#line 33 "../tests/pointer-from-cpp1.cpp2"

[[nodiscard]] auto main() -> int{
    int i { 42 }; 

    auto pi { cpp2::safety_check(to_pointer(i)) }; 
    auto pi2 { to_pointer2(i) }; 

    pi   = 0; // caught on cpp1 side
    pi2 += 1; // caught on cpp1 side

    auto l { [](auto& x) { return &x; } }; 

    auto lpg { cpp2::safety_check(pg) }; 

    lpg = 0;  // caught on cpp1 side

    auto ri { i }; 
    auto ri2 { to_value(i) }; 

    auto up { cpp2::safety_check(to_up(123)) }; 

    auto ar { cpp2::safety_check(to_array()) }; 

    auto v { fun_ptr(cpp2::safety_check(to_pointer(i))) }; 
}

When you will try to compile it with the cpp1 compiler you will end with an error:

% clang++ -std=c++20 -Iinclude/ ../tests/pointer-from-cpp1.cpp        
In file included from ../tests/pointer-from-cpp1.cpp:2:
include/cpp2util.h:568:48: error: static_assert failed due to requirement 'program_violates_lifetime_safety_guarantee<int>' "pointer assignment from integer is illegal"
    constexpr void operator=(X lhs) noexcept { static_assert(program_violates_lifetime_safety_guarantee<X>, "pointer assignment from integer is illegal"); }
                                               ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../tests/pointer-from-cpp1.cpp2:22:10: note: in instantiation of function template specialization 'cpp2::safetychecked_pointer<int *>::operator=<int>' requested here
    pi   = 0; // caught on cpp1 side
         ^
In file included from ../tests/pointer-from-cpp1.cpp:2:
include/cpp2util.h:530:54: error: static_assert failed due to requirement 'program_violates_lifetime_safety_guarantee<int>' "pointer arithmetic is illegal - use std::span or gsl::span instead"
    template <typename X> void operator+= (X) const {static_assert(program_violates_lifetime_safety_guarantee<X>, "pointer arithmetic is illegal - use std::span or gsl::span instead");}
                                                     ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../tests/pointer-from-cpp1.cpp2:23:9: note: in instantiation of function template specialization 'cpp2::safetychecked_pointer<int *>::operator+=<int>' requested here
    pi2 += 1; // caught on cpp1 side
        ^
2 errors generated.

Not all errors are listed at once - as the compilation process stops on errors. Errors are done in a way that they mimic errors that comes from cppfront:

'program_violates_lifetime_safety_guarantee<int>' "pointer arithmetic is illegal - use std::span or gsl::span instead"

Update 2022-10-30

After moving pointer decorators to unqualified_id_node we can now deduce the types based on a declaration of variables and functions that are used to initialize the variable.

fun: (inout i : int) -> * int = {
    return i&;
}

main: () -> int = {
    i := 42;
    p := fun(i);

    fun(i)++;  // captured
    fun(i)--;  // captured
    fun(i)[0]; // captured
    fun(i)~;   // captured

    p = 0; // captured
    p++;   // captured
    p--;   // captured
    p[0];  // captured
    p~;    // captured

    fun(i)* = 0; // works
    p* = 42;     // works
}

Link to previously used working prototypes:

Comment on lines 585 to 591
#define CPP2_SAFETYCHECK(x) []<typename T>(T&& o) { \
if constexpr (std::is_pointer_v<std::remove_cvref_t<T>>) { \
return cpp2::safetychecked_pointer(o); \
} else { \
return std::forward<T>(o); \
} \
}(x)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think cppfront does this elsewhere. But since you're using a macro anyways, I think it'd be better to rely on guaranteed copy elision and return x directly. There are types for which this current implementation will have bad performance, like big aggregates or those for which move is copy.

Suggested change
#define CPP2_SAFETYCHECK(x) []<typename T>(T&& o) { \
if constexpr (std::is_pointer_v<std::remove_cvref_t<T>>) { \
return cpp2::safetychecked_pointer(o); \
} else { \
return std::forward<T>(o); \
} \
}(x)
#define CPP2_SAFETYCHECK(x) []() { \
if constexpr (std::is_pointer_v<std::remove_cvref_t<decltype(x)>>) { \
return cpp2::safetychecked_pointer(x); \
} else { \
return x; \
} \
}()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to pass x to the lambda somehow - your proposed code probably will not work as x is not a global (need to check that). What I did was not use moving but perfect forwarding - that is why the lambda is a template.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, I forgot [&]. Would that suffice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, that has problems with bit-fields, and structured bindings in C++20.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I did was not use moving but perfect forwarding - that is why the lambda is a template.

Yes, but that'll compile to a copy for big PODs. Actually a move, but for big PODs, there's no difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I will test that. If you can think about test cases I can verify them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something like this could work without tripping due to big PODs, bit-fields and structured bindings: https://godbolt.org/z/bovP3vfvo.

template<class U>
using safetychecked_pointer_impl = decltype([]<typename T>(T&& o) {
    if constexpr (std::is_pointer_v<T>) {
        return safetychecked_pointer(o);
    } else {
        return std::forward<T>(o);
    }
}(std::declval<U>()));

#define CPP2_SAFETYCHECK(x) safetychecked_pointer_impl<decltype((x))>(x)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This macro will be used only in declaring variables - to help deduce the type and introduce safety checks. So probably there will be no issue with coping as there will be coping in the end. But current implementation still has issues with assignments to reference.

std::array<double, 64> tmp {1,2,3,4,5,6,7,8,9,0};

void fun_ref() {
    auto& a = CPP2_SAFETYCHECK(tmp);
}

will end up with an error:

error: cannot bind non-const lvalue reference of type 'std::array<double, 64>&' to an rvalue of type 'std::array<double, 64>'
   90 | #define CPP2_SAFETYCHECK(x) []<typename T>(T&& o) { \
      |                             ~~~~~~~~~~~~~~~~~~~~~~~~~
   91 |     if constexpr (std::is_pointer_v<T>) { \
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   92 |         return safetychecked_pointer(o); \
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   93 |     } else { \
      |     ~~~~~~~~~~
   94 |         return std::forward<T>(o); \
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   95 |     } \
      |     ~~~
   96 | }(x)
      | ~^~~
<source>:101:15: note: in expansion of macro 'CPP2_SAFETYCHECK'
  101 |     auto& a = CPP2_SAFETYCHECK(tmp);

That can be solved by using decltype(auto):

#define CPP2_SAFETYCHECK(x) []<typename T>(T&& o) -> decltype(auto) { \
    if constexpr (std::is_pointer_v<T>) { \
        return safetychecked_pointer(o); \
    } else { \
        return std::forward<T>(o); \
    } \
}(x)

The working prototype is here: https://godbolt.org/z/reTqKs18h

I will check how it works with the other types you mentioned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be no copying in some cases, but when x is a prvalue, you've initialized o from x, before o is used to initialize the user-declared variable. I think that'll fail for non-movable non-copyable types. My suggestion to fold the return type of the lambda should be able to make the user-declared variable be initialized directly from x when the type of x is not a pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thank you for your feedback. I decided to remove the macro entirely and switch to the is()/as() functions style.

Instead of the CPP2_SAFETYCHECK macro now there are cpp2::safety_check() functions with overloads that check requirements for types. Probably we can tweak the performance. My goal for now is to make it working for all cases.

I have tested the implementation with bitfields and unique_ptr:

auto up = std::make_unique<int>(42);

void fun_noncopyable() {
    auto a = safety_check(up);
    auto b = safety_check(std::make_unique<int>(44));
}

struct S {
    unsigned int b : 3;
} s{ 3 };

void fun_bitfields() {
    auto a = safety_check(s.b);
}

Previous cases still work:

std::array<double, 64> tmp {1,2,3,4,5,6,7,8,9,0};

void fun_ref() {
    auto& a = safety_check(tmp);
}

void fun_copy() {
    auto a = safety_check(tmp);
}

void fun_rvalue() {
    auto a = safety_check(std::array<double, 64>{9,8,7,6,5,4,3,2,1});
}

void fun_ptr() {
    auto a = safety_check(&tmp);
}

Working prototype: https://godbolt.org/z/4n5Yraend

@filipsajdak
Copy link
Contributor Author

Forget to add link to working prototype: https://godbolt.org/z/Gc8K4vnrM

@filipsajdak filipsajdak force-pushed the fsajdak-add-support-for-cpp1-pointers branch from 50e2993 to 401834d Compare October 27, 2022 22:45
@filipsajdak
Copy link
Contributor Author

This change has more issues (e.g. pointers to functions, moving of lvalue, or problem with lack of closing parentheses when initialization happens together with UFCS).

I already have a fix. Will push them soon.

@filipsajdak filipsajdak changed the title [SUGGESTION][FIX] Add safety check support for cpp1 pointers Draft: [SUGGESTION][FIX] Add safety check support for cpp1 pointers Oct 29, 2022
@filipsajdak filipsajdak force-pushed the fsajdak-add-support-for-cpp1-pointers branch from 401834d to 1e99b0d Compare October 30, 2022 00:49
@filipsajdak filipsajdak changed the title Draft: [SUGGESTION][FIX] Add safety check support for cpp1 pointers [SUGGESTION][FIX] Add safety check support for cpp1 pointers Oct 30, 2022
@filipsajdak
Copy link
Contributor Author

The last push solves all issues I have found. Also brings support for deducing pointer types of the cpp2 functions or objects.

After this change cppfront will be able to capture safety violations presented below:

fun: (inout i : int) -> * const int = {
    return i&;
}

main: () -> int = {
    i := 42;
    p := fun(i);

    fun(i)++;  // captured
    fun(i)--;  // captured
    fun(i)[0]; // captured
    fun(i)~;   // captured

    p = 0; // captured
    p++;   // captured
    p--;   // captured
    p[0];  // captured
    p~;    // captured
}

Thanks to better introspection of the cpp2 functions and variables that are known to cppfront (and which return type is explicitly stated). Variables that are initialized using cpp2 variables are known and it is clear to cppfront if it is a pointer or not. That means that fewer functions and variables will be surrounded by cpp2::safety_check() (and will be checked by the cpp1 compiler).

Also, function pointers are now secured with cpp2::safety_check(). If you will try to use carray you will also get an error:

pointer arithmetic is illegal - use std::span or gsl::span instead

@filipsajdak
Copy link
Contributor Author

While working on this PR I found a bug in the get_declaration_of() method - sometimes it blocks deduction of the type. Issue is solved here: #102 and will be needed for this PR to work in all cases.

@filipsajdak
Copy link
Contributor Author

As I reworked handling pointer decorators this PR replaces implementation from: #99

I leave both as one #99 is simple and can be used without this big PR.

@filipsajdak filipsajdak force-pushed the fsajdak-add-support-for-cpp1-pointers branch from 1e99b0d to 6a7c818 Compare October 30, 2022 23:05
@filipsajdak filipsajdak changed the title [SUGGESTION][FIX] Add safety check support for cpp1 pointers [SUGGESTION][FIX] Add safety check support for cpp1 pointers and deduced from other cpp2 declarations Oct 30, 2022
@filipsajdak filipsajdak force-pushed the fsajdak-add-support-for-cpp1-pointers branch 4 times, most recently from 11424f0 to b4ab3e0 Compare November 5, 2022 00:43
@filipsajdak filipsajdak force-pushed the fsajdak-add-support-for-cpp1-pointers branch 2 times, most recently from aafe492 to a04ffe4 Compare November 7, 2022 23:35
@filipsajdak filipsajdak force-pushed the fsajdak-add-support-for-cpp1-pointers branch 7 times, most recently from 59aca2c to 68316dd Compare December 1, 2022 23:36
@filipsajdak
Copy link
Contributor Author

@hsutter in #102 you renamed the function get_declaration_of to get_local_declaration_of. In this PR, I have added a parameter to the old function that changes the behavior of the function to look outside of the named function.

My code looks like this:

auto get_declaration_of(token const& t, bool look_beyond_current_function = false) -> declaration_sym const*
{
// skipping some code
        if (
            decl.declaration->type.index() == declaration_node::function //  Don't look beyond the current function
            && !look_beyond_current_function
        ) {
            return nullptr;
        }
// skipping some code
)

Now, it does not make sense to add such a flag to the get_local_declaration_of - the name already explains that it is all about local declarations. Should I rename the function and use my version with the flag, or should I create a new function?

@filipsajdak filipsajdak force-pushed the fsajdak-add-support-for-cpp1-pointers branch from 68316dd to 7969b7b Compare December 4, 2022 01:44
@filipsajdak filipsajdak force-pushed the fsajdak-add-support-for-cpp1-pointers branch from 7969b7b to 2cbb57e Compare December 6, 2022 22:20
Thanks to relocating pointer_declarators this change introduce deduction
of pointer types also from functions - if function or variable is defined
in cpp2 then cppfront has all information to deduce the proper type
during compilation.
@filipsajdak filipsajdak force-pushed the fsajdak-add-support-for-cpp1-pointers branch from 2cbb57e to 505a088 Compare December 10, 2022 21:22
@hsutter
Copy link
Owner

hsutter commented Dec 26, 2022

See #93 comment... but a note specifically about * pointer-declarators, those should move to type_id_node.

@filipsajdak
Copy link
Contributor Author

Yes, I know. I was trying to move it. I will make that change.

@hsutter
Copy link
Owner

hsutter commented Jan 7, 2023

Thanks again for this. It's a big PR and it would take me a lot of time to review, and perhaps the implementation will be easier a few months from now that we have a clearer type-id node and soon actual types.

So I'm going to close this for now, but if you still feel you want to pursue it perhaps we could revisit this in the summer with fresh PR? Thanks for understanding, and thanks again for all your contributions!

@hsutter hsutter closed this Jan 7, 2023
@filipsajdak
Copy link
Contributor Author

I was going to close it too.

@hsutter
Copy link
Owner

hsutter commented Jan 7, 2023

@filipsajdak Thanks. Same for #196?

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Jan 7, 2023

@hsutter I have ported #196 to the type_id node and simplified the algorithm for deducing pointers.

Could you take a look at it? It deduces pointers pretty well, and if it does not collide with your other work, cppfront will benefit from having it working.

Of course, if it collides, do not hesitate to postpone it.

@filipsajdak filipsajdak deleted the fsajdak-add-support-for-cpp1-pointers branch January 12, 2023 21:22
@filipsajdak filipsajdak restored the fsajdak-add-support-for-cpp1-pointers branch March 22, 2023 08:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants