Skip to content

[libc++] Make drop_view::begin constant time (#72883) #72929

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 28, 2023
Merged

Conversation

casavaca
Copy link
Contributor

@casavaca casavaca commented Nov 21, 2023

As pointed out in #72883, the implementation only needs to return the
value of ranges::next and does not need to obtain the value through
ranges::advance, which causes it to have O(n) complexity in the case
of random-access-sized but non-common range.

Fixes #72883

@casavaca casavaca requested a review from a team as a code owner November 21, 2023 00:40
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Nov 21, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 21, 2023

@llvm/pr-subscribers-libcxx

Author: Hongyu Ouyang (casavaca)

Changes

as pointed out from the github issue #72883 :
> ... the implementation only needs to return the value of ranges::next
and does not need to obtain the value through ranges::advance, which will have
O(n) complexity in the case of random-access-sized but non-common range.


Full diff: https://github.com/llvm/llvm-project/pull/72929.diff

1 Files Affected:

  • (modified) libcxx/include/__ranges/drop_view.h (+2-1)
diff --git a/libcxx/include/__ranges/drop_view.h b/libcxx/include/__ranges/drop_view.h
index f10476f0011e739..2557d030fd61145 100644
--- a/libcxx/include/__ranges/drop_view.h
+++ b/libcxx/include/__ranges/drop_view.h
@@ -104,7 +104,8 @@ namespace ranges {
     constexpr auto begin() const
       requires random_access_range<const _View> && sized_range<const _View>
     {
-      return ranges::next(ranges::begin(__base_), __count_, ranges::end(__base_));
+      auto __dist = ranges::distance(__base_);
+      return ranges::begin(__base_) + std::min<range_difference_t<_View>>(__count_, __dist);
     }
 
     _LIBCPP_HIDE_FROM_ABI

@EricWF
Copy link
Member

EricWF commented Nov 21, 2023

Please add tests or benchmarks of some sort.

@hawkinsw
Copy link
Contributor

hawkinsw commented Nov 21, 2023

Are we missing another optimization here, too?

For the non-const version of begin(), it could be possible for __simple_view<_View> to be the only one of the three conditions to fail. That would get us to exactly the same place as the requirements for the const version of begin(), just called on a non-const drop_view. So, should there be an optimization here, too?

@casavaca
Copy link
Contributor Author

Upon further investigation, I found that the library correctly implement O(1) behavior. The call sequence is

ranges::next --> ranges::advance (with sentinel) --> ranges::advance (without sentinel) --> random_access_iterator constexpr branch

So, the original behavior is correct, and I'm closing this pull request.

@casavaca casavaca closed this Nov 21, 2023
@casavaca
Copy link
Contributor Author

Oops, I didn't pay attention to non-common range...

@casavaca casavaca reopened this Nov 21, 2023
@hawkinsw
Copy link
Contributor

Oops, I didn't pay attention to non-common range...

Which would get us to a position where

  1. ranges::end(__base_) is of a different type than ranges::begin(__base_)
  2. therefore, ranges::end(__base_) might not model random_access_iterator
  3. therefore, it would not meet the constant-time-complexity requirements for std::ranges::next (see (4) at https://en.cppreference.com/w/cpp/iterator/ranges/next).

I just wanted to double check the reasoning. Thank you for doing this work!
Will

@casavaca
Copy link
Contributor Author

OK. Hopefully I did it right this time. First time contributor here.

  • git clang-format
  • 2 changes at drop_view.h are covered by 2 newly added test cases respectively.

@@ -90,6 +90,10 @@ namespace ranges {
requires (!(__simple_view<_View> &&
random_access_range<const _View> && sized_range<const _View>))
{
if constexpr (random_access_range<const _View> && sized_range<const _View>) {
auto __dist = ranges::distance(__base_);
return ranges::begin(__base_) + std::min<range_difference_t<_View>>(__count_, __dist);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, please double check if this file has undef the min/max macro on the top

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not found anything similar. Besides, std::min is also used in other places of this file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes there is already push/pop macros in this file

static_assert(std::ranges::sized_range<const NonCommonView>);
std::ranges::drop_view dropView9(NonCommonView{{}, 0, &sentinel_cmp_calls}, 4);
assert(dropView9.begin() == globalBuff + 4);
assert(sentinel_cmp_calls == 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests under libcxx/std are used by other implementations too (e.g. MSVC STL). It is not required by the spec that it has to be called zero times. I would instead relax this check a bit. For example , calling begin in the loop and check that we is not grow linearly ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I can relax the test by comparing if sentinel_cmp_calls is a constant, but I have a question:
Since this is not required by the standard, how should I add test like this? I mean, should we use
something like if defined(LIBCXX_ONLY_TESTS)? or other other way? @huixie90

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this one i think it would be good to keep it once you relaxed it as the constant time begin is required by the standard.

In general if you have Libc++ specific test, you can put it under test/libcxx

@casavaca casavaca requested a review from huixie90 November 21, 2023 23:21
Copy link
Contributor

@hewillk hewillk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think explicitly specifying the type improves readability, ranges::distance already implies signed values.

@@ -90,6 +90,10 @@ namespace ranges {
requires (!(__simple_view<_View> &&
random_access_range<const _View> && sized_range<const _View>))
{
if constexpr (random_access_range<_View> && sized_range<_View>) {
range_difference_t<_View> __dist = ranges::distance(__base_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
range_difference_t<_View> __dist = ranges::distance(__base_);
const auto __dist = ranges::distance(__base_);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was like why don't you use ranges::size here and that is exactly what you want to call since that is what is constrained on. then I realised that oh you want to call std::min on two signed types so you used ranges::distance just for converting the types. maybe
range_difference_t<_View> __dist = ranges::size(__base_);
is the actual intent of the code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe
range_difference_t<_View> __dist = ranges::size(__base_);
is the actual intent of the code?

The standard does not guarantee that the above implicit conversion is well-formed. ranges::distance is always preferred over ranges::size when deal with iterators.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe
range_difference_t<_View> __dist = ranges::size(__base_);
is the actual intent of the code?

The standard does not guarantee that the above implicit conversion is well-formed. ranges::distance is always preferred over ranges::size when deal with iterators.

Only narrowing conversion can be explicit-only. IIUC if size_t -> difeerence_t is narrowing conversion, I think we have other problems. The problem of ranges::distance is that it is not for the purpose of calculating the size in constant time. It is a utility tool to calculate the size in case you cannot caluculate it in constant time, which is the wrong tool here imo. And I strongly against using ‘auto’ when dealing with difference type for the exact reason we are discussing: narrowing conversion.
If the spec does not say size -> difference_t cannot be narrowing, I’d prefer to use either parenthesis or static_cast to do the comversion

Copy link
Contributor

@hewillk hewillk Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only narrowing conversion can be explicit-only. IIUC if size_t -> difeerence_t is narrowing conversion, I think we have other problems.

The standard does not require the signedness of the return value of ranges::size. The only guarantee is that it is an integer-like type. So in theory, the types returned by ranges::size and ranges::distance can be unrelated and only be explicitly converted.

The problem of ranges::distance is that it is not for the purpose of calculating the size in constant time.

This is why the sized_range needs to be satisfied, isn't it? in which case ranges::distance is guaranteed to be constant-time by its definition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only narrowing conversion can be explicit-only. IIUC if size_t -> difeerence_t is narrowing conversion, I think we have other problems.

The standard does not require the signedness of the return value of ranges::size. The only guarantee is that it is an integer-like type. So in theory, the types returned by ranges::size and ranges::distance can be unrelated and only be explicitly converted.

The problem of ranges::distance is that it is not for the purpose of calculating the size in constant time.

This is why the sized_range needs to be satisfied, isn't it? in which case ranges::distance is guaranteed to be constant-time by its definition.

If you have a helper

Void helper() {
if (c)
F();
else
G();
}

When you definitely don’t want to call G(), why do you write

if(c)
helper()

Rather than call f() directly ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only narrowing conversion can be explicit-only. IIUC if size_t -> difeerence_t is narrowing conversion, I think we have other problems.

The standard does not require the signedness of the return value of ranges::size. The only guarantee is that it is an integer-like type. So in theory, the types returned by ranges::size and ranges::distance can be unrelated and only be explicitly converted.

The problem of ranges::distance is that it is not for the purpose of calculating the size in constant time.

This is why the sized_range needs to be satisfied, isn't it? in which case ranges::distance is guaranteed to be constant-time by its definition.

Anyway I think our discussion about coding style should not block this patch to be landed

Copy link
Contributor

@hewillk hewillk Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a helper

Void helper() { if (c) F(); else G(); }

When you definitely don’t want to call G(), why do you write

if(c) helper()

Rather than call f() directly ?

This example is not quite accurate because the return type of ranges::distance is what we want, which reduces the type conversion since the former has already been done for us. Using ranges::size requires an additional static_cast.

I believe the standard prefers using rangs::begin(r) + rangs::distance(r) rather than rangs::begin(r) + static_cast<range_difference_t<R>>(rangs::size(r)), given the former's heavy presence in the <ranges>'s wording. (which may be the reason why my first resolution in LWG 3717 was not accepted))

Copy link
Member

@huixie90 huixie90 Nov 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with both. But I think the spec is describing the behaviour and not necessarily the way one should code. e.g this particular function here the spec is simply describing what the result should be, “Returns” , and one should not use the exact wording for implementation . (I saw you create an LWG issue for this but lots of us think it is not a defect because the spec does not say “equivalent to”. But I agree this leads to buggy implementations )

@@ -104,7 +108,8 @@ namespace ranges {
constexpr auto begin() const
requires random_access_range<const _View> && sized_range<const _View>
{
return ranges::next(ranges::begin(__base_), __count_, ranges::end(__base_));
range_difference_t<_View> __dist = ranges::distance(__base_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
range_difference_t<_View> __dist = ranges::distance(__base_);
const auto __dist = ranges::distance(__base_);

@casavaca
Copy link
Contributor Author

@huixie90 Is there anything I need to do? do you want me to change range_difference_t<_View> __dist to const auto __dist?

@casavaca
Copy link
Contributor Author

casavaca commented Dec 7, 2023

@huixie90 any update on this?

Copy link
Member

@huixie90 huixie90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with green CI

@@ -104,7 +108,8 @@ namespace ranges {
constexpr auto begin() const
requires random_access_range<const _View> && sized_range<const _View>
{
return ranges::next(ranges::begin(__base_), __count_, ranges::end(__base_));
range_difference_t<_View> __dist = ranges::distance(__base_);
return ranges::begin(__base_) + std::min(__count_, __dist);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit:
the type of (__count_) is range_difference_t<_View> const& and the type of (__dist) is range_difference_t<const _View> & and I am not sure range_difference_t<_View> and range_difference_t<const _View> have to be the same type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own edification, I attempted to answer this question ... it looks like the type is stripped of const/&?

template <range _Rp>
using range_difference_t = iter_difference_t<iterator_t<_Rp>>;

// ...

template <class _Ip>
using iter_difference_t = typename conditional_t<__is_primary_template<iterator_traits<remove_cvref_t<_Ip> > >::value,
                                                 incrementable_traits<remove_cvref_t<_Ip> >,
                                                 iterator_traits<remove_cvref_t<_Ip> > >::difference_type;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own edification, I attempted to answer this question ... it looks like the type is stripped of const/&?

template <range _Rp>
using range_difference_t = iter_difference_t<iterator_t<_Rp>>;

// ...

template <class _Ip>
using iter_difference_t = typename conditional_t<__is_primary_template<iterator_traits<remove_cvref_t<_Ip> > >::value,
                                                 incrementable_traits<remove_cvref_t<_Ip> >,
                                                 iterator_traits<remove_cvref_t<_Ip> > >::difference_type;

I think this is unrelated. off course these traits needs to strip off cvref, but iterator_t<_Rp> and iterator_t<const _Rp> can well be two different types. (although one would argue that it is evil to let them have two different difference_type)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I missed the nested iterator_t inside the definition of the range_difference_t.

And, you're right, that would be evil!

@casavaca
Copy link
Contributor Author

Rebased on llvm/main. Also made stylish chanege to use const auto, similar to microsoft STL in
https://github.com/microsoft/STL/blob/cf1313c39169dc376761eddee23c5e408e01aaa9/stl/inc/ranges#L3359

Let's see if CI is happy.

@ldionne ldionne changed the title [libc++] optimization on ranges::drop_view::begin (#72883) [libc++] Make drop_view::begin constant time (#72883) Dec 18, 2023
@huixie90
Copy link
Member

Hi, all CI passed now. could you please rebase and resolve the conflicts? It should be just a formatting change that had the conflicts. Thank you for your contribution

The optimization is specifically for non common range,
as pointed out from the github issue:
> ... the implementation only needs to return the value of ranges::next
  and does not need to obtain the value through ranges::advance, which will have
  O(n) complexity in the case of random-access-sized but non-common range.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

<ranges>: drop_view::begin const has O(n) complexity
6 participants