-
Notifications
You must be signed in to change notification settings - Fork 13.5k
CharIterator changes for std::str #8590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This looks pretty good to me, I don't totally agree with It took me a second to figure out why you split the iterators into two structs, and it's kinda unfortunate that we have to. I suppose that Could you also add some comments how the iterators are implemented? These would just be for the implementation details that most people wouldn't need to see, but it'd be nice to easily read that the Other than that, nice job and nice speed improvements! |
Thanks for the comments! We could leave I didn't really mind another struct and Iterator implementation, Do you think the right thing to do is don't solve clonability now, but wait for later? |
I agree that you've implemented the two structures well, and I'd be more in favor of what you've done than having another map iterator defined separately for I think it may also be a good idea to rename |
Implement char_range_at_reverse similarly to char_range_at, instead of re-using that method.
Add a function like raw::slice_bytes, but it doesn't check slice boundaries. For iterator use where we always know the begin, end indices are in range.
Let CharIterator be a separate type from CharOffsetIterator (so that CharIterator can be cloned, for example). Implement CharOffsetIterator by using the same technique as the method subslice_offset.
Removed the word_iter rename |
I added a cleanup of CharSplitIterator that's quite funky |
Embed an iterator in the CharSplitIterator struct, and combine that with the former bool `only_ascii`; so use an enum instead.
As one final thing, could you write a test which exercises the clone implementation of |
Added test for clone |
Implement CharIterator as a separate struct, so that it can be .clone()'d. Fix `.char_range_at_reverse` so that it performs better, closer to the forwards version. This makes the reverse iterators and users like `.rfind()` perform better. Before test str::bench::char_iterator ... bench: 146 ns/iter (+/- 0) test str::bench::char_iterator_ascii ... bench: 397 ns/iter (+/- 49) test str::bench::char_iterator_rev ... bench: 576 ns/iter (+/- 8) test str::bench::char_offset_iterator ... bench: 128 ns/iter (+/- 2) test str::bench::char_offset_iterator_rev ... bench: 425 ns/iter (+/- 59) After test str::bench::char_iterator ... bench: 130 ns/iter (+/- 1) test str::bench::char_iterator_ascii ... bench: 307 ns/iter (+/- 5) test str::bench::char_iterator_rev ... bench: 185 ns/iter (+/- 8) test str::bench::char_offset_iterator ... bench: 131 ns/iter (+/- 13) test str::bench::char_offset_iterator_rev ... bench: 183 ns/iter (+/- 2) To be able to use a string slice to represent the CharIterator, a function `slice_unchecked` is added, that does the same as `slice_bytes` but without any boundary checks. It would be possible to implement CharIterator with pointer arithmetic to make it *much more efficient*, but since vec iterator is still improving, it's too early to attempt to re-implement it in other places. Hopefully CharIterator can be implemented on top of vec iterator without any unsafe code later. Additional changes fix the documentation about null termination.
Implement CharIterator as a separate struct, so that it can be .clone()'d. Fix
.char_range_at_reverse
so that it performs better, closer to the forwards version. This makes the reverse iterators and users like.rfind()
perform better.To be able to use a string slice to represent the CharIterator, a function
slice_unchecked
is added, that does the same asslice_bytes
but without any boundary checks.It would be possible to implement CharIterator with pointer arithmetic to make it much more efficient, but since vec iterator is still improving, it's too early to attempt to re-implement it in other places. Hopefully CharIterator can be implemented on top of vec iterator without any unsafe code later.
Additional changes fix the documentation about null termination.