-
Notifications
You must be signed in to change notification settings - Fork 175
ColumnString and ColumnFixedString performance fix #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of questions are pending.
clickhouse/columns/string.cpp
Outdated
} | ||
|
||
return result; | ||
return std::move(result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure this std::move
is right thing to do here? Wouldn't it prevent the "natural" copy elision? Take a look at this: https://stackoverflow.com/questions/17473753/c11-return-value-optimization-or-move
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, let me check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good find, thank you!
} | ||
|
||
data_.push_back(std::move(s)); | ||
data_.resize(string_size_ * rows); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code suggests appending to data_
(i.e., as a reader, I'll assume that there already could be some data in data_
), the new code writes from the beginning as if there is no data in data_
ever, at this point. Pointing this out to just get a confirmation, that this new behavior is correct conceptually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, resizing a string assumes implicit zero-filling, which could become significant, but I am not sure there is a quick and easy way to avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately there is no easy way (not even hard way) to avoid pre-filling std::string, and that is why I use Block
for ColumnString
below.
As for assumptions: somehow those assumptions are opposite for rest of column types, which erase previous data. So I had to unify behaviors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but here, that zero-filling still takes place.
If not now, maybe in future, this could be used. Looks like the header itself is standalone:
https://github.com/facebook/folly/blob/master/folly/memory/UninitializedMemoryHacks.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is not a big problem for FixedString
, but yes, we might want to use Block here too/ That facebook hack...
} | ||
|
||
void ColumnString::Append(const std::string& str) { | ||
data_.push_back(str); | ||
ColumnString::~ColumnString() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like an unused leftover from initial edits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, just a trick to explicitly instantiate d-tor (and hence d-tor of std::unique_ptr<Block[]>
) in the context where Block is defined. Otherwise said unique_ptr
fails to compile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, strange. Doesn't ~ColumnString() = default
help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC that does not prescribe where the d-tor is instantiated, which might be anywhere, like in every other translation unit.
|
||
size_t size; | ||
const size_t capacity; | ||
std::unique_ptr<CharT[]> data_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not std::string
? With its capacity and size management.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish I could use std::string here, the problem is that there is no way of pre-allocating buffer without pre-filling it with data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Questions answered.
Improved performance of the the
ColumnFixedString
andColumnString
loading (reading values from binary stream, typically done on reading result values fromSELECT
) and saving (writing values to binary stream, typically done on writing input values withINSERT
).This was achieved by:
std::string::resize
) with zeroes forColumnString
.ColumnFixedString
.Updated API of the
ColumnFixedString
/ColumnString
to return\takestd::string_view
instead ofstd::string
to allow those optimizations.Final results, exuding any networking or any other I/O effects as much as possible, by encoding/decoding in-memory buffers corresponding to 1 million items of each kind.
No client-server protocol modifications were made, since the simplest one of prepending column with it's binary size on the stream lead only to about 1-2% improvement of load time, but was too invasive.
However, in order to achieve the best performance of reading strings from server, we may want to change the serialization format for strings significantly by writing first the stream of lengths, and then continuous stream of data. That would reduce operations required from approx:
(L + S + z) * N
(this pr)to
(L + z) * N + X
(two continuous streams: lengths and data)and even to
Y + X
(if lengths are serialized as 64-bit values, but not asVarInt
s)where:
N
is column sizeX
is time required to copy string data withmemcpy
(entire column)Y
is time to copy lengths data withmemcpy
(entire column)L
is time to decodeUInt64_t
from VarInt binary representationS
is time to read single string value (boils down to cost ofmemcpy
+ dust)z
is loop iteration overhead + other minor factors