Skip to content

ColumnString and ColumnFixedString performance fix #29

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Feb 4, 2020

Conversation

Enmk
Copy link
Contributor

@Enmk Enmk commented Jan 22, 2020

Improved performance of the the ColumnFixedString and ColumnString loading (reading values from binary stream, typically done on reading result values from SELECT) and saving (writing values to binary stream, typically done on writing input values with INSERT).

This was achieved by:

  • reducing number of allocations (utilizing large block of memory to store data) and not pre-filling target memory (see std::string::resize) with zeroes for ColumnString.
  • Streamlining serialization/de-serialization for ColumnFixedString.

Updated API of the ColumnFixedString/ColumnString to return\take std::string_view instead of std::string to allow those optimizations.

Final results, exuding any networking or any other I/O effects as much as possible, by encoding/decoding in-memory buffers corresponding to 1 million items of each kind.

FixedString(8) OLD time NEW time DIFF
Saving: 12023.4 1645.3 -86.32%
Loading: 11508.1 1389.3 -87.93%
String OLD time NEW time DIFF
Saving: 23925.7 23008.1 -3.84%
Loading: 20335.9 14474.1 -28.82%

No client-server protocol modifications were made, since the simplest one of prepending column with it's binary size on the stream lead only to about 1-2% improvement of load time, but was too invasive.

However, in order to achieve the best performance of reading strings from server, we may want to change the serialization format for strings significantly by writing first the stream of lengths, and then continuous stream of data. That would reduce operations required from approx:

(L + S + z) * N (this pr)
to
(L + z) * N + X (two continuous streams: lengths and data)
and even to
Y + X (if lengths are serialized as 64-bit values, but not as VarInts)

where:

  • N is column size
  • X is time required to copy string data with memcpy (entire column)
  • Y is time to copy lengths data with memcpy (entire column)
  • L is time to decode UInt64_t from VarInt binary representation
  • S is time to read single string value (boils down to cost of memcpy + dust)
  • z is loop iteration overhead + other minor factors

@Enmk Enmk changed the title ColumnString and ColumnFixedString performance fix WIP: ColumnString and ColumnFixedString performance fix Jan 22, 2020
@Enmk Enmk changed the title WIP: ColumnString and ColumnFixedString performance fix ColumnString and ColumnFixedString performance fix Jan 28, 2020
@Enmk Enmk requested a review from alexey-milovidov January 29, 2020 00:30
Copy link
Contributor

@traceon traceon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of questions are pending.

}

return result;
return std::move(result);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this std::move is right thing to do here? Wouldn't it prevent the "natural" copy elision? Take a look at this: https://stackoverflow.com/questions/17473753/c11-return-value-optimization-or-move

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, let me check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good find, thank you!

}

data_.push_back(std::move(s));
data_.resize(string_size_ * rows);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code suggests appending to data_ (i.e., as a reader, I'll assume that there already could be some data in data_), the new code writes from the beginning as if there is no data in data_ ever, at this point. Pointing this out to just get a confirmation, that this new behavior is correct conceptually.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, resizing a string assumes implicit zero-filling, which could become significant, but I am not sure there is a quick and easy way to avoid it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately there is no easy way (not even hard way) to avoid pre-filling std::string, and that is why I use Block for ColumnString below.

As for assumptions: somehow those assumptions are opposite for rest of column types, which erase previous data. So I had to unify behaviors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but here, that zero-filling still takes place.

If not now, maybe in future, this could be used. Looks like the header itself is standalone:
https://github.com/facebook/folly/blob/master/folly/memory/UninitializedMemoryHacks.h

Copy link
Contributor Author

@Enmk Enmk Feb 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is not a big problem for FixedString, but yes, we might want to use Block here too/ That facebook hack...

}

void ColumnString::Append(const std::string& str) {
data_.push_back(str);
ColumnString::~ColumnString()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an unused leftover from initial edits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, just a trick to explicitly instantiate d-tor (and hence d-tor of std::unique_ptr<Block[]>) in the context where Block is defined. Otherwise said unique_ptr fails to compile.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, strange. Doesn't ~ColumnString() = default help?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC that does not prescribe where the d-tor is instantiated, which might be anywhere, like in every other translation unit.


size_t size;
const size_t capacity;
std::unique_ptr<CharT[]> data_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not std::string? With its capacity and size management.

Copy link
Contributor Author

@Enmk Enmk Feb 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish I could use std::string here, the problem is that there is no way of pre-allocating buffer without pre-filling it with data.

Copy link
Contributor

@traceon traceon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questions answered.

@Enmk Enmk merged commit 7c67c6d into ClickHouse:master Feb 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants