Skip to content

Port FSTLevelDBKey to C++ #937

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 22, 2018
Merged

Port FSTLevelDBKey to C++ #937

merged 9 commits into from
Mar 22, 2018

Conversation

wilhuff
Copy link
Contributor

@wilhuff wilhuff commented Mar 16, 2018

This is just the main class and tests but none of the usages. I'll migrate those separately.

wilhuff added 6 commits March 16, 2018 16:49
These operations universally operate on keys obtained from leveldb so
it's actually unhelpful to force all the callers to make
absl::string_views from them.
@wilhuff wilhuff force-pushed the wilhuff/leveldb-key branch from 43c41d1 to 16a9bc7 Compare March 18, 2018 00:27
Copy link
Contributor

@var-const var-const left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor nits.

* Creates a key prefix that points just before the first key for the given
* user_id.
*/
static std::string KeyPrefix(const absl::string_view user_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: remove const from declarations (I think applies to all usage of absl::string_view as a parameter in this file). (I would guess these were const references at some point)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. They weren't const references before, but this was a style that Zhi was applying that I carried over.

* Decodes the given complete key, storing the decoded values as properties of
* the receiver.
*
* @return true if the key successfully decoded, NO otherwise. If NO is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "NO" come from an Objective-C comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry: BOOL, YES, and NO are all the Objective-C spellings of bool, true, and false. These files were copied from Objective-C and then massaged until they worked.

Fixed.

*
* @return true if the key successfully decoded, NO otherwise. If NO is
* returned, the properties of the receiver are in an undefined state until
* the next call to -decode_key:.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"-decode_key:" looks a little weird. There are several similar "-foo:" names in comments in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

static std::string KeyPrefix();

/**
* Creates a complete key that points to a specific document. The document_key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: there is mismatch between "document_key" here and the name of the argument which is just key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. This should have been document_key. This was the only instance of this I could find.

* Key()).
*
* @return true if the key successfully decoded, false otherwise. If false is
* returned, the properties of the receiver are in an undefined state until
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: is "receiver" a carryover from Objective-C?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Fixed.

* BatchId is a locally assigned ID for a batch of mutations that have been
* applied.
*/
typedef int32_t BatchId;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: consider using using instead of typedef (style guide recommends using for new code).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -48,6 +48,11 @@ std::string ImmediateSuccessor(absl::string_view s) {
return out;
}

bool StartsWith(const std::string &value, const std::string &prefix) {
return prefix.size() <= value.size() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use absl::StartsWith.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

* natively has concepts of tables and indexes. Where there's overlap, a comment
* denotes the server value from the storage_format_internal.proto.
*/
enum ComponentLabel {
Copy link
Contributor

@var-const var-const Mar 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: ToTW 86 recommends enum class for new code. If you feel it's more verbose for little gain, I'm fine with plain enum as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I explicitly chose an old-school enum specifically because it's weakly typed and implicitly converts to an integer. I don't see much value in forcing static_cast in every caller.

bool LevelDbMutationKey::Decode(leveldb::Slice key) {
user_id_.clear();

return ReadTableNameMatching(&key, kMutationsTable) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish some of this boilerplate could be reduced, but don't really have a good suggestion. The best I can come up with is having a helper method that represents each of the Key classes as a tuple, and then Encode/Decode methods could delegate to a variadic template function like void LevelDbMutationKey::Encode() { EncodeImpl(AsTuple(*this)); } However, it would require too much reworking, and will also increase binary size somewhat.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with a template mechanism each of these function invocations becomes an instantiation of the placeholder in the tuple. At best I think it ends up removing the "Read", the "&key, " and changing the "&&" into "," which isn't really enough savings to justify this kind of reengineering.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gain would be removing the duplication between Decode and Key (the common source for errors in serializing/deserializing is when those get out of sync) - both would delegate the tuple representation to a common helper. But yeah, it still won't outweigh the downsides.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. That's a significant chunk of work and I'm still not entirely sure how I'd make partial keys work there.

// it's vital that partial segments in common not be prefixes.
ASSERT_TRUE(util::StartsWith(foo_user_key, table_key));

// Here even though "" is a prefix of "foo" that prefix is within a segment so
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: commas before "that" and before "so".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@var-const var-const left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

static std::string KeyPrefix(const absl::string_view user_id);

/** Creates a complete key that points to a specific user_id and batch_id. */
static std::string Key(const absl::string_view user_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, a little, but feel free to leave as is. My take is that these FooKey classes are domain objects that are serialized to/from strings. But it's a small enough nitpick to just leave as is.

* returned, the properties of the receiver are in an undefined state until
* the next call to -decode_key:.
*/
bool Decode(leveldb::Slice key);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just that in general, so far it seems that our code base leaves towards immutable classes, so an outlier stands out. Thanks for expanding the comments.


private:
std::string user_id_;
model::BatchId batch_id_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an uninitialized variable always has a potential of becoming an issue in the future in case code is refactored. There would be indeed a double initialization, but is this really so performance-critical that the cost of initializing an int is a concern (string is probably more expensive to initialize anyway)? OTOH, since this is by design in current code, I'm okay if you leave it as is (perhaps add a short comment like "Deliberately uninitialized - will be deserialized").

* A key in the mutation_queues table.
*
* Note that where mutation_queues contains one row about each queue, mutations
* contains the actual mutation batches themselves.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed that, thanks.

static std::string KeyPrefix(const model::ResourcePath& resource_path);

/**
* Decodes the contents of a remote document key into properties on this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that. I like the new version.

* If the read is successful, returns true and contents will be updated to the
* next unread byte.
*/
bool ReadComponentLabelMatching(leveldb::Slice *contents,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are local to this file, I'm fine with leaving it be.

leveldb::Slice complete_segments = *contents;

std::string segment;
std::vector<std::string> path_segments{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just redundant in this case, but feel free to leave them in, it's no big deal.

Default- vs value-initialization matters for primitives and for classes with no default constructor containing primitives:

struct Pod1 { int x; };
Pod1 foo;
assert(foo.x == 0); // Undefined; x is uninitialized
Pod1 bar{};
assert(bar.x == 0); // True

struct Pod2 { int x = 42; };
Pod2 spam;
assert(spam.x == 42); // True
Pod2 eggs{};
assert(eggs.x == 42); // True

But both vector and string are well-behaved classes.

bool LevelDbMutationKey::Decode(leveldb::Slice key) {
user_id_.clear();

return ReadTableNameMatching(&key, kMutationsTable) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gain would be removing the duplication between Decode and Key (the common source for errors in serializing/deserializing is when those get out of sync) - both would delegate the tuple representation to a common helper. But yeah, it still won't outweigh the downsides.

@wilhuff wilhuff force-pushed the wilhuff/leveldb-key branch from 3b40fa7 to e5cffe2 Compare March 22, 2018 01:00
@wilhuff wilhuff merged commit cf630bf into master Mar 22, 2018
minafarid pushed a commit to minafarid/firebase-ios-sdk that referenced this pull request Jun 6, 2018
@paulb777 paulb777 deleted the wilhuff/leveldb-key branch August 8, 2018 17:08
@firebase firebase locked and limited conversation to collaborators Nov 6, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants