Skip to content

Fix non-rvalue Json::Value assignment operator (should copy, not move) #640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 1, 2017

Conversation

cfyzium
Copy link

@cfyzium cfyzium commented Jul 31, 2017

Pull request #635 has changed the behavior of non-rvalue assignment operator from copying the 'other' value to swapping with it. There is a lot of wrong with it.

This pull request reverts to the earlier assignment operator, with 'other' passed by value rather than const&. Now that Json::Value has a move constructor, such assignment operator covers both move and non-move cases.

@ghost
Copy link

ghost commented Jul 31, 2017

@cfyzium #635 still performed swap for copy assignment constructor. But a proper move operation for move assignment constructor.

From what I understand you are reverting the behavior of the move assignment operator from move constructing to swap.

I am not trying to dispute this change but just enquiring to clarify the change.

IMO the swap on copy construction was wrong but I am not the library maintainer and it's their call. Though I don't think move assignment should be subjected to the same behavior of copy assignment as it's a new addition. But as I said that's my opinion.

@cfyzium
Copy link
Author

cfyzium commented Jul 31, 2017

@dark-passenger no, #635 performed a swap on the argument itself (after casting it from const Json::Value& to Json::Value&), not a swap on a copy. This essentially changed copy assignment to move assignment. Besides, stripping away const from an argument is wrong by itself since nobody expects a method to modify a const& argument.

The original assignment operator (swapping the argument passed by value) now works in both cases. In case of non-move assignment operator, it will make a deep copy to construct the argument and then swap with it, as intended. In case of move assignment, the rvalue will be moved into the argument via the move constructor and then swapped with the object. AFAIK it is a pretty common idiom.

@ghost
Copy link

ghost commented Jul 31, 2017

Oh my god. I am such an idiot. 😅 I really didn't notice that blunder on my part. Though one point still remains the cost of a copy still doesn't sit well with me. The move assignment is okay (without actually using a swap since that's actually 3 moves). Rather a new function to move and using a copy for copy assignment.

I realize that a single function is very common but I think seperate implementation could be beneficial. Of course this is all theoretical based of Scott Meyers book. :D

@cfyzium
Copy link
Author

cfyzium commented Jul 31, 2017

Ugh. I've looked into separate move-assignment vs generic copy/move-and-swap and it looks like you've also introduced a memory leak there. The move-assignment from #635 looks like this:

Value& Value::operator=(Value&& other) {
  initBasic(nullValue);
  swap(other);
  return *this;
}

However, initBasic initializes (in this case, clears) some fields. Specifically, it clears the comments_ which is an unmanaged pointer to an array of CommentInfo. If the object has some comments this operator will wipe the pointer and leak memory, right?

Same thing with the added copy method void Value::copy(const Value& other), the field is not properly copied and may leak memory. Though this method is not used anywhere in the library, what was it introduced for?

As for the cost of generic copy/move-and-swap idiom... Well, it is true it first swapped during the move-construction and then again swapped manually during the assignment. But compilers nowadays are very good at optimizing such stuff away. The final assembly of the generic assignment operator is one measly mov longer than the assembly of a single swap function.

@cdunn2001 cdunn2001 merged commit 7354da8 into open-source-parsers:master Aug 1, 2017
@tomalakgeretkal
Copy link

tomalakgeretkal commented Oct 14, 2017

Additional note on the above: the const_cast was very dangerous and made any assignment to Json::Value potentially have undefined behaviour (since the subsequent swap assuredly modifies the RHS, and the RHS may not be mutable). You should never cast away constness in C++, or, if you do, you must explain thoroughly in comments alongside the code why it was necessary. Odds are good that, while writing said comments, you will realise that it wasn't. :)

@cdunn2001
Copy link
Contributor

I'm a bit confused by github now. I got this link in my "notifications" inbox:

But why? That is neither a comment nor a PR.

Anyway, I see what @tomalakgeretkal is saying. The old code was:

Value& Value::operator=(Value other) {
   swap(other);
   return *this;
 }

The new is:

Value& Value::operator=(const Value& other) {
  swap(const_cast<Value&>(other));
  return *this;
}

That's clearly wrong. Do we need to revert this whole PR?

@tomalakgeretkal
Copy link

Sorry about the notification; I did originally write a comment on the offending line of code, but decided to remove it (as I don't generally use GitHub and thus don't really grok it) and write my comment above instead!

@tomalakgeretkal
Copy link

Anyway I believe that you guys have already fixed the problem by reverting the offending code.

@cdunn2001
Copy link
Contributor

@thelema, if the argument is an rvalue, then Value::operator=(Value other) will first perform a move-construction implicitly. I think we're fine. (See "Overloading functions" here).

Btw, thanks for looking into these issues.

@BillyGoto
Copy link

BillyGoto commented Jan 1, 2018 via email

@BillyGoto
Copy link

BillyGoto commented Jan 1, 2018 via email

@BillyGoto
Copy link

BillyGoto commented Jan 1, 2018 via email

@tomalakgeretkal
Copy link

Actually to be fair that's a false statement because the standard library had to maintain backward compatibility to a degree. Additionally, it's not a move assignment operator if it takes by-value and not as a universal or rvalue ref. I should probably read the conversation I'm commenting on before commenting on it!

Moving on from that, what problem do you actually perceive here?

@BillyDonahue
Copy link
Contributor

As Jsoncpp has an ABI compatibility concern, it should present the full overload set we intend to support. In the (likely IMO) event that we discover an advantage of true move operations over copy-assignment, we should not have to change the ABI to separate operator=(Value) into
the operator=(Value&&) and operator=(const Value&) overloads again.

There's also an efficiency concern in that a moved-from object can retain some capacity which can be reused to make subsequent value changes faster. After d = std::move(s);, s could retain the allocation previously belonging to d, but if we implement with copy and swap, the pre-assignment d allocation is given to the temporary and destroyed at close of scope.

That's to say nothing of the actual cost of doing the move ops, and we're performing more of those than we need to with this patch. Even if they're cheap, it's a frivolous waste.

@tomalakgeretkal
Copy link

@BillyDonahue I would actually generally agree with your first paragraph.

Your second paragraph doesn't seem to follow, though, as we would (potentially) be talking about a move construction, not a copy construction followed by a swap.

Your third I still can't agree with; this is 2017, and your compiler knows full well what to do with one extra pointer or integer swap when one will do. It's not a realistic concern. We should focus on the usability and ABI concerns instead. Fortunately we seem to have arrived at the conclusion that the concerns are not in conflict. ;)

@tomalakgeretkal
Copy link

Or, 2018, whatever 🥇

@BillyDonahue
Copy link
Contributor

I'm glad to have found a compromise on the ABI issue, but I would want to better understand the objection to my second paragraph. I didn't understand your point as phrased. I'm saying that an allocation is a reusable resource that can be retained after a pure move assignment operator=(Value&&) but cannot be retained after a copy-and-swap assignment operator=(Value). That should be clear, right?

The cost of the extra move might be "one extra pointer or integer swap", or it could be something more costly, depending on the details of the members of the type being moved. For example, if the type holds member pointers that point into neighboring data members, those identity-based pointers would have to be readjusted with each move. We shouldn't hardcode the assumption that moves optimize to zero-cost into the API when we don't have to. It can limit our options when refactoring the internals.

@tomalakgeretkal
Copy link

I suppose I can agree with the principle of that, as long as we bear in mind that the entire purpose of move operations is to have negligible cost, or at least a cost substantially less than the alternative copy operation. If a type's move operation is costly enough that this is worth worrying about more so than the design of the ABI or the common sense of usability, I'd say the type has been implemented wrong.

As for the other question, I still don't really understand what is being claimed. If we're talking about passing an rvalue into the operator=(Value) (and we are) vs passing one into operator=(Value&&) you still get a chain of moves (however those are implemented). Why should the semantics or the mechanics be any different? And even if you std::move something into the operator=(Value), that initial implicit move construction can perform the selfsame resource swap that you're talking about. That's the entire purpose of move semantics. So I don't currently see where your reusable resource is lost or wasted. Perhaps a code example will make it clear. I wouldn't spend too much time on it though as again I believe we're all in agreement (for various reasons) about what should actually be done here.

@BillyDonahue
Copy link
Contributor

BillyDonahue commented Jan 1, 2018

A class with a single operator=(T) isn't easier or harder to use than a class with operator=(T&) and operator=(T&&), it's just sometimes easier to write.

My claim regards the state of the source object after the move assignment is completed.
Check out this gist:

After the assignment, my 'x' is left with no capacity in the ByValue case.
In the ByRef case, it retains a buffer size of 10000 and can be reassigned to take advantage of that allocation. I don't think this can be overcome. Whatever's in the temporary when it goes out of scope is lost and cannot be exported back to the caller's argument.

@tomalakgeretkal
Copy link

tomalakgeretkal commented Jan 2, 2018

That's because your AssignByValue type is broken. It moves in one way for assignment and another way for construction. If your semantics were consistent then the capacity would remain in both cases. You conveniently skipped the inconsistent (defaulted) function in the AssignByRef case (by virtue of the constructor of course not being necessary), making it in no way a fair test. Conclusion: not an argument for or against passing by value, but against confusing and broken implementations. Which is what I said before :)

Going further, your example is actually not as clear cut as it seems anyway, since the state of the member string is "valid but unspecified". Obviously you should not rely on the type's move semantics being so weak that a copy of the resource was actually made (which is the only way keeping capacity on both LHS and RHS can possibly work; an example is where the move had to fall back on a copy due to not having any actual moveable resources, such as a string using SSO).

And I still don't understand what you're claiming with the temporary going out of scope. If you're passing a temporary in, it's gone no matter how you perform the move. If you're not passing a temporary in, then you're not passing a temporary in. You seem to be comparing apples to oranges in a few places.

Regardless I suggest we get back to the topic at hand... i.e. JsonCpp design.

@BillyGoto
Copy link

BillyGoto commented Jan 2, 2018 via email

@BillyDonahue
Copy link
Contributor

The copy operations were omitted because they're boring defaults, and of course not to misdirect or bias the experiment. They are:

    AssignByRef(const AssignByRef& src) : s(src.s) {}
    AssignByRef(AssignByRef&& src) : s(std::move(src.s)) {}

    AssignByValue(const AssignByValue& src) : s(src.s) {}
    AssignByValue(AssignByValue&& src) : s(std::move(src.s)) {}

Adding these definitions doesn't change the outcome of the gist.

Going further, your example is actually not as clear cut as it seems anyway, since the state of the member string is "valid but unspecified". Obviously you should not rely on the type's move semantics being so weak that a copy of the resource was actually made (which is the only way keeping capacity on both LHS and RHS can possibly work; an example is where the move had to fall back on a copy due to not having any actual moveable resources, such as a string using SSO).

Did you mean "counter-example" here? We both know an SSO string move might just be a copy. Your strongly worded assertion about the only way to keep capacity is false. Metadata like capacity and allocations are not part of a value. They can survive in an object whose value is assigned to or moved from. The "valid but unspecified" semantics on a moved-from value usually allow at least for reassignment. I tried to find a typical example with my Gist's structs holding std::string. I think they're similar to the kind of allocation handles a Json::Value could hold.

I still would like to know how my AssignByRef or AssignByValue structs were "broken" and what a better capacity-preserving formulation would have been. You say there's a way to do it better. What is that way?

@cdunn2001
Copy link
Contributor

@BillyDonahue, your gist is a very clear example. I've been away a couple weeks, not ignoring the discussion. I don't think you two are arguing over how to code that example, but rather over whether it's reasonable to expect the "capacity" of a string to remain unchanged, as it's a low-level optimization, a "hint".

So what's the recommendation now? Provide an extra operator? Could someone submit a PR? I'll try to catch up on the Pull-Requests here next week...

@cdunn2001
Copy link
Contributor

As Jsoncpp has an ABI compatibility concern ...

@BillyDonahue, I hate to go off-topic, but this is something I've been thinking about. Lots of people (including you, I think) prefer the convenience of Reader/Writer over the configurability of CharReader/StreamWriter. Maybe we should update the old "deprecated" Reader/Writer to be implemented in terms of the new stuff. That would break ABI compatibility, but I think it might be time for a major version bump. Besides, it seems that people today prefer to rely on the soversion number, which we regularly bump anyway. (See soversion in meson.build. And Meson docs.) In other words, since we always bump "soversion" on any ABI change, and since we do not even bother to "version" to Meson when we build the JsonCpp library, I'm not sure we have to worry about ABI compatibility at all anymore. ("version" is now used only when Meson checks library dependencies.) At any rate, I'm fine with a major bump. Opinion?

@BillyGoto
Copy link

BillyGoto commented Jan 16, 2018 via email

@BillyGoto
Copy link

BillyGoto commented Jan 16, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants