Skip to content

CXX-2790 Redeclare bsoncxx::v_noabi as a non-inline namespace #1066

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Dec 12, 2023

Conversation

eramongodb
Copy link
Contributor

@eramongodb eramongodb commented Dec 4, 2023

Description

This PR partially resolves CXX-2790. Verified by this patch.

Although this PR and description is focused only on the bsoncxx library, many points equally apply to the mongocxx library. A followup PR will apply the same refactor to mongocxx.

This PR SHOULD NOT break source or binary compatibility. Verified by this patch (pending ABI compatibility check scripts); specifically, the source compatibility report uploaded by the abi-compliance-check task. Concerning the added/removed bsoncxx::detail symbols, see the section below regarding type traits.

Context

The following distinctions must be defined first:

  • A binary-breaking change requires bumping the ABI version number due to a change or removal of a symbol in the stable ABI.
  • A binary-incompatible feature may be an addition of a symbol to the stable ABI, thus does NOT necessarily require bumping the ABI version number.
  • A source-breaking change requires bumping the API major version number due to a change or removal of an entity in the API.
  • A source-incompatible feature may be an addition of an entity to the API, thus does NOT necessarily require bumping the API major version number.

For example, given the following initial state (assume no other overloads exist):

namespace bsoncxx {
  inline namespace vA {
    std::int32_t foo();
  }
}

The following introduces a potentially source-incompatible, binary-incompatible feature without being a source-breaking or binary-breaking change (it is purely additive):

namespace bsoncxx {
  inline namespace vA {
    std::int32_t foo(); // Old, deprecated.
  }

  namespace vB {
    std::int64_t foo(); // New!
  }
}

Source code that refers to bsoncxx::foo or bsoncxx::vA::foo continue to compile with the new library release. Binaries that expect the symbol bsoncxx::vA::foo continue to link with the new library release. This type of change allows library maintainers to introduce new features without concern for breaking source or binary compatibility.

By contrast, the following is a non-binary-breaking, potentially source-breaking change:

namespace bsoncxx {
  namespace vA {
    std::int32_t foo(); // Old, deprecated.
  }

  inline namespace vB {
    std::int64_t foo(); // New!
  }
}

Source code that refers to bsoncxx::vA::foo continue to compile with the new library release, and binaries that expect the symbol bsoncxx::vA::foo continue to link with the new library release. However, the meaning of source code referring to bsoncxx::foo has changed: it now refers to bsoncxx::vB::foo instead of bsoncxx::vA::foo. This type of change allows users to benefit from binary-incompatible features without requiring binary-breaking changes.

The important takeway from these examples is that potentially source-breaking changes can be introduced as binary-incompatible features without necessarily breaking source or binary compatibility. This allows forward-progress of the library interface and provides users the Quality of Experience (QoE) of being able to benefit from incremental upgrades, while also providing incremental backward and forward compatibility in the case of source-incompatible changes via explicit ABI namespace qualification: a user may opt-into using a potentially source-incompatible feature bsoncxx::vB::foo early, or may opt-out of a potentially source-breaking change by continuing to use bsoncxx::vA::foo instead.

Inline ABI Namespaces

The bsoncxx library was originally designed with the intent of controlling binary compatibility via a BSONCXX_ABI_VERSION variable. Both the SOVERSION and the ABI namespace were defined in terms of this variable, such that when the ABI version number is bumped from A to B, the inline ABI namespace within which all bsoncxx entities are declared is also renamed from vA to vB. This has the effect of migrating the entirety of the bsoncxx library from the vA to the vB namespace alongside the bumped SOVERSION.

This design was considered undesirable as, in practice, it does not easily allow for the type of changes described above. This is due to the fact that an inline namespace cannot be redeclared as non-inline, and a non-inline namespace cannot be redeclared as inline. This means bsoncxx library maintainers are forced into an all-or-nothing approach to binary compatibility when considering potentially source-breaking changes:

namespace bsoncxx {
  // Cannot declare as non-inline without affecting *all* entities within the
  // namespace.
  inline namespace vA {
    std::int32_t foo(); // Cannot selectively "undeclare" this entity in the
                        // root namespace.

    void bar(); // Unrelated entities in the inline namespace are forcibly
                // involved in root namespace declaration considerations.
  }
}

This limitation discourages both backward and forward compatibility in three major ways:

  • Because the current vA::foo cannot be selectively "undeclared" in the root namespace, bsoncxx::foo cannot refer to vB::foo without forcing an ABI version bump. This means users are unable to benefit from the addition of vB::foo unless they explicitly qualify their references with vB, which largely limits the utility of providing root namespace declarations (can only be exercised during binary-breaking releases, which are expected to be very infrequent).
namespace bsoncxx {
  inline namespace vA {
    std::int32_t foo(); // Cannot "undeclare" `vA::foo` in root namespace.
  }

  /* inline */ // error: conflicts with `vA::foo`.
  namespace vB {
    std::int32_t foo();
  }

  /* using vB::foo; */ // error: conflicts with `vA::foo`.
}
  • Because the ABI version bump is fundamentally linked to the redeclaration of ABI namespaces as (not-)inline, binary-incompatible changes are naturally forced into being treated like they are binary-breaking changes. vA::foo cannot be declared within a non-inline vA namespace until the vA -> vB rename is already in-progress. Therefore, preserving binary-compatibility with vA::foo -> implies preserving vA as the inline ABI namespace -> implies preserving the current ABI version number.
namespace bsoncxx {
  inline namespace vB {
    std::int64_t foo(); // Even if possible...
  }

  namespace vA {
    std::int32_t foo(); // ... cannot declare/define this...
  }

  inline namespace vA {
    void bar(); // ... while any instance of an inline `vA` still exists.
  }
}
  • Because the inline ABI namespace is renamed vA to vB in the process of an ABI version bump, all entities in the vA namespace are redeclared in the vB namespace regardless of their intended ABI version compatibility. This inevitably triggers a conflict between the intended vB::foo and the former vA::foo (now incorrectly renamed to vB::foo) during the ABI version number bump. Library maintainers must then go through the laborous effort of identifying and resolving these conflicts at the time of the ABI version bump (this cannot be mitigated beforehand other than via "TODO" documentation). The difficulty of this process is proportional to the number of conflicting vB symbols and the number of vA symbols that must be removed or redeclared (furthermore, these two sets of symbols are not 1-to-1!).
namespace bsoncxx {
  inline namespace vA {
    class X; // Not immediately clear which entities
    class Y; // are intended to be removed during the
    class Z; // transition to vB.
  }

  namespace vB {
    class X; // Not immediately clear which entities
    class Y; // are intended to replace *or* be replaced
    class Z; // by a counterpart in `vA`.
  }
}

These three bullet points overlap under a common theme: binary-incompatible features are a major pain to maintain even if they do not immediately break binary-compatibility.

Using-Declarations

To address the issues with inline namespaces above, this PR follows up on #1034 (CXX-2750) by proposing the use of using-declarations as the method to redeclare ABI namespace entities in the root namespace:

namespace bsoncxx {
  namespace vA {
    std::int32_t foo();
    void bar();
  }

  namespace vB {
    std::int64_t foo();
  }

  using vA::foo; // Consider effect of updating to `vB::foo`.
  using vA::bar;
}

This method is believed to address all of the issues described above:

  • The introduction of binary-incompatible changes in vB has no immediate effect or conflict with existing symbols in vA (it is not all-or-nothing).
  • All symbols associated with a given ABI version vA are always declared in the appropriate namespace and can be maintained independently from symbols in vB (no namespace renaming shenanigans).
  • Root namespace declarations can be independently controlled per-symbol, permitting incremental potentially source-breaking changes without forcing binary-breaking changes (API major/minor version bumps are independent from ABI version bumps).

This method facilitates both forward and backward compatibility, allows maintainers more freedom of design, and makes upgrades more accessible to users. The combination of all the above allows maintainers to selectively and incrementally migrate entities from vA to vB, without breaking source or binary compatibility while minimizing code duplication.

Furthermore, this refactor is NOT a source or binary breaking change. The identity of symbols remains unchanged, as all symbols are is still declared within their original ABI namespace (bsoncxx::v_noabi::X is still bsoncxx::v_noabi::X). All references (or lack thereof) to root namespace declarations, formerly imported via inline namespaces, are now fulfilled by the explicit using-declarations. Even argument-dependent lookup (ADL) should not be affected by this refactor given correct all the relevant using-declarations are implemented.

Changes

The volume of diffs in this PR is large, but the patterns applied are fairly simple in nature.

Replace inline namespace v_noabi with namespace v_noabi

The primary intent of this PR. All instances of inline namespace in the bsoncxx library are refactored to be non-inline. Due to the all-or-nothing nature of inline namespaces, this PR is forced to apply this refactor to all instances simulatenously.

Add Using-Declarations for Public Interfaces

All entities intended to be part of the public API are redeclared in the corresponding root namespace via a using-declaration. For example, for bsoncxx::document::value, the forward header looks as follows:

namespace bsoncxx {
  namespace v_noabi {
    namespace document {
      class value;
    }
  }
}

namespace bsoncxx {
  namespace document {
    using ::bsoncxx::v_noabi::document::value;
  }
}

And the normal header looks as follows:

namespace bsoncxx {
  namespace v_noabi {
    namespace document {
      class value { ... };

      BSONCXX_INLINE bool operator==(const value& lhs, const value& rhs) { ... }
      BSONCXX_INLINE bool operator!=(const value& lhs, const value& rhs) { ... }
    }
  }
}

namespace bsoncxx {
  namespace document {
    using ::bsoncxx::v_noabi::document::operator==;
    using ::bsoncxx::v_noabi::document::operator!=;
  }
}

Note: "hidden friends" do not require a using-declaration in the root namespace, as by-design they can only be found via ADL. This applies to functions and operator overloads such as bsoncxx::document::view::operator== that do not have an explicit declaration in namespace scope.

The completely separate redeclaration of the bsoncxx namespace is intentional to ensure the root namespace declarations are easily identifiable and grouped together. It should be plainly obvious at all times that any given entity declared in the root namespace is referring to the entity declared in this specific ABI namespace. It was deemed undesirable for this information to be mixed within ABI-specific declarations and definitions. Thus all headers now follow this approximate pattern:

#pragma once

#include <...>

#include <bsoncxx/config/prelude.hpp>

// ABI namespace declarations and definitions.
namespace bsoncxx {
  namespace vA {
    ...
  }
}

// Root namespace using-declarations.
namespace bsoncxx {
  using bsoncxx::vA::...;
}

#include <bsoncxx/config/postlude.hpp>

Note, this also has the additional side-effect of providing an interface boundary to better control exposure of public vs. private/implementation interfaces by the selective using-declaration of ABI namespace entities. For example, it is assumed (note: this assumption may be incorrect) that bsoncxx::v_noabi::types::is_bson_view_compatible in bsoncxx/v_noabi/bsoncxx/types/bson_value/view.hpp is not intended to be part of the public API, thus it is not included in the using-declarations below, preventing its accidental/unintended use (there is no bsoncxx::types::is_bson_view_compatible).

Furthermore, the impact of using-declarations and using-directives used within the v_noabi namespaces in public headers can be somewhat mitigated in the future by being excluded in their root namespace counterparts. Their removal is not done in this PR to ensure source-compatibility, but // Deprecated. comments have been added to discourage their continued dependence/use.

ABI Interface Stability

In library interface and implementation (headers and sources), all references to bsoncxx::X have been refactored to either exclude the root namespace qualifier entirely (e.g. just X) or to be fully qualified with the ABI namespace included (e.g. ::bsoncxx::v_noabi::X). This is a coding style that should be enforced via code review going forward: entities within an ABI namespace MUST NOT be defined in terms of root namespace declarations. This is to ensure that any interface defined in terms of ABI namespace vA will never be inadvertently affected by the upgrade of a root namespace declaration vA -> vB.

For example:

namespace bsoncxx {
  namespace vA {
    struct X;
  }

  namespace vB {
    struct X;
  }
}

namespace bsoncxx {
  using ::bsoncxx::vA::X; // Updating this to `bsoncxx::vB::X`...
}

namespace bsoncxx {
  namespace vA {
    void foo(bsoncxx::X const& x); // ... MUST NOT change the meaning of this
                                   // ABI-specific declaration (unintended)!
  }
}

Applying this rule consistently throughout the bsoncxx library contributed most to the volume of diffs in this PR.

Private and Test Interfaces

Private and test interfaces, by-design, should not affect the ABI, thus are not subject to ABI stability requirements or concerns as described above. Therefore, private and test interfaces have (mostly) been redeclared without being associated with an ABI namespace, simply due to it not being necessary (e.g. bsoncxx/private/helpers.hh). However, if convenient, they may still be declared within an ABI namespace (e.g. bsoncxx/types/private/convert.hh).

This PR takes this opportunity to replace instances of BSONCXX_INLINE in private and test headers with a simple inline to avoid conflating the macro's intended usage as documenting "a public API member that is not a member of the stable ABI". This takes advantage of the CMake target property VISIBILITY_INLINES_HIDDEN being set to ON in the bsoncxx library CMake configuration.

Type Traits

The type_traits.hpp header is the sole exception to the rules described above. It was observed that the use of type traits are, with one sole exception (arguably a bug; it should be declared BSONCXX_INLINE), exclusively used in interfaces declared BSONCXX_INLINE, inline, or private, indicating they are not guaranteed ABI-stability. To both document and encourage this pattern, the type traits defined in bsoncxx/stdx/type_traits.hpp are not declared in an ABI namespace, as the interfaces they are used for are not intended to affect the ABI.

To further enforce this expectation, note the addition of an abi-prohibited-symbols task in this patch (pending ABI compatibility check scripts) which greps for any mention of bsoncxx::detail in the exported dynamic symbol table, per nm --demangle --dynamic --defined-only --extern-only --just-symbols.

Doxygen Documentation

A notable casualty of this PR is that, shockingly, Doxygen does not yet understand how to identify using-declarations (doxygen/doxygen#3760). This means documentation of root namespace declarations are completely gone following this PR (Doxygen effectively behaves as if they do not exist). This presents a documentation problem for how to communicate what entities are declared in the root namespace. The current Doxygen pages in release 3.9.0 have the ideal behavior (root namespace entities redirect to v_noabi entities and inherit their documentation). An acceptable workaround for this issue has not yet been identified.

This is not too severe an issue at the moment, as the only ABI namespace is the v_noabi namespace, thus all root namespace declarations only refer to v_noabi namespace entities, and all current CXX Driver users are familiar with the existence of the root namespace declarations. However, this issue will need to be addressed before any v1 namespace entities are introduced to avoid confusion.

@eramongodb eramongodb self-assigned this Dec 4, 2023
Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with suggested developer documentation.

entities within an ABI namespace MUST NOT be defined in terms of root namespace declarations.

Suggest noting this in CONTRIBUTING.md to document this expected pattern for developers. Example:

Referencing symbols

The library root namespaces (mongocxx and bsoncxx) contain ABI versioned namespaces (e.g. v_noabi).

In library interface and implementation (headers and sources), references to library symbols (e.g. bsoncxx::X) must either exclude the root namespace qualifier entirely (e.g. just X) or be fully qualified with the ABI namespace included (e.g. ::bsoncxx::v_noabi::X). This is to ensure that any interface defined in terms of ABI namespace vA will never be inadvertently affected by the upgrade of a root namespace declaration vA -> vB.

Examples:

// header.hpp
void foo (::bsoncxx::document::view view); // Not OK! Does not include ABI namespace.
// header.hpp
namespace bsoncxx {
namespace v_noabi {
namespace document {
void foo (view view); // OK. Excludes root namespace qualifier.
} // namespace document
} // namespace v_noabi
} // namespace bsoncxx
// header.hpp
void foo (::bsoncxx::v_noabi::document::view view); // OK. Includes ABI namespace.

Copy link
Contributor

@vector-of-bool vector-of-bool left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, but LGTM. A few open non-blocking questions:

  1. Is BSONCXX_INLINE ever actually needed? Everything is already "hidden", including function templates and inline class member function definitions, and there is no documentation on it, so I'm not sure what it is supposed to convey. My only experience so far is that it adds visual noise.
  2. I'm curious how this effects auto-complete in editors. i.e. My clangd will complete a fragment view_or_ as bsoncxx::{document,array,string}::view_or_value, and omits the inline namespace. Will it now inject the ABI namespace as part of expansion?

@eramongodb
Copy link
Contributor Author

eramongodb commented Dec 12, 2023

  1. Is BSONCXX_INLINE ever actually needed?

The proposed upcoming changes in #1052 documents BSONCXX_PRIVATE and BSONCXX_INLINE as excluding private and public API symbols from ABI stability guarantees respectively. My understanding was that this is needed to exclude public member functions in a class declared with BSONCXX_API from being given non-hidden visibility, such as view() in bsoncxx::document::value (using "non-hidden" to avoid confusion with common use of the word "default"):

// Due to `BSONCXX_API`, all class members have non-hidden visibility by default.
class BSONCXX_API value;

class value {
  // ...

  // Due to `BSONCXX_INLINE`, this member function is excluded from the class'
  // default non-hidden visibility and given hidden visibility instead.
  BSONCXX_INLINE document::view view() const noexcept;

  // ...
};

// Due to `BSONCXX_INLINE`, this member function definition is inline.
BSONCXX_INLINE document::view value::view() const noexcept { ... }

There should not be a symbol for view() in the ABI, but it is still intended to be used by users as part of the public API of the class. However, because the class itself is declared with non-hidden visibility, by default this member function also has non-hidden visibility. Therefore, it must be explicitly declared with visibility("hidden")... or so I thought.

Upon further investigation, because we also set VISIBILITY_INLINES_HIDDEN in addition to CXX_VISIBILITY_PRESET, all "inline" functions and variables are apparently given hidden visibility even if the class is declared with non-hidden visibility. This seems to include any symbol implicitly or explicitly declared inline, as well as instantiations of a variable/function template (despite not technically being inline). If this is true, then I believe it is correct that BSONCXX_INLINE is not actually needed and could be removed.

However, I am having trouble finding definitive documentation about what constitutes "inline" w.r.t. -fvisibility-inlines-hidden. Documentation seems to be scarce, seemingly all pointing to the GCC visibility doc which only references class member functions:

Lastly, there's one other new command line switch: -fvisibility-inlines-hidden. This causes all inlined class member functions to have hidden visibility, causing significant export symbol table size & binary size reductions though not as much as using -fvisibility=hidden. However, -fvisibility-inlines-hidden can be used with no source alterations, unless you need to override it for inlines where address identity is important either for the function itself or any function local static data.

If we are comfortable depending on -fvisibility-inlines-hidden applying to all entities effective "defined in header", including template instantiations, then I think we can deprecate/remove BSONCXX_INLINE. But I'm not sure how consistent/reliable this is across all the toolchains we currently support.

I'm curious how this effects auto-complete in editors.

So long as the editor (or whatever underlying C++ parser it is using) is capable of identifying using-declarations and redeclaration of referenced namespaces/entities in the correct scope, I believe it should work no different to how it already does with inline namespaces. e.g. a quick test in VS Code correctly identifies autocompletion candidates for a partially-written bsoncxx::document::v as one of bsoncxx::v_noabi::document::(view|value|view_or_value) (including correct corresponding documentation), but the autocompletion does not substitute-in the v_noabi namespace when any of the candidates are selected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants