Skip to content

CDRIVER-3775 mongoc_structured_log #1795

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
144 commits merged into from
Jan 9, 2025
Merged

CDRIVER-3775 mongoc_structured_log #1795

144 commits merged into from
Jan 9, 2025

Conversation

ghost
Copy link

@ghost ghost commented Nov 19, 2024

This is a revival of an old pull request by @alcaeus to add structured logging to the C driver. (#684)

Structured logging is a new subsystem, independent from unstructured logging. The original PR introduced structured logging with a global scope, similar in design to unstructured logging. This version redesigns it to be per-client or per-pool. Internally, there is a structured logging "instance" owned by the mongoc_topology_t. A separate public mongoc_structured_log_opts_t object provides a configuration API, and the opts can then be passed on to the client or client pool in order to apply new log settings.

The internal API has been redesigned. The previous PR required each new type of log message to have functions and structures associated with that specific message format. In this redesign, I've split the single callback into a variable length list of callbacks, and added a set of macros to make it straightforward to build these function tables. This has some nice properties:

  • Submitting a log entry still doesn't require any deep copies, json serialization, or any memory allocation, only a stack-allocated table of references.
  • Now the mongoc_structured_log() call site explicitly names the keys that are included in the resulting document.
  • Now it's easy to define ad-hoc log entries or add new values to existing log entries.
  • The list-based approach also makes it easy to define reusable building blocks. This is immediately useful for command redaction and for server descriptions.

Here's a sample invocation:

   mongoc_structured_log (
      client->topology->structured_log,
      MONGOC_STRUCTURED_LOG_LEVEL_DEBUG,
      MONGOC_STRUCTURED_LOG_COMPONENT_COMMAND,
      "Command started",
      int32 ("requestId", client->cluster.request_id),
      server_description (server_stream->sd, SERVER_HOST, SERVER_PORT, SERVER_CONNECTION_ID, SERVICE_ID),
      utf8_n ("databaseName", cursor->ns, cursor->dblen),
      utf8 ("commandName", "getMore"),
      int64 ("operationId", cursor->operation_id),
      bson_as_json ("command", &doc));

The macros are explained by doc comments in mongoc-structured-log-private.h.

This PR includes updated unified tests from the command logging and monitoring spec, which now pass. This required several other changes to the unified test runner.

Contents:

  • Implement and document a structured logging facility
  • Structured logging items for command and reply redaction
  • Unified test runner support for: createEntities, observeLogMessages, waitForEvent, $$matchAsDocument, $$matchAsRoot
  • Add serverDescriptionChangedEvent to the unified test runner and unify its two event serialization systems
  • Private serialization utilities, *_append_contents_to_bson
  • Sync unified tests from the command-logging-and-monitoring spec
  • Enough command logging to pass the CLAM tests
  • bson-dsl support for oid() values
  • Private utilities for dealing with zero'ed oids
  • Minor drive-by cleanup

CDRIVER-3775 is the epic for structured logging in general. This PR currently covers:

Follow-up changes will address the remaining parts of CDRIVER-3775. In particular, these major items are out of scope for this PR:

Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. LGTM with minor comments addressed. I like the use of atomics to avoid repeated logs from invalid environment variable values.

micah and others added 9 commits December 16, 2024 08:15
The printf() is meant to stand in for a stream of some sort, but on its own doesn't motivate thread safety because stdio has its own locking.

We could use a non-threadsafe file or queue, but Kevin suggested a counter and I like how simple this is. The code now includes a counter, and the comments mention that it could represent something mroe complex like a file or queue.
I had an earlier change which tried to move cmd serialization to mongoc-cmd, but I had to move it back to structured-log due to dependencies on the structured logging options. This reverts an include I missed when reverting the earlier move.
Micah Scott added 5 commits December 17, 2024 08:17
The immediate provocation here is to work around what appears to be a clang static analyzer bug, but it's a nice cleanup anyhow. The noreturn in timegm was unused, but I found it when adding the new BSON_NORETURN needed for the assert implementation.
@ghost
Copy link
Author

ghost commented Dec 18, 2024

Re the scan-build-macos-14-arm64-clang failure that remains on evergreen, it seems to me like some of the build hosts may have an inconsistent environment? The failing runs here are unable to locate a scan-build binary. On the master branch runs that succeed, the script locates a locally installed scan-build from homebrew.

@kevinAlbs
Copy link
Collaborator

build hosts may have an inconsistent environment?

Seems unexpected. Asked on DEVPROD-12817.

Copy link
Member

@jmikola jmikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating copyright dates on headers is probably the only necessary change (and maybe the bson_t formatting in docs if you care about that).

I'll leave a provisional LGTM to avoid holding this up further.

@ghost ghost removed the request for review from vector-of-bool January 8, 2025 19:44
@ghost ghost merged commit 6e5c6be into mongodb:master Jan 9, 2025
45 checks passed
@ghost ghost deleted the CDRIVER-3775 branch January 9, 2025 04:09
ghost pushed a commit that referenced this pull request Jan 22, 2025
#1826)

This PR is a follow-up for #1795, #1821, and #1816. The new internal string and json APIs are used to implement missing functionality and tests for the structured logging subsystem.

* Adds public APIs for the max_document_length within a mongoc_structured_log_opts_t.
* Addresses the remainder of CDRIVER-4485: JSON documents within structured log messages are truncated correctly according to the rules set out in the Logging specification.
* Addresses the remainder of CDRIVER-4486 by implementing prose tests.
* Addresses CDRIVER-4814. Command or payload data beyond the truncation limit no longer degrades logging performance. (New serializer for command-with-payload)
* Unified tests use the suggested max document length of 10000 from the spec
* Replace atomic counter with atomic flag for environment error guards
* Tests for environment defaults now skip if relevant variables are set externally
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants