Skip to content

CDRIVER-3775: Implement standardised logging POC #684

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 33 commits into from

Conversation

alcaeus
Copy link
Member

@alcaeus alcaeus commented Sep 18, 2020

Summary

CDRIVER-3775

This PR serves as a starting point for discussion. Parallel to the current mongoc_log implementation, we add a new mongoc_structured_log implementation, which is private and not available to wrapping libraries. To allow for testing, we currently log some data from the command logging portion of the logging spec. The current state of the POC allows for testing in wrapping drivers, but does not yet implement a spec-compliant logger itself.

What's working:

  • Command start, success, and failures are logged to the structured logger
  • A rudimentary default handler logs all messages to stderr
  • The log message is constructed lazily. This helps reduce memory and runtime impact when a log message is not consumed (e.g. based on severity)
  • Wrapping libraries can override the default handler and handle log messages themselves (necessary for integration in PHP, Swift, and C++ drivers)
  • Implement a spec-compliant default logger. This logger is expected to read environment variables to configure logger behaviour (e.g. levels and target)

Next steps:

  • Remove memory leaks - my calls to BCON_APPEND seem to leak memory, but I haven't invested much time in figuring out those leaks yet.
  • Fix Windows builds: structured logging currently produces some ugly segmentation faults due to access violations whenever we invoke the default logger. This works fine on Ubuntu (apart from the leaks mentioned above).
  • Finalise logging calls to add missing information. We're only logging information we already have. The driver_connection_id, server_connection_id, and explicit_session values of the structured log messages are not yet logged as the data is not always available where we can log.
  • Integrate with existing log implementation: this is not necessarily required or even possible, as we should try to avoid BC breaks here. For example, PHP relies on the existing logging mechanism to put trace logs into a special file using the mongodb.debug INI switch.
  • Run clang-format on the whole PR

Implementation details

The log implementation is loosely based on mongoc_log(), with a few notable exceptions. Since the logging spec defines what can be logged using the structured logger, the entire API is private and not available outside of this implementation. We only provide APIs necessary to consume and handle log messages. This is a notable change from mongoc_log, which can be called from the outside.

A significant detail is the handling of additional data for structured logging. Every log message consists of a component, severity level, and a message portion (e.g. "Command started"). Optionally, a method to build the structured message can be passed, along with a struct to contain additional data. For the time being, this only allows to add command information to the log entry, but this will be changed once other log messages are implemented. The structured message is only built if the log message is consumed. While this makes data handling a bit more fragile (e.g. it's not possible to store a mongoc_structured_log_entry_t somewhere for later processing), it does significantly reduce the memory impact of the logger when logging is not required. For example, the current logging implementation serialises a the command and reply documents encountered during command execution to the message. Doing this when the message is not consumed but ignored (which would be the default case for this message), makes this problematic. For this reason, assembly of the structured message is deferred until the message is extracted from the mongoc_structured_log_entry_t using mongoc_structured_log_entry_get_message. From then on, it is stored in the original entry so that it does not need to be re-assembled.

@kevinAlbs kevinAlbs self-requested a review September 18, 2020 18:51
Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great so far! I noted a possible cause of leaks/segfaults, have a suggestion for removing the union in favor of a void *, and noted that the command started logs may not include OP_MSG sequences.

structured_message_data,
};

if (!gStructuredLogger) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If users can overwrite the log handler at any time, I think this should be protected by the mutex, and bson_once (&once, &_mongoc_ensure_mutex_once); should be called before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good to know. FWIW, mongoc_log() accesses gLogFunc outside of a mutex lock:

stop_logging = !gLogFunc;

I've updated this instance, let me know if I should fix the access in mongoc-log.c.

Copy link

@ghost ghost Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW i think it's better not to take the mutex before capturing stop_logging. The downside is that logging may have been disabled right before we log anyway, but if that's not a problem there's a definite performance advantage to having an early-out that skips the mutex entirely when we aren't logging.

uint32_t request_id,
uint32_t driver_connection_id,
uint32_t server_connection_id,
bool explicit_session);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this might not be capturing OP_MSG document sequences.
There are two ways of initializing a mongoc_apm_command_started_t:

I'm not sure off-hand if it's feasible to change this to accept a mongoc_cmd_t to handle commands with and without OP_MSG document sequence. If not, we may need a separate mongoc_structured_log_command_started_with_cmd.

Copy link
Member Author

@alcaeus alcaeus Sep 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had the method accept only a mongoc_cmd_t, but had to change it to accommodate legacy commands that are run without a mongoc_cmd_t. I've added a separate mongoc_structured_log_command_started_with_cmd, similar to how we have two initialisers for mongoc_apm_command_started_t.

@alcaeus
Copy link
Member Author

alcaeus commented Oct 14, 2020

I've extracted the BSON encoding changes to #690. Once that PR is merged, I'll also extract the base logic (mongoc_structured_log.c and its headers) to a separate PR, as that covers the base spec (see mongodb/specifications#861). The command logging specification will then be yet another PR, as that will necessitate more changes to collect all required data (e.g. connection ID, client port, etc.). I'll leave this PR open in the meantime and rebase as individual PRs are created and merged.

@kevinAlbs kevinAlbs closed this Sep 20, 2021
@alcaeus alcaeus deleted the cdriver-3775 branch October 29, 2024 16:57
@ghost ghost mentioned this pull request Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants