Skip to content

Switch performance event traffic to transport endpoint #2593

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Apr 8, 2020

Conversation

zijianjoy
Copy link
Contributor

@zijianjoy zijianjoy commented Feb 3, 2020

  • Implement API call to transport endpoint for dispatching Performance Monitoring events.
    • Honor nextRequestWaitMillis from transport response.
    • Act on responseAction for logResponseDetail: Push batch message back to queue if RETRY_REQUEST_LATER.
  • Fetch Remote Config to dynamically update transport endpoint Key.
  • Fetch Remote Config to get rollout percentage to transport:
    1. If there is no template from Remote Config fetch, send events to cc.
    2. If there is template but no rollout flag, send events to transport.
    3. If there is rollout flag, generate hash value from instanceID:
      1. If hash < rollout percentage, send to transport.
      2. If hash >= rollout percentage, send to cc.

@zijianjoy zijianjoy requested a review from alikn March 12, 2020 21:34
@zijianjoy zijianjoy marked this pull request as ready for review March 12, 2020 21:34
@zijianjoy zijianjoy requested a review from visumickey March 17, 2020 16:42
state === 'INSTANCE_STATE_UNSPECIFIED' ||
state === 'NO_TEMPLATE'
) {
if (NO_TEMPLATE_CONFIGS.shouldSendToTransport !== undefined) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is no risk of shouldSendToTransport being one of the js falsey values, we can drop the "!== undefined" in the checks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NO_TEMPLATE_CONFIGS.shouldSendToTransport can be false as boolean value. If changing the check above into if (NO_TEMPLATE_CONFIGS.shouldSendToTransport), then this check will return false if shouldSendToTransport is false. It will cause boolean value assignment inside the if block being skipped. Please correct me if I was wrong understanding the comment. Thank you!

interface BatchEvent {
message: string;
eventTime: number;
}

/* eslint-disable camelcase */
// CC accepted log format.
interface CcBatchLogFormat {
// CC/Transport accepted log format.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than "transport" let's call it Firelog to make it clear what it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to continue discussion on naming convention.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on changing it to Firelog.

Transport is a mechanism and Clearcut and Firelog are the different types of it IMO.

Note: If you plan to change it please update at other places as well to have the distinction between Transport and Firelog clear.


However, if you want to have it Transport for abstraction reason where other parts of the SDK doesn't need to care about what's the underlying technology is (Clearcut, Firelog etc) than I am OK with that with 2 caveats:

  1. To external world (other parts of Web SDK) the API would be sendEventsToTransport() (or something along the lines)

  2. Internally (within this class) we need a way to know which transport mechanism are we using. I see the internal APIs doesn't talk about Firelog at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion, I have changed the name of Transport to FL to avoid confusion. Please take a look and see whether it makes sense to you. Thank you both!

@google-oss-bot
Copy link
Contributor

google-oss-bot commented Apr 6, 2020

Binary Size Report

Affected SDKs

SDKTypeBase (6898eb8)Head (7074d74)Diff
@firebase/loggermodule?4778.00? (?)
esm2017?3223.00? (?)
main?5087.00? (?)
@firebase/storagemodule?62706.00? (?)
esm2017?56574.00? (?)
main?62912.00? (?)
@firebase/firestore/memorybrowser?198105.00? (?)
module?196503.00? (?)
esm2017?155605.00? (?)
main?374451.00? (?)
@firebase/utilbrowser?19454.00? (?)
module?18551.00? (?)
esm2017?17369.00? (?)
main?19481.00? (?)
@firebase/applite-esm2017?7748.00? (?)
react-native?9862.00? (?)
browser?11108.00? (?)
module?11011.00? (?)
esm2017?9461.00? (?)
2 more entries not displayed here ...
@firebase/firestorebrowser?252362.00? (?)
module?250342.00? (?)
esm2017?199040.00? (?)
main?486139.00? (?)
@firebase/installationsmodule?21544.00? (?)
esm2017?16530.00? (?)
main?21995.00? (?)
@firebase/functionsbrowser?9174.00? (?)
module?8994.00? (?)
esm2017?7010.00? (?)
main?9209.00? (?)
@firebase/webchannel-wrappermodule?41226.00? (?)
main?40383.00? (?)
@firebase/performancebrowser?28112.00? (?)
module?27862.00? (?)
esm2017?26090.00? (?)
main?28112.00? (?)
firebasefirebase-installations.js?19157.00? (?)
firebase-storage.js?40971.00? (?)
firebase-messaging.js?39120.00? (?)
firebase-auth.js?173406.00? (?)
firebase.js?827721.00? (?)
10 more entries not displayed here ...
@firebase/analyticsmodule?9370.00? (?)
esm2017?8681.00? (?)
main?9688.00? (?)
@firebase/authmodule?177089.00? (?)
main?177099.00? (?)
@firebase/remote-configbrowser?23103.00? (?)
module?22720.00? (?)
esm2017?17705.00? (?)
main?23103.00? (?)
@firebase/testingmain?6607.00? (?)
@firebase/messagingmodule?30886.00? (?)
esm2017?23151.00? (?)
main?31343.00? (?)
@firebase/databasebrowser?267892.00? (?)
module?266349.00? (?)
esm2017?234647.00? (?)
main?268656.00? (?)
@firebase/componentbrowser?5164.00? (?)
module?5046.00? (?)
esm2017?3898.00? (?)
main?5164.00? (?)
@firebase/polyfillmodule?666.00? (?)
main?733.00? (?)
Metric Unit: byte

Test Logs

Copy link

@ramanpreetSinghKhinda ramanpreetSinghKhinda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still reviewing...

@@ -301,5 +302,78 @@ describe('Performance Monitoring > perf_logger', () => {
EXPECTED_NETWORK_MESSAGE
);
});

it('skips performance collection if domain is cc', () => {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we want to skip the collection in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fireperf automatically collects network request (including request to cc). However, to avoid infinite loop of network request collection (request to CC -> request is captured as performance resource -> another request to CC), if the domain is cc, then we will not collect the network request.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks James. Based on our offline discussion this is validating that we don't instrument Network Request to CC/FL sent by our own SDK.

Since it took a while to understand this scenario it would be good to add some documentation to make it clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Raman! Added comments to explain that we don't instrument requests sent from SDK itself in both production code and test code.

'tp:/ieaeogn-agolai.o/1frlglgc/o'
);

transportKey = mergeStrings('AzSC8r6ReiGqFMyfvgow', 'Iayx0u-XT3vksVM-pIV');

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Web SDK is an Open Source project, I am wondering how difficult it would be for any intruder to crack our API?

Do we want to get a review from Firelog team to see if it's OK to expose this API Key even if it's juggled (considering the encoding logic is also public, so anyone can decode it very easily)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Storing plain text key on GitHub will automatically trigger alert to the owner, therefore we cannot store plain text key directly. Current approach is to align with mobile SDK change for interleave encoding. We have previously confirmed with the team for this approach.

interface BatchEvent {
message: string;
eventTime: number;
}

/* eslint-disable camelcase */
// CC accepted log format.
interface CcBatchLogFormat {
// CC/Transport accepted log format.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on changing it to Firelog.

Transport is a mechanism and Clearcut and Firelog are the different types of it IMO.

Note: If you plan to change it please update at other places as well to have the distinction between Transport and Firelog clear.


However, if you want to have it Transport for abstraction reason where other parts of the SDK doesn't need to care about what's the underlying technology is (Clearcut, Firelog etc) than I am OK with that with 2 caveats:

  1. To external world (other parts of Web SDK) the API would be sendEventsToTransport() (or something along the lines)

  2. Internally (within this class) we need a way to know which transport mechanism are we using. I see the internal APIs doesn't talk about Firelog at all.

): Promise<void> {
// Gradually rollout traffic from cc to transport using remote config.
if (SettingsService.getInstance().shouldSendToTransport) {
return sendEventsToTransport(data, staged);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

function getHashPercent(seed: string): number {
let hash = 0;
for (let i = 0; i < seed.length; i++) {
hash = (hash << 3) + hash - seed.charCodeAt(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reference to where we chose to pick this logic? I think good to callout the source for this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment to call out the source.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing something? I still don't see the reference link to the algorithm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is at the function comment: Implementation from String.hashCode() in Java.. This will keep the comment concise and we can find reference online.

if (iid.length === 0) {
return false;
}
return getHashPercent(iid) < rolloutPercent;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check for the rolloutPercent ranges?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SDK will honor the percentage because we don't perform special handling if it is out of valid range, backend can inspect and control flag value if this is not correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, even if BE does that, having a range check would make the SDK cleaner.

Copy link
Contributor Author

@zijianjoy zijianjoy Apr 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we don't have range check for all RC fetch result. We can add them but I don't expect SDK to handle this differently if the value is out-of-range, therefore not adding additional check for now and revisit it in the future.

'Custom metric name {$customMetricName} is invalid'
'Custom metric name {$customMetricName} is invalid',
[ErrorCode.INVALID_STRING_MERGER_PARAMETER]:
'Input for String merger is invalid'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite a cryptic message for the developers. Can we make this message a bit more contextual? Eg. Unable to form API Key/URL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message is not planned for developers to resolve by themselves, it is useful to identify the issue quickly when developer takes this message to Firebase customer support.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but still the developer looking at that message would be confused as to how to act on that message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more information to let developer know about the next step.

@zijianjoy zijianjoy requested a review from Feiyang1 April 7, 2020 16:53
Copy link
Member

@Feiyang1 Feiyang1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks good to me except for the URL API usage.
Will we eventually replace cc with fl? The perf SDK is very sensitive to size and this PR increases it (It'd be interesting to see the actual increase in percentage - Currently the size check is broken for this PR, but I will fix it so we can get the number).


// Do not log the js sdk's call to transport service domain to avoid unnecessary cycle.
// Need to blacklist both old and new endpoints to avoid migration gap.
const networkRequestHostName = new URL(networkRequest.url).hostname;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

URL API is not available in IE11. Are you able to just use string parsing? I'm reluctant to ask people to add an additional polyfill unless there is a strong reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the heads-up! Decided not to use URL API and match the full URL without search parameter for cc and fl endpoints.

/**
* True if event should be sent to Fl transport endpoint rather than CC transport endpoint.
* rolloutPercent is in range [0.0, 100.0].
* @param {string} iid Installation ID which identifies a web app installed on client.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to specify the parameter type in Typescript

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed parameter type from comments.

@zijianjoy
Copy link
Contributor Author

Overall it looks good to me except for the URL API usage.
Will we eventually replace cc with fl? The perf SDK is very sensitive to size and this PR increases it (It'd be interesting to see the actual increase in percentage - Currently the size check is broken for this PR, but I will fix it so we can get the number).

Thank you Fei! I have removed the URL API usage. We are expecting to remove cc usage completely after 2 releases. Expecting the SDK size to decrease by then.

@@ -271,9 +271,6 @@ function shouldLogAfterSampling(samplingRate: number): boolean {
/**
* True if event should be sent to Fl transport endpoint rather than CC transport endpoint.
* rolloutPercent is in range [0.0, 100.0].
* @param {string} iid Installation ID which identifies a web app installed on client.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, actually I suggested to only remove the type information in the comment, that is e.g. {string}, and we should keep the rest of the comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the callout! I overlooked your message about parameter type instead of parameter.

@zijianjoy zijianjoy merged commit d689b75 into master Apr 8, 2020
@firebase firebase locked and limited conversation to collaborators May 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants