Skip to content

fix(replay): Fully stop & restart session when it expires #8834

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 11, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
118 changes: 48 additions & 70 deletions packages/replay/src/replay.ts
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,8 @@ import { setupPerformanceObserver } from './coreHandlers/performanceObserver';
import { createEventBuffer } from './eventBuffer';
import { clearSession } from './session/clearSession';
import { loadOrCreateSession } from './session/loadOrCreateSession';
import { maybeRefreshSession } from './session/maybeRefreshSession';
import { saveSession } from './session/saveSession';
import { shouldRefreshSession } from './session/shouldRefreshSession';
import type {
AddEventResult,
AddUpdateCallback,
Expand Down Expand Up @@ -217,7 +217,7 @@ export class ReplayContainer implements ReplayContainerInterface {
* Initializes the plugin based on sampling configuration. Should not be
* called outside of constructor.
*/
public initializeSampling(): void {
public initializeSampling(previousSessionId?: string): void {

Check failure

Code scanning / CodeQL

Insecure randomness

This uses a cryptographically insecure random number generated at [Math.random()](1) in a security context.
const { errorSampleRate, sessionSampleRate } = this._options;

// If neither sample rate is > 0, then do nothing - user will need to call one of
Expand All @@ -228,7 +228,7 @@ export class ReplayContainer implements ReplayContainerInterface {

// Otherwise if there is _any_ sample rate set, try to load an existing
// session, or create a new one.
this._initializeSessionForSampling();
this._initializeSessionForSampling(previousSessionId);

if (!this.session) {
// This should not happen, something wrong has occurred
Expand Down Expand Up @@ -273,7 +273,6 @@ export class ReplayContainer implements ReplayContainerInterface {
logInfoNextTick('[Replay] Starting replay in session mode', this._options._experiments.traceInternals);

const session = loadOrCreateSession(
this.session,
{
maxReplayDuration: this._options.maxReplayDuration,
sessionIdleExpire: this.timeouts.sessionIdleExpire,
Expand Down Expand Up @@ -304,7 +303,6 @@ export class ReplayContainer implements ReplayContainerInterface {
logInfoNextTick('[Replay] Starting replay in buffer mode', this._options._experiments.traceInternals);

const session = loadOrCreateSession(
this.session,
{
sessionIdleExpire: this.timeouts.sessionIdleExpire,
maxReplayDuration: this._options.maxReplayDuration,
Expand Down Expand Up @@ -373,15 +371,16 @@ export class ReplayContainer implements ReplayContainerInterface {
return;
}

// We can't move `_isEnabled` after awaiting a flush, otherwise we can
// enter into an infinite loop when `stop()` is called while flushing.
this._isEnabled = false;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this out of the try-catch, as otherwise this lead to weird race conditions. I guess by having this in the try-catch the timing semantics are slightly different. I had a bunch of cases where stop was then called twice at the exact same time, which was fixed by moving this out here 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, since try/catch inside of an async function is sugar on top of promises.


try {
logInfo(
`[Replay] Stopping Replay${reason ? ` triggered by ${reason}` : ''}`,
this._options._experiments.traceInternals,
);

// We can't move `_isEnabled` after awaiting a flush, otherwise we can
// enter into an infinite loop when `stop()` is called while flushing.
this._isEnabled = false;
this._removeListeners();
this.stopRecording();

Expand Down Expand Up @@ -475,16 +474,6 @@ export class ReplayContainer implements ReplayContainerInterface {

// Once this session ends, we do not want to refresh it
if (this.session) {
this.session.shouldRefresh = false;

// It's possible that the session lifespan is > max session lifespan
// because we have been buffering beyond max session lifespan (we ignore
// expiration given that `shouldRefresh` is true). Since we flip
// `shouldRefresh`, the session could be considered expired due to
// lifespan, which is not what we want. Update session start date to be
// the current timestamp, so that session is not considered to be
// expired. This means that max replay duration can be MAX_REPLAY_DURATION +
// (length of buffer), which we are ok with.
this._updateUserActivity(activityTime);
this._updateSessionActivity(activityTime);
this._maybeSaveSession();
Expand Down Expand Up @@ -612,8 +601,6 @@ export class ReplayContainer implements ReplayContainerInterface {
* @hidden
*/
public checkAndHandleExpiredSession(): boolean | void {
const oldSessionId = this.getSessionId();

// Prevent starting a new session if the last user activity is older than
// SESSION_IDLE_PAUSE_DURATION. Otherwise non-user activity can trigger a new
// session+recording. This creates noisy replays that do not have much
Expand All @@ -635,24 +622,11 @@ export class ReplayContainer implements ReplayContainerInterface {
// --- There is recent user activity --- //
// This will create a new session if expired, based on expiry length
if (!this._checkSession()) {
return;
}

// Session was expired if session ids do not match
const expired = oldSessionId !== this.getSessionId();

if (!expired) {
return true;
}

// Session is expired, trigger a full snapshot (which will create a new session)
if (this.isPaused()) {
this.resume();
} else {
this._triggerFullSnapshot();
// Check session handles the refreshing itself
return false;
}

return false;
return true;
}

/**
Expand Down Expand Up @@ -740,6 +714,7 @@ export class ReplayContainer implements ReplayContainerInterface {

// Need to set as enabled before we start recording, as `record()` can trigger a flush with a new checkout
this._isEnabled = true;
this._isPaused = false;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making sure we are also unpaused here, when starting new.


this.startRecording();
}
Expand All @@ -756,17 +731,17 @@ export class ReplayContainer implements ReplayContainerInterface {
/**
* Loads (or refreshes) the current session.
*/
private _initializeSessionForSampling(): void {
private _initializeSessionForSampling(previousSessionId?: string): void {
// Whenever there is _any_ error sample rate, we always allow buffering
// Because we decide on sampling when an error occurs, we need to buffer at all times if sampling for errors
const allowBuffering = this._options.errorSampleRate > 0;

const session = loadOrCreateSession(
this.session,
{
sessionIdleExpire: this.timeouts.sessionIdleExpire,
maxReplayDuration: this._options.maxReplayDuration,
traceInternals: this._options._experiments.traceInternals,
previousSessionId,
},
{
stickySession: this._options.stickySession,
Expand All @@ -791,37 +766,32 @@ export class ReplayContainer implements ReplayContainerInterface {

const currentSession = this.session;

const newSession = maybeRefreshSession(
currentSession,
{
if (
shouldRefreshSession(currentSession, {
sessionIdleExpire: this.timeouts.sessionIdleExpire,
traceInternals: this._options._experiments.traceInternals,
maxReplayDuration: this._options.maxReplayDuration,
},
{
stickySession: Boolean(this._options.stickySession),
sessionSampleRate: this._options.sessionSampleRate,
allowBuffering: this._options.errorSampleRate > 0,
},
);

const isNew = newSession.id !== currentSession.id;

// If session was newly created (i.e. was not loaded from storage), then
// enable flag to create the root replay
if (isNew) {
this.setInitialState();
this.session = newSession;
}

if (!this.session.sampled) {
void this.stop({ reason: 'session not refreshed' });
})
) {
void this._refreshSession(currentSession);
return false;
}

return true;
}

/**
* Refresh a session with a new one.
* This stops the current session (without forcing a flush, as that would never work since we are expired),
* and then does a new sampling based on the refreshed session.
*/
private async _refreshSession(session: Session): Promise<void> {
if (!this._isEnabled) {
return;
}
await this.stop({ reason: 'refresh session' });
this.initializeSampling(session.id);
}

/**
* Adds listeners to record events for the replay
*/
Expand Down Expand Up @@ -933,10 +903,14 @@ export class ReplayContainer implements ReplayContainerInterface {

const expired = isSessionExpired(this.session, {
maxReplayDuration: this._options.maxReplayDuration,
...this.timeouts,
sessionIdleExpire: this.timeouts.sessionIdleExpire,
});

if (breadcrumb && !expired) {
if (expired) {
return;
}

if (breadcrumb) {
this._createCustomBreadcrumb(breadcrumb);
}

Expand Down Expand Up @@ -1081,7 +1055,9 @@ export class ReplayContainer implements ReplayContainerInterface {
* Should never be called directly, only by `flush`
*/
private async _runFlush(): Promise<void> {
if (!this.session || !this.eventBuffer) {
const replayId = this.getSessionId();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capturing the replayId to use at the very top here. With this, if this runs out of sync while processing, we should still never send data to the wrong replay ID, at least.


if (!this.session || !this.eventBuffer || !replayId) {
__DEBUG_BUILD__ && logger.error('[Replay] No session or eventBuffer found to flush.');
return;
}
Expand All @@ -1101,13 +1077,15 @@ export class ReplayContainer implements ReplayContainerInterface {
return;
}

// if this changed in the meanwhile, e.g. because the session was refreshed or similar, we abort here
if (replayId !== this.getSessionId()) {
return;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just bailing out here, i think that should be fine? If this changed in the meanwhile, if we flush again later we should have discarded all the stuff before already...?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be an edgecase that we will have to deal with separately where we have an ongoing/suspended flush, and a new session is created w/ checkout snapshot that gets added to buffer and then discarded here

}

try {
// This uses the data from the eventBuffer, so we need to call this before `finish()
this._updateInitialTimestampFromEventBuffer();

// Note this empties the event buffer regardless of outcome of sending replay
const recordingData = await this.eventBuffer.finish();

const timestamp = Date.now();

// Check total duration again, to avoid sending outdated stuff
Expand All @@ -1117,14 +1095,14 @@ export class ReplayContainer implements ReplayContainerInterface {
throw new Error('Session is too long, not sending replay');
}

// NOTE: Copy values from instance members, as it's possible they could
// change before the flush finishes.
const replayId = this.session.id;
const eventContext = this._popEventContext();
// Always increment segmentId regardless of outcome of sending replay
const segmentId = this.session.segmentId++;
this._maybeSaveSession();

// Note this empties the event buffer regardless of outcome of sending replay
const recordingData = await this.eventBuffer.finish();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this down here, the idea is:

  1. We capture the timestamp, event context etc. first
  2. Then, if the buffer flushing takes a long time, we don't care, even if in the meanwhile the session was refreshed we can still send the stuff to the old session. Also timestamps should be less prone to be too late due to buffer flushing, and should actually be more correct because it is the time of the flush and not whenever the buffer finished.


await sendReplay({
replayId,
recordingData,
Expand Down
2 changes: 0 additions & 2 deletions packages/replay/src/session/Session.ts
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,6 @@ export function makeSession(session: Partial<Session> & { sampled: Sampled }): S
const lastActivity = session.lastActivity || now;
const segmentId = session.segmentId || 0;
const sampled = session.sampled;
const shouldRefresh = typeof session.shouldRefresh === 'boolean' ? session.shouldRefresh : true;
const previousSessionId = session.previousSessionId;

return {
Expand All @@ -22,7 +21,6 @@ export function makeSession(session: Partial<Session> & { sampled: Sampled }): S
lastActivity,
segmentId,
sampled,
shouldRefresh,
previousSessionId,
};
}
19 changes: 12 additions & 7 deletions packages/replay/src/session/loadOrCreateSession.ts
Original file line number Diff line number Diff line change
Expand Up @@ -2,33 +2,38 @@ import type { Session, SessionOptions } from '../types';
import { logInfoNextTick } from '../util/log';
import { createSession } from './createSession';
import { fetchSession } from './fetchSession';
import { maybeRefreshSession } from './maybeRefreshSession';
import { shouldRefreshSession } from './shouldRefreshSession';

/**
* Get or create a session, when initializing the replay.
* Returns a session that may be unsampled.
*/
export function loadOrCreateSession(
currentSession: Session | undefined,
{
traceInternals,
sessionIdleExpire,
maxReplayDuration,
previousSessionId,
}: {
sessionIdleExpire: number;
maxReplayDuration: number;
traceInternals?: boolean;
previousSessionId?: string;
},
sessionOptions: SessionOptions,
): Session {
// If session exists and is passed, use it instead of always hitting session storage
const existingSession = currentSession || (sessionOptions.stickySession && fetchSession(traceInternals));
const existingSession = sessionOptions.stickySession && fetchSession(traceInternals);

// No session exists yet, just create a new one
if (!existingSession) {
logInfoNextTick('[Replay] Created new session', traceInternals);
return createSession(sessionOptions);
logInfoNextTick('[Replay] Creating new session', traceInternals);
return createSession(sessionOptions, { previousSessionId });
}

return maybeRefreshSession(existingSession, { sessionIdleExpire, traceInternals, maxReplayDuration }, sessionOptions);
if (!shouldRefreshSession(existingSession, { sessionIdleExpire, maxReplayDuration })) {
return existingSession;
}

logInfoNextTick('[Replay] Session in sessionStorage is expired, creating new one...');
return createSession(sessionOptions, { previousSessionId: existingSession.id });

Check failure

Code scanning / CodeQL

Insecure randomness

This uses a cryptographically insecure random number generated at [Math.random()](1) in a security context.
}
50 changes: 0 additions & 50 deletions packages/replay/src/session/maybeRefreshSession.ts

This file was deleted.

Loading