
Commit 3d9258e

📝 Add documentation for spike protection (#5791)
1 parent 6d779bc commit 3d9258e

4 files changed, +54 -0 lines changed

src/docs/product/accounts/quotas/manage-event-stream-guide.mdx

Lines changed: 54 additions & 0 deletions
@@ -111,6 +111,60 @@ To review the error events dropped because of spike protection, go to the "Usage
Events will not be dropped during any minute in which you don't send more than the hourly limit that Sentry has calculated for you. After 24 hours without any dropped events, spike protection becomes "inactive" again. This means that it is no longer dropping events, but _it does not mean the system has stopped paying attention._ The next time events are dropped, spike protection will be "reactivated".

### New Heuristic Changes

<Include name="limited-avail-note.mdx" />

Spike protection is enabled for every project by default, and when it's enabled, Sentry continually monitors for spikes. You can confirm that it's enabled in **Settings > Projects > _Select Project_ > General Settings**.

The spike protection algorithm takes a weighted average of your events over the past 168 hours (7 days) and applies a multiplier to it. It then compares the result against a floor bound determined by your quota and uses the larger of the two values as your spike limit.

### Spike Protection Inputs

- Number of projects
- Quota (per event type)
- Events in the past 7 days

### Floor Bound Calculation

The first step of the algorithm identifies a floor bound that is calculated from your quota. This bound is the larger of 500 events or (3 \* your quota)/(720 \* number of projects). The second value is an hourly rate: if each of your projects continually ingested events at this rate, together they would use up 3 times your overall quota in 30 days (720 hours), which flags a potential spike.
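
The floor bound described above can be sketched in a few lines of Python. This is only an illustration of the stated formula, not Sentry's implementation; the function and constant names (`floor_bound`, `MIN_FLOOR_EVENTS`, `HOURS_IN_30_DAYS`) are hypothetical.

```python
# Minimal sketch of the floor bound formula from the text.
# All names here are illustrative, not part of any Sentry API.

HOURS_IN_30_DAYS = 720    # 30 days * 24 hours
MIN_FLOOR_EVENTS = 500    # lower limit on the floor, per the text


def floor_bound(quota: int, num_projects: int) -> float:
    """Return the larger of 500 events or (3 * quota) / (720 * num_projects)."""
    if num_projects < 1:
        raise ValueError("num_projects must be at least 1")
    quota_based_rate = (3 * quota) / (HOURS_IN_30_DAYS * num_projects)
    return max(MIN_FLOOR_EVENTS, quota_based_rate)


# A 500k quota on one project: the quota-based rate is used.
print(floor_bound(500_000, 1))
# The same quota spread over 10 projects: the rate falls under 500, so 500 is used.
print(floor_bound(500_000, 10))
```

With a 500k quota and a single project, this gives roughly 2083 events per hour. That is the same value as the first-hour limit in the example further down, although the source does not state how many projects that organization has.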

### Spike Limit Calculation

The next step uses hourly data from the past 7 days to calculate spike limit projections for the next 7 days. This data is used to calculate weighted averages that take into account weekly and hourly seasonality. For example, the weighted average calculated for Monday at 3 pm is more heavily influenced by data points from Monday and from hours around 3 pm. Each weighted average is then multiplied by a multiplier equal to 5 times the overall standard deviation of the past week, bounded between 3 and 6.
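
To make the idea concrete, here is a rough Python sketch of a seasonality-weighted average and the bounded multiplier. The text does not specify the actual weights, how the week is indexed, or how the standard deviation is measured or normalized, so everything below is an assumption: the weighting scheme (`same_day_weight`, `hour_decay`), the hour-of-week indexing, and all function names are hypothetical.

```python
import math
from typing import Sequence

HOURS_PER_WEEK = 168


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the range [low, high]."""
    return max(low, min(high, value))


def spike_multiplier(std_dev: float) -> float:
    """5 times the standard deviation, bounded between 3 and 6 (per the text).

    How std_dev is measured or normalized is not stated in the source,
    so it is simply passed in here.
    """
    return clamp(5.0 * std_dev, 3.0, 6.0)


def weighted_average(
    hourly_counts: Sequence[float],
    target_index: int,
    same_day_weight: float = 2.0,   # assumed: extra weight for the same weekday
    hour_decay: float = 0.5,        # assumed: decay per hour of distance
) -> float:
    """Weighted average favouring the same day and nearby hours of the day.

    hourly_counts holds 168 values indexed by hour of week (assumed layout):
    index = day * 24 + hour.
    """
    if len(hourly_counts) != HOURS_PER_WEEK:
        raise ValueError("expected 168 hourly counts")
    if not 0 <= target_index < HOURS_PER_WEEK:
        raise ValueError("target_index must be in [0, 168)")

    target_day, target_hour = divmod(target_index, 24)
    weighted_total = 0.0
    weight_sum = 0.0
    for index, count in enumerate(hourly_counts):
        day, hour = divmod(index, 24)
        gap = abs(hour - target_hour)
        hour_distance = min(gap, 24 - gap)  # distance around the 24-hour clock
        weight = math.exp(-hour_decay * hour_distance)
        if day == target_day:
            weight *= same_day_weight
        weighted_total += weight * count
        weight_sum += weight
    return weighted_total / weight_sum


# A flat week with one busy hour at index 15 (3 pm on the first day).
week = [100.0] * HOURS_PER_WEEK
week[15] = 1000.0
average = weighted_average(week, 15)
calculated_limit = average * spike_multiplier(1.0)
print(round(average, 1), round(calculated_limit, 1))
```

Because the busy hour sits exactly on the target slot, it pulls the weighted average well above the plain mean of the week, which is the seasonality effect the paragraph describes.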

### Setting the Final Limit

The final spike limit for each hour is set to the max of the floor bound or the calculated limit. Using the floor bound protects smaller or new projects. New projects that do not have a week's worth of data to calibrate spike limits can use the floor, an adaptation of the organization's quota, to approximate appropriate limits. The floor also minimizes false positives in smaller or new projects, so that spikes aren't flagged incorrectly.
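
As a small sketch of this last step, the final limit is just the larger of the two values. Treating a project without history as having no calculated limit (`None`) is an assumption made for illustration, and `final_spike_limit` is a hypothetical name.

```python
from typing import Optional


def final_spike_limit(floor: float, calculated_limit: Optional[float]) -> float:
    """Return the max of the floor bound and the calculated limit.

    calculated_limit is None when there is no history to calculate from
    (an assumed convention for new projects); the floor is used in that case.
    """
    if calculated_limit is None:
        return floor
    return max(floor, calculated_limit)


print(final_spike_limit(500, None))       # new project: falls back to the floor
print(final_spike_limit(500, 1200.0))     # calculated limit is above the floor
print(final_spike_limit(2083.3, 900.0))   # floor is above the calculated limit
```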

At the onset of a spike, spike limits are recalculated in real time and continue to be recalculated for the duration of the spike. This adjusts for the increasing volume of incoming events, but the limit grows at a steady rate so that your quota is protected rather than exhausted. An example of how the heuristic works during a spike is shown below.

### **Example Calculations**

![Spike zoomed out](spike-protection-zoomed-out.png)

![Spike zoomed in plotted with spike limits](spike-protection-zoomed-in.png)

**_During Spike_**

- 1st hour: 6k events ingested, limit is recalculated to 2083, 3917 events dropped
- 2nd hour: 34k events ingested, limit is recalculated to 2873, 31217 events dropped
- 3rd hour: 55k events ingested, limit is recalculated to 5452, ~49k events dropped
- 4th hour: 49k events ingested, limit is recalculated to 7628, ~41k events dropped
- 5th hour: 41k events ingested, limit is recalculated to 9371, ~31k events dropped

Limits are recalculated throughout the duration of the spike.
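
The per-hour figures above can be checked with a short calculation. The sketch assumes that every event above the hourly limit is dropped (dropped = ingested - limit), which is inferred from the first-hour numbers rather than stated in the text. The ingested counts are entered as the rounded values from the list.

```python
# (events ingested, recalculated limit) for each hour of the spike,
# taken from the list above. Ingested counts appear to be rounded (6k, 34k, ...).
spike_hours = [
    (6_000, 2_083),
    (34_000, 2_873),
    (55_000, 5_452),
    (49_000, 7_628),
    (41_000, 9_371),
]


def dropped_events(ingested: int, limit: int) -> int:
    """Assumed relationship: everything above the hourly limit is dropped."""
    return max(0, ingested - limit)


dropped_per_hour = [dropped_events(ingested, limit) for ingested, limit in spike_hours]
total_dropped = sum(dropped_per_hour)

for hour, dropped in enumerate(dropped_per_hour, start=1):
    print(f"hour {hour}: {dropped} events dropped")
print(f"total dropped: {total_dropped}")
```

With these inputs the first hour gives 3917, matching the list, and the total comes to about 157k, in line with the ~157k figure quoted for this example. The second hour comes out to 31127 rather than the 31217 listed. Since the ingested figures appear to be rounded, small differences like this are expected.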

For this particular example:

- Org Quota: 500k
- Events Ingested: ~478k
- Events Dropped: ~157k

Here's an example of spike limit projections for a week, taking into account seasonality:

![Spike limit projections with seasonality](spike-protection-steady-state.png)

## 2. Adjusting Quotas

<Note>