Commit ea46cac

📖 Add a design for a priority queue
This change describes the motivation and implementation details for a priority queue in controller-runtime.
1 parent c1331a5 commit ea46cac

1 file changed

+100
-0
lines changed

designs/priorityqueue.md

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
Priority Queue
===================

This document describes the motivation behind implementing a priority queue
in controller-runtime and its design details.

## Motivation

1. Controllers reconcile all objects during startup to account for changes in
   the reconciliation logic. Some controllers also periodically re-reconcile
   everything to account for out-of-band changes they are not notified about;
   this is common, for example, for controllers that manage cloud resources.
   In both cases, the reconciliation of new or changed objects is delayed,
   resulting in a poor user experience. [Example][0]
2. There may be application-specific reasons why some events are more
   important than others. [Example][1]

## Proposed changes

Implement a priority queue in controller-runtime that exposes the following
interface:

```go
type PriorityQueue[T comparable] interface {
	// AddWithOpts adds one or more items to the workqueue. Items
	// in the workqueue are de-duplicated, so there will only ever
	// be one entry for a given key.
	// Adding an item that is already there may update its wait
	// period to the lowest of existing and new wait period or
	// its priority to the highest of existing and new priority.
	AddWithOpts(o AddOpts, items ...T)

	// GetWithPriority returns an item and its priority. It allows
	// a controller to re-use the priority if it enqueues an item
	// again.
	GetWithPriority() (item T, priority int, shutdown bool)

	// workqueue.TypedRateLimitingInterface is kept for backwards
	// compatibility.
	workqueue.TypedRateLimitingInterface[T]
}

type AddOpts struct {
	// After is a duration after which the object will be available for
	// reconciliation. If the object is already in the workqueue, the
	// lowest of existing and new After period will be used.
	After time.Duration

	// RateLimited specifies if the ratelimiter should be used to
	// determine a wait period. If the object is already in the
	// workqueue, the lowest of existing and new wait period will be
	// used.
	RateLimited bool

	// Priority specifies the priority of the object. Objects with higher
	// priority are returned before objects with lower priority. If the
	// object is already in the workqueue, the priority will be updated
	// to the highest of existing and new priority.
	Priority int
}
```
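
The de-duplication rules in the comments above (lowest wait period wins, highest priority wins) can be illustrated with a small in-memory sketch. Everything in it is an assumption made for illustration: `sketchQueue`, `entry`, `fixedClock`, and the non-blocking `GetWithPriority` that returns `ok` instead of `shutdown` are hypothetical and are not the controller-runtime implementation. The sketch ignores rate limiting and concurrency, and it scans linearly where a real queue would more likely use a heap.

```go
package main

import (
	"fmt"
	"time"
)

// AddOpts mirrors the proposed struct, without RateLimited (not modelled here).
type AddOpts struct {
	After    time.Duration
	Priority int
}

// entry is a hypothetical internal record for one de-duplicated key.
type entry struct {
	priority int
	readyAt  time.Time
}

// sketchQueue is a hypothetical, single-goroutine illustration of the
// merge rules. It is not safe for concurrent use.
type sketchQueue[T comparable] struct {
	items map[T]*entry
	now   func() time.Time
}

// fixedClock is a hypothetical clock that always returns the same instant,
// which keeps the example deterministic.
func fixedClock() time.Time { return time.Unix(0, 0) }

func newSketchQueue[T comparable](now func() time.Time) *sketchQueue[T] {
	return &sketchQueue[T]{items: map[T]*entry{}, now: now}
}

// AddWithOpts adds items. For an item that is already queued it keeps the
// highest priority and the earliest ready time.
func (q *sketchQueue[T]) AddWithOpts(o AddOpts, items ...T) {
	readyAt := q.now().Add(o.After)
	for _, item := range items {
		e, exists := q.items[item]
		if !exists {
			q.items[item] = &entry{priority: o.Priority, readyAt: readyAt}
			continue
		}
		if o.Priority > e.priority {
			e.priority = o.Priority
		}
		if readyAt.Before(e.readyAt) {
			e.readyAt = readyAt
		}
	}
}

// GetWithPriority removes and returns the ready item with the highest
// priority. ok is false when no item is ready. Ties are broken arbitrarily.
func (q *sketchQueue[T]) GetWithPriority() (item T, priority int, ok bool) {
	now := q.now()
	for k, e := range q.items {
		if e.readyAt.After(now) {
			continue // still waiting for its After period
		}
		if !ok || e.priority > priority {
			item, priority, ok = k, e.priority, true
		}
	}
	if ok {
		delete(q.items, item)
	}
	return item, priority, ok
}

func main() {
	q := newSketchQueue[string](fixedClock)
	q.AddWithOpts(AddOpts{Priority: -100}, "a", "b")
	q.AddWithOpts(AddOpts{Priority: 0}, "b") // raises the priority of "b"
	item, priority, _ := q.GetWithPriority()
	fmt.Println(item, priority)
}
```

In this sketch, re-adding `"b"` with a higher priority does not create a second entry; it only raises the priority of the existing one, so `"b"` is returned before `"a"`.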

To fix the issue described in point one of the motivation section, we have to
be able to distinguish events that stem from the initial list during startup
or from resyncs from all other events. In both of these cases, the informer
emits an artificial create. The suggestion is to use a heuristic that checks
whether the object in a `Create` event is older than one minute and, if so,
reduces the priority of the event. The heuristic is provided as a wrapper
that can be used with any existing handler.

```go
// WithLowPriorityWhenUnchanged wraps an existing handler and will
// reduce the priority of events stemming from the initial listwatch
// or cache resyncs.
func WithLowPriorityWhenUnchanged[object client.Object, request comparable](u TypedEventHandler[object, request]) TypedEventHandler[object, request] {
}
```
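
The heuristic itself is small enough to sketch in isolation. The version below is only an illustration under stated assumptions: the function name `createPriority`, the constant `lowPriority` and its value of -100, and the choice to measure age from the object's creation timestamp are all hypothetical, since the design only states that objects older than one minute get a reduced priority.

```go
package main

import (
	"fmt"
	"time"
)

// lowPriority is a hypothetical value. The design only says that the
// priority is reduced, not by how much.
const lowPriority = -100

// unchangedThreshold is the one-minute age named in the heuristic.
const unchangedThreshold = time.Minute

// createPriority returns the priority to use for a Create event.
// age is assumed to be "now minus the object's creation timestamp".
// Objects older than the threshold are treated as coming from the initial
// list or a resync and are demoted. The priority is never raised.
func createPriority(age time.Duration, requested int) int {
	if age > unchangedThreshold && requested > lowPriority {
		return lowPriority
	}
	return requested
}

func main() {
	fmt.Println(createPriority(time.Hour, 0))     // object that existed before startup
	fmt.Println(createPriority(5*time.Second, 0)) // freshly created object
}
```

A wrapper such as the one proposed above would apply a check like this inside its `Create` path and pass the result as the `Priority` of the add options, leaving all other event types untouched.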

The issue described in point two of the motivation section ("application-specific
reasons to prioritize some events") will always require implementation of a custom
handler or eventsource in order to inject the appropriate priority.

## Implementation stages

To roll this out safely to all controller-runtime users, it is suggested to
divide the implementation into two stages. Initially, we will add the priority
queue but mark it as experimental, and any use of it will require explicit opt-in,
either by setting a boolean on the manager or by configuring `NewQueue` in a
controller's options. No breaking changes are required for this, but sources or
handlers that want to make use of the new queue will have to use type assertions.
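
A type assertion of that kind could look roughly like the following sketch. The interfaces here are reduced stand-ins defined locally so that the example compiles on its own: `rateLimitingQueue` stands in for `workqueue.TypedRateLimitingInterface[T]`, `priorityQueue` for the proposed `PriorityQueue[T]`, and `enqueue`, `plainQueue`, and `prioQueue` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// AddOpts is a reduced copy of the proposed struct.
type AddOpts struct{ Priority int }

// rateLimitingQueue is a one-method stand-in for the existing queue interface.
type rateLimitingQueue[T comparable] interface {
	Add(item T)
}

// priorityQueue is a stand-in for the proposed PriorityQueue[T].
type priorityQueue[T comparable] interface {
	rateLimitingQueue[T]
	AddWithOpts(o AddOpts, items ...T)
}

// enqueue uses the priority when the queue supports it and falls back to a
// plain Add otherwise, so it works whether or not the user has opted in.
func enqueue[T comparable](q rateLimitingQueue[T], priority int, item T) {
	if pq, ok := q.(priorityQueue[T]); ok {
		pq.AddWithOpts(AddOpts{Priority: priority}, item)
		return
	}
	q.Add(item)
}

// plainQueue is a hypothetical queue without priority support.
type plainQueue struct{ added []string }

func (q *plainQueue) Add(item string) { q.added = append(q.added, item) }

// prioQueue is a hypothetical queue that records priorities.
type prioQueue struct {
	plainQueue
	priorities map[string]int
}

func (q *prioQueue) AddWithOpts(o AddOpts, items ...string) {
	for _, item := range items {
		q.priorities[item] = o.Priority
	}
}

func main() {
	plain := &plainQueue{}
	prio := &prioQueue{priorities: map[string]int{}}
	enqueue[string](plain, -100, "a") // falls back to Add
	enqueue[string](prio, -100, "a")  // uses AddWithOpts
	fmt.Println(plain.added, prio.priorities)
}
```

The fallback branch is what keeps this stage free of breaking changes: a handler written this way behaves exactly as before when it is handed a queue that does not implement the new interface.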

After we've gained some confidence that the implementation is useful and correct, we
will make it the default. Doing so entails breaking the `source.Source` and the
`handler.Handler` interfaces as well as the `controller.Options` struct so that they
refer to the new workqueue interface. We will wait at least one minor release after
introducing the `PriorityQueue` before doing this.

* [0]: https://youtu.be/AYNaaXlV8LQ?si=i2Pfo7Ske6rTrPLS
* [1]: https://github.com/cilium/cilium/blob/a17d6945b29c177209af3d985bd82cce49eed4a1/operator/pkg/ciliumendpointslice/controller.go#L73
