
Commit 41d88a2

iandesj, perllaghu, and krassowski authored
Add disk monitoring (#233)
* Update the server-side api
* In theory, add disk stuff to the front end
* Working as dev environment
* shift config to individual views, tweak the CONTRIB docs, and add an example config
* Update the readme
* Update static/main.js to pass eslint, and create a single style entry
* Correct debugging mis-naming
* Replace missing semicolon..
* fix: Compute disk warning state with config.disk_warning_threshold
* feat: Add model class for keeping resource warnings
* feat: Condition to flash warnings no looks at all computed warnings
* chore: Run lint fix
* chore: Run lint fix again
* chore: Address critical and high dependabot flagged packages
* task: remove console log

Co-authored-by: Michał Krassowski <[email protected]>

* Update CONTRIBUTING.md

Co-authored-by: Michał Krassowski <[email protected]>

* Fix typo in CONTRIBUTING.md

Co-authored-by: Michał Krassowski <[email protected]>

* Fix typo in README.md docs

Co-authored-by: Michał Krassowski <[email protected]>

* Fix server extension docs language

Co-authored-by: Michał Krassowski <[email protected]>

* Fix typo regarding disk warning thresholds

Co-authored-by: Michał Krassowski <[email protected]>

* Catch Exception instead of nothing at all

Co-authored-by: Michał Krassowski <[email protected]>

* Update docs and delete example server config

---------

Co-authored-by: Ian Stuart <[email protected]>
Co-authored-by: Michał Krassowski <[email protected]>
1 parent 6f15ef9 commit 41d88a2


21 files changed

+471
-165
lines changed


.gitignore

Lines changed: 2 additions & 0 deletions
@@ -8,6 +8,8 @@ __pycache__/

 # Distribution / packaging
 .Python
+.direnv
+.envrc
 env/
 build/
 develop-eggs/

CONTRIBUTING.md

Lines changed: 8 additions & 0 deletions
@@ -127,6 +127,14 @@ JupyterLab v3.0.0
 jupyter-resource-usage v0.1.0 enabled OK
 ```

+## Which code creates what content
+
+The stats are created by the server-side code in `jupyter_resource_usage`.
+
+For the jupyterlab 4 / notebook 7 UIs, the code in `packages/labextension` creates and writes the content for both the statusbar and the topbar.
+
+The topbar is defined in the schema, whilst the contents of the statusbar is driven purely by the labextension code.... and labels are defined by their appropriate `*View.tsx` file
+
 ## pre-commit

 `jupyter-resource-usage` has adopted automatic code formatting so you shouldn't need to worry too much about your code style.

README.md

Lines changed: 20 additions & 1 deletion
@@ -134,6 +134,23 @@ memory:

 ![Screenshot with CPU and memory](./doc/statusbar-cpu.png)

+### Disk [partition] Usage
+
+`jupyter-resource-usage` can also track disk usage [of a defined partition] and report the `total` and `used` values as part of the `/api/metrics/v1` response.
+
+You enable tracking by setting the `track_disk_usage` trait (disabled by default):
+
+```python
+c = get_config()
+c.ResourceUseDisplay.track_disk_usage = True
+```
+
+The values are from the partition containing the folder in the trait `disk_path` (which defaults to `/home/joyvan`). If this path does not exist, disk usage information is omitted from the display.
+
+Mirroring CPU and Memory, the trait `disk_warning_threshold` signifies when to flag a usage warning, and like the others, it defaults to `0.1` (10% remaining)
+
+![Screenshot with Disk, CPU, and memory](./doc/statusbar_disk.png)
+
 ### Disable Prometheus Metrics

 There is a [known bug](https://github.com/jupyter-server/jupyter-resource-usage/issues/123) with Prometheus metrics which
@@ -157,9 +174,11 @@ render the alternative frontend in the topbar.
 Users can change the label and refresh rate for the alternative frontend using settings
 editor.

+(The vertical bars are included by default, to help separate the three indicators.)
+
 ## Resources Displayed

-Currently the server extension only reports memory usage and CPU usage. Other metrics will be added in the future as needed.
+Currently the server extension reports disk usage, memory usage and CPU usage. Other metrics will be added in the future as needed.

 Memory usage will show the PSS whenever possible (Linux only feature), and default to RSS otherwise.

doc/settings.png

9.53 KB

doc/statusbar_disk.png

7.54 KB

jupyter_resource_usage/api.py

Lines changed: 14 additions & 0 deletions
@@ -75,6 +75,20 @@ async def get(self):

             metrics.update(cpu_percent=cpu_percent, cpu_count=cpu_count)

+        # Optionally get Disk information
+        if config.track_disk_usage:
+            try:
+                disk_info = psutil.disk_usage(config.disk_path)
+            except Exception:
+                pass
+            else:
+                metrics.update(disk_used=disk_info.used, disk_total=disk_info.total)
+                limits["disk"] = {"disk": disk_info.total}
+                if config.disk_warning_threshold != 0:
+                    limits["disk"]["warn"] = (disk_info.total - disk_info.used) < (
+                        disk_info.total * config.disk_warning_threshold
+                    )
+
         self.write(json.dumps(metrics))

     @run_on_executor
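The `try` / `except` / `else` shape of the hunk above can be tried in isolation. The following is only a sketch, not the extension's handler: it assumes the standard library's `shutil.disk_usage` as a stand-in for `psutil.disk_usage` (both return a named tuple with `total` and `used`), and the function name `collect_disk_metrics` and the flat `warn` key are hypothetical.

```python
import shutil


def collect_disk_metrics(path, warning_threshold=0.1):
    """Return disk metrics for the partition containing `path`, or {} on failure."""
    metrics = {}
    try:
        # Stand-in for psutil.disk_usage(path); sizes are reported in bytes.
        info = shutil.disk_usage(path)
    except Exception:
        # Path missing or unreadable: leave disk data out entirely.
        pass
    else:
        metrics["disk_used"] = info.used
        metrics["disk_total"] = info.total
        if warning_threshold != 0:
            # Warn once the remaining space is below the given fraction.
            free = info.total - info.used
            metrics["warn"] = free < info.total * warning_threshold
    return metrics
```

Calling `collect_disk_metrics(".")` should return the three keys for the current partition, while a path that does not exist yields an empty dict, which matches the README's statement that disk information is omitted in that case.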

jupyter_resource_usage/config.py

Lines changed: 48 additions & 1 deletion
@@ -7,6 +7,7 @@
 from traitlets import Int
 from traitlets import List
 from traitlets import TraitType
+from traitlets import Unicode
 from traitlets import Union
 from traitlets.config import Configurable

@@ -27,7 +28,7 @@ def validate(self, obj, value):
             keys = list(value.keys())
             if "name" in keys:
                 keys.remove("name")
-                if all(key in ["kwargs", "attribute"] for key in keys):
+                if all(key in ["args", "kwargs", "attribute"] for key in keys):
                     return value
         self.error(obj, value)

@@ -37,6 +38,15 @@ class ResourceUseDisplay(Configurable):
     Holds server-side configuration for jupyter-resource-usage
     """

+    # Needs to be defined early, so the metrics can use it.
+    disk_path = Union(
+        trait_types=[Unicode(), Callable()],
+        default_value="/home/joyvan",
+        help="""
+        A path in the partition to be reported on.
+        """,
+    ).tag(config=True)
+
     process_memory_metrics = List(
         trait=PSUtilMetric(),
         default_value=[{"name": "memory_info", "attribute": "rss"}],
@@ -56,6 +66,19 @@ class ResourceUseDisplay(Configurable):
         trait=PSUtilMetric(), default_value=[{"name": "cpu_count"}]
     )

+    process_disk_metrics = List(
+        trait=PSUtilMetric(),
+        default_value=[],
+    )
+
+    system_disk_metrics = List(
+        trait=PSUtilMetric(),
+        default_value=[
+            {"name": "disk_usage", "args": [disk_path], "attribute": "total"},
+            {"name": "disk_usage", "args": [disk_path], "attribute": "used"},
+        ],
+    )
+
     mem_warning_threshold = Float(
         default_value=0.1,
         help="""
@@ -123,6 +146,30 @@ def _mem_limit_default(self):
     def _cpu_limit_default(self):
         return float(os.environ.get("CPU_LIMIT", 0))

+    track_disk_usage = Bool(
+        default_value=False,
+        help="""
+        Set to True in order to enable reporting of disk usage statistics.
+        """,
+    ).tag(config=True)
+
+    @default("disk_path")
+    def _disk_path_default(self):
+        return str(os.environ.get("HOME", "/home/joyvan"))
+
+    disk_warning_threshold = Float(
+        default_value=0.1,
+        help="""
+        Warn user with flashing lights when disk usage is within this fraction
+        total space.
+
+        For example, if total size is 10G, `disk_warning_threshold` is 0.1,
+        we will start warning the user when they use (10 - (10 * 0.1)) G.
+
+        Set to 0 to disable warning.
+        """,
+    ).tag(config=True)
+
     enable_prometheus_metrics = Bool(
         default_value=True,
         help="""
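The arithmetic in the `disk_warning_threshold` help text can be checked with a few lines of plain Python. This is a sketch of the comparison only; the helper names `warning_starts_at` and `should_warn` are hypothetical and do not appear in the commit, and returning `False` for a zero threshold is a simplification (the handler in `api.py` simply does not set a `warn` value in that case).

```python
def warning_starts_at(total, threshold):
    """Usage level at which the warning begins: total - total * threshold."""
    return total - total * threshold


def should_warn(total, used, threshold):
    """True once the remaining space drops below `threshold` of the total."""
    if threshold == 0:  # 0 disables the warning
        return False
    return (total - used) < total * threshold


# Worked example from the help text: 10G total, threshold 0.1.
print(warning_starts_at(10, 0.1))  # 9.0
print(should_warn(10, 9.5, 0.1))   # True
print(should_warn(10, 5, 0.1))     # False
```

The same comparison works for byte counts, since only the ratio between remaining space and total size matters.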

jupyter_resource_usage/metrics.py

Lines changed: 20 additions & 9 deletions
@@ -13,10 +13,10 @@ def __init__(self, server_app: ServerApp):
         ]
         self.server_app = server_app

-    def get_process_metric_value(self, process, name, kwargs, attribute=None):
+    def get_process_metric_value(self, process, name, args, kwargs, attribute=None):
         try:
             # psutil.Process methods will either return...
-            metric_value = getattr(process, name)(**kwargs)
+            metric_value = getattr(process, name)(*args, **kwargs)
             if attribute is not None: # ... a named tuple
                 return getattr(metric_value, attribute)
             else: # ... or a number
@@ -26,25 +26,28 @@ def get_process_metric_value(self, process, name, kwargs, attribute=None):
         except BaseException:
             return 0

-    def process_metric(self, name, kwargs={}, attribute=None):
+    def process_metric(self, name, args=[], kwargs={}, attribute=None):
         if psutil is None:
             return None
         else:
             current_process = psutil.Process()
             all_processes = [current_process] + current_process.children(recursive=True)

             process_metric_value = lambda process: self.get_process_metric_value(
-                process, name, kwargs, attribute
+                process, name, args, kwargs, attribute
             )

             return sum([process_metric_value(process) for process in all_processes])

-    def system_metric(self, name, kwargs={}, attribute=None):
+    def system_metric(self, name, args=[], kwargs={}, attribute=None):
         if psutil is None:
             return None
         else:
-            # psutil functions will either return...
-            metric_value = getattr(psutil, name)(**kwargs)
+            # psutil functions will either raise an error, or return...
+            try:
+                metric_value = getattr(psutil, name)(*args, **kwargs)
+            except:
+                return None
             if attribute is not None: # ... a named tuple
                 return getattr(metric_value, attribute)
             else: # ... or a number
@@ -63,8 +66,11 @@ def get_metric_values(self, metrics, metric_type):
         return metric_values

     def metrics(self, process_metrics, system_metrics):
-        metric_values = self.get_metric_values(process_metrics, "process")
-        metric_values.update(self.get_metric_values(system_metrics, "system"))
+        metric_values = {}
+        if process_metrics:
+            metric_values.update(self.get_metric_values(process_metrics, "process"))
+        if system_metrics:
+            metric_values.update(self.get_metric_values(system_metrics, "system"))

         if any(value is None for value in metric_values.values()):
             return None
@@ -80,3 +86,8 @@ def cpu_metrics(self):
         return self.metrics(
             self.config.process_cpu_metrics, self.config.system_cpu_metrics
         )
+
+    def disk_metrics(self):
+        return self.metrics(
+            self.config.process_disk_metrics, self.config.system_disk_metrics
+        )
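The new `args` entry lets a metric spec pass positional arguments through `getattr(module, name)(*args, **kwargs)`. A minimal sketch of that dispatch follows, under stated assumptions: `shutil` stands in for `psutil` so the snippet needs no third-party package, the `collect` helper is hypothetical, and the `name_attribute` key pattern is inferred from keys such as `disk_usage_used` used elsewhere in this commit rather than taken from `get_metric_values` itself.

```python
import shutil


def system_metric(module, name, args=(), kwargs=None, attribute=None):
    """Call `module.name(*args, **kwargs)`; return None if the call fails."""
    try:
        value = getattr(module, name)(*args, **(kwargs or {}))
    except Exception:
        return None
    # The call returns either a named tuple (pick one field) or a plain number.
    return getattr(value, attribute) if attribute is not None else value


def collect(module, specs):
    """Evaluate metric specs; keys follow an assumed `name_attribute` pattern."""
    results = {}
    for spec in specs:
        spec = dict(spec)  # copy so the caller's spec is not mutated
        name = spec.pop("name")
        attribute = spec.get("attribute")
        key = f"{name}_{attribute}" if attribute else name
        results[key] = system_metric(module, name, **spec)
    return results


specs = [
    {"name": "disk_usage", "args": ["."], "attribute": "total"},
    {"name": "disk_usage", "args": ["."], "attribute": "used"},
]
values = collect(shutil, specs)
```

The sketch uses an empty tuple and `None` as defaults instead of `args=[]` and `kwargs={}`; that is a common Python precaution against shared mutable defaults, not something the commit requires.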

jupyter_resource_usage/prometheus.py

Lines changed: 13 additions & 1 deletion
@@ -18,7 +18,14 @@ def __init__(self, metricsloader: PSUtilMetricsLoader):
         self.config = metricsloader.config
         self.session_manager = metricsloader.server_app.session_manager

-        gauge_names = ["total_memory", "max_memory", "total_cpu", "max_cpu"]
+        gauge_names = [
+            "total_memory",
+            "max_memory",
+            "total_cpu",
+            "max_cpu",
+            "max_disk",
+            "current_disk",
+        ]
         for name in gauge_names:
             phrase = name + "_usage"
             gauge = Gauge(phrase, "counter for " + phrase.replace("_", " "), [])
@@ -34,6 +41,11 @@ async def __call__(self, *args, **kwargs):
             if cpu_metric_values is not None:
                 self.TOTAL_CPU_USAGE.set(cpu_metric_values["cpu_percent"])
                 self.MAX_CPU_USAGE.set(self.apply_cpu_limit(cpu_metric_values))
+        if self.config.track_disk_usage:
+            disk_metric_values = self.metricsloader.disk_metrics()
+            if disk_metric_values is not None:
+                self.CURRENT_DISK_USAGE.set(disk_metric_values["disk_usage_used"])
+                self.MAX_DISK_USAGE.set(disk_metric_values["disk_usage_total"])

     def apply_memory_limit(self, memory_metric_values) -> Optional[int]:
         if memory_metric_values is None:
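The relationship between the names in `gauge_names` and attributes such as `self.MAX_DISK_USAGE` can be illustrated without Prometheus installed. This is a sketch under assumptions: the `Gauge` class below is a minimal stand-in rather than the `prometheus_client` one, the `Callback` class and its `update` method are hypothetical, the `setattr` with an upper-cased name is inferred from the attribute names used in the hunk (that line is not shown in the diff), and the values 40 and 100 are illustrative.

```python
class Gauge:
    """Minimal stand-in for a Prometheus gauge: holds one settable value."""

    def __init__(self, name, documentation):
        self.name = name
        self.documentation = documentation
        self.value = None

    def set(self, value):
        self.value = value


class Callback:
    def __init__(self):
        for name in ["max_disk", "current_disk"]:
            phrase = name + "_usage"
            gauge = Gauge(phrase, "counter for " + phrase.replace("_", " "))
            # Assumed: each gauge is exposed as an upper-case attribute.
            setattr(self, phrase.upper(), gauge)

    def update(self, disk_metric_values):
        # Skip the update when the metrics loader reported nothing.
        if disk_metric_values is not None:
            self.CURRENT_DISK_USAGE.set(disk_metric_values["disk_usage_used"])
            self.MAX_DISK_USAGE.set(disk_metric_values["disk_usage_total"])


callback = Callback()
callback.update({"disk_usage_used": 40, "disk_usage_total": 100})
```

With this naming scheme the two new entries would surface as gauges called `max_disk_usage` and `current_disk_usage`.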

0 commit comments
