
Commit bc80d9a

Update docs (#1949)
1 parent 481329c commit bc80d9a

File tree

7 files changed: +44 −62 lines


README.md

Lines changed: 40 additions & 46 deletions
````diff
@@ -6,70 +6,64 @@
 
 <br>
 
-# Model serving at scale
+# Deploy, manage, and scale machine learning models in production
 
-Cortex is a platform for deploying, managing, and scaling machine learning in production.
+Cortex is a cloud native model serving platform for machine learning engineering teams.
 
 <br>
 
-## Key features
+## Use cases
 
-* Run realtime inference, batch inference, and training workloads.
-* Deploy TensorFlow, PyTorch, ONNX, and other models to production.
-* Scale to handle production workloads with server-side batching and request-based autoscaling.
-* Configure rolling updates and live model reloading to update APIs without downtime.
-* Serve models efficiently with multi-model caching and spot / preemptible instances.
-* Stream performance metrics and structured logs to any monitoring tool.
-* Perform A/B tests with configurable traffic splitting.
+* **Realtime machine learning** - build NLP, computer vision, and other APIs and integrate them into any application.
+* **Large-scale inference** - scale realtime or batch inference workloads across hundreds or thousands of instances.
+* **Consistent MLOps workflows** - create streamlined and reproducible MLOps workflows for any machine learning team.
 
 <br>
 
-## How it works
+## Deploy
 
-### Implement a Predictor
+* Deploy TensorFlow, PyTorch, ONNX, and other models using a simple CLI or Python client.
+* Run realtime inference, batch inference, asynchronous inference, and training jobs.
+* Define preprocessing and postprocessing steps in Python and chain workloads seamlessly.
 
-```python
-# predictor.py
+```text
+$ cortex deploy apis.yaml
 
-from transformers import pipeline
+• creating text-generator (realtime API)
+• creating image-classifier (batch API)
+• creating video-analyzer (async API)
 
-class PythonPredictor:
-    def __init__(self, config):
-        self.model = pipeline(task="text-generation")
-
-    def predict(self, payload):
-        return self.model(payload["text"])[0]
-```
-
-### Configure a realtime API
-
-```yaml
-# text_generator.yaml
-
-- name: text-generator
-  kind: RealtimeAPI
-  predictor:
-    type: python
-    path: predictor.py
-  compute:
-    gpu: 1
-    mem: 8Gi
-  autoscaling:
-    min_replicas: 1
-    max_replicas: 10
+all APIs are ready!
 ```
 
-### Deploy
+## Manage
 
-```bash
-$ cortex deploy text_generator.yaml
+* Create A/B tests and shadow pipelines with configurable traffic splitting.
+* Automatically stream logs from every workload to your favorite log management tool.
+* Monitor your workloads with pre-built Grafana dashboards and add your own custom dashboards.
 
-# creating http://example.com/text-generator
+```text
+$ cortex get
 
+API               TYPE      GPUs
+text-generator    realtime  32
+image-classifier  batch     64
+video-analyzer    async     16
 ```
 
-### Serve prediction requests
+## Scale
+
+* Configure workload and cluster autoscaling to efficiently handle large-scale production workloads.
+* Create clusters with different types of instances for different types of workloads.
+* Spend less on cloud infrastructure by letting Cortex manage spot or preemptible instances.
+
+```text
+$ cortex cluster info
 
-```bash
-$ curl http://example.com/text-generator -X POST -H "Content-Type: application/json" -d '{"text": "hello world"}'
+provider: aws
+region: us-east-1
+instance_types: [c5.xlarge, g4dn.xlarge]
+spot_instances: true
+min_instances: 10
+max_instances: 100
 ```
````
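The README diff removes the "How it works" walkthrough, including its `PythonPredictor` example. For reference, the removed predictor interface (a class exposing `__init__(self, config)` and `predict(self, payload)`) can be sketched in a dependency-free form; the stub lambda below stands in for the `transformers` text-generation pipeline used in the original, so this is an illustrative sketch rather than the deployed code:

```python
# Minimal sketch of the predictor interface shown in the removed README section.
# The stub model replaces transformers.pipeline(task="text-generation") so the
# example runs without model downloads or a GPU; the interface shape is the point.

class PythonPredictor:
    def __init__(self, config):
        # config carries values from the API's YAML spec (unused by this stub)
        self.model = lambda text: [{"generated_text": text + " ..."}]

    def predict(self, payload):
        # called with the parsed JSON request body, mirroring the removed example
        return self.model(payload["text"])[0]


if __name__ == "__main__":
    predictor = PythonPredictor(config={})
    print(predictor.predict({"text": "hello world"}))
```

In the removed README, this class was referenced from a `RealtimeAPI` YAML spec via `path: predictor.py` and served behind an HTTP endpoint.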

cli/cmd/root.go

Lines changed: 1 addition & 1 deletion
```diff
@@ -148,7 +148,7 @@ func initTelemetry() {
 var _rootCmd = &cobra.Command{
 	Use:     "cortex",
 	Aliases: []string{"cx"},
-	Short:   "model serving at scale",
+	Short:   "deploy machine learning models to production",
 }
 
 func Execute() {
```

docs/README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -1 +1 @@
-**Please view our documentation at [docs.cortex.dev](https://docs.cortex.dev/)**
+**Please view our documentation at [docs.cortex.dev](https://docs.cortex.dev)**
```

docs/clients/telemetry.md

Lines changed: 0 additions & 11 deletions
This file was deleted.

docs/summary.md

Lines changed: 0 additions & 1 deletion
```diff
@@ -8,7 +8,6 @@
 * [CLI commands](clients/cli.md)
 * [Python API](clients/python.md)
 * [Environments](clients/environments.md)
-* [Telemetry](clients/telemetry.md)
 * [Uninstall](clients/uninstall.md)
 
 ## Workloads
```

pkg/cortex/client/README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -1 +1 @@
-Model serving at scale - [docs.cortex.dev](https://www.docs.cortex.dev)
+Deploy machine learning models to production - [docs.cortex.dev](https://www.docs.cortex.dev)
```

pkg/cortex/client/setup.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -77,7 +77,7 @@ def run(self):
 setup(
     name="cortex",
     version="master",  # CORTEX_VERSION
-    description="Model serving at scale",
+    description="Deploy machine learning models to production",
     author="cortex.dev",
     author_email="[email protected]",
     license="Apache License 2.0",
```
