Commit 81f68cc

Update README.md to reflect Python client (#1595)
1 parent ac82924 commit 81f68cc

File tree

1 file changed: README.md (55 additions, 63 deletions)
````diff
@@ -5,8 +5,7 @@

 <!-- Delete on release branches -->
 <!-- CORTEX_VERSION_README_MINOR -->
-
-[install](https://docs.cortex.dev/install)[documentation](https://docs.cortex.dev)[examples](https://github.com/cortexlabs/cortex/tree/0.23/examples)[support](https://gitter.im/cortexlabs/cortex)
+[install](https://docs.cortex.dev/install)[documentation](https://docs.cortex.dev)[examples](https://github.com/cortexlabs/cortex/tree/0.23/examples)[community](https://gitter.im/cortexlabs/cortex)

 # Deploy machine learning models to production

````
````diff
@@ -16,49 +15,45 @@ Cortex is an open source platform for deploying, managing, and scaling machine l

 ## Model serving infrastructure

-* Supports deploying TensorFlow, PyTorch, sklearn and other models as realtime or batch APIs
-* Ensures high availability with availability zones and automated instance restarts
-* Scales to handle production workloads with request-based autoscaling
-* Runs inference on spot instances with on-demand backups
-* Manages traffic splitting for A/B testing
+* Supports deploying TensorFlow, PyTorch, sklearn and other models as realtime or batch APIs.
+* Ensures high availability with availability zones and automated instance restarts.
+* Runs inference on spot instances with on-demand backups.
+* Autoscales to handle production workloads.

-#### Configure your cluster:
+#### Configure Cortex

 ```yaml
 # cluster.yaml

 region: us-east-1
-availability_zones: [us-east-1a, us-east-1b]
-api_gateway: public
 instance_type: g4dn.xlarge
 min_instances: 10
 max_instances: 100
 spot: true
 ```

-#### Spin up your cluster on your AWS account:
+#### Spin up Cortex on your AWS account

 ```text
 $ cortex cluster up --config cluster.yaml

 ○ configuring autoscaling ✓
 ○ configuring networking ✓
 ○ configuring logging ✓
-○ configuring metrics dashboard ✓

 cortex is ready!
 ```

 <br>

-## Reproducible model deployments
+## Reproducible deployments

-* Implement request handling in Python
-* Customize compute, autoscaling, and networking for each API
-* Package dependencies, code, and configuration for reproducible deployments
-* Test locally before deploying to your cluster
+* Package dependencies, code, and configuration for reproducible deployments.
+* Configure compute, autoscaling, and networking for each API.
+* Integrate with your data science platform or CI/CD system.
+* Test locally before deploying to your cluster.

-#### Implement a predictor:
+#### Implement a predictor

 ```python
 # predictor.py
````
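The diff shows only the final line of `predictor.py` (`return self.model(payload["text"])[0]`). A minimal runnable sketch of the Predictor interface that line implies is below; everything except the `predict` return line is an assumption, and the lambda is a stand-in for the real text-generation model the README loads:

```python
# predictor.py (illustrative sketch; the diff elides the class body)

class PythonPredictor:
    def __init__(self, config):
        # stand-in for loading a real model; the actual README
        # initializes a text-generation model here
        self.model = lambda text: [text + " production"]

    def predict(self, payload):
        # the one line visible in the diff:
        return self.model(payload["text"])[0]

predictor = PythonPredictor(config={})
print(predictor.predict({"text": "deploy machine learning models to"}))
```

The stand-in preserves the shape the visible line depends on: `self.model(...)` returns a list, and `predict` unwraps the first element.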
````diff
@@ -73,38 +68,29 @@ class PythonPredictor:
         return self.model(payload["text"])[0]
 ```

-#### Configure an API:
-
-```yaml
-# cortex.yaml
-
-name: text-generator
-kind: RealtimeAPI
-predictor:
-  path: predictor.py
-compute:
-  gpu: 1
-  mem: 4Gi
-autoscaling:
-  min_replicas: 1
-  max_replicas: 10
-networking:
-  api_gateway: public
-```
-
-#### Deploy to production:
-
-```text
-$ cortex deploy cortex.yaml
-
-creating https://example.com/text-generator
+#### Configure an API

-$ curl https://example.com/text-generator \
-  -X POST -H "Content-Type: application/json" \
-  -d '{"text": "deploy machine learning models to"}'
-
-"deploy machine learning models to production"
+```python
+api_spec = {
+    "name": "text-generator",
+    "kind": "RealtimeAPI",
+    "predictor": {
+        "type": "python",
+        "path": "predictor.py"
+    },
+    "compute": {
+        "gpu": 1,
+        "mem": "8Gi",
+    },
+    "autoscaling": {
+        "min_replicas": 1,
+        "max_replicas": 10
+    },
+    "networking": {
+        "api_gateway": "public"
+    }
+}
 ```

 <br>

````
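The commit replaces the declarative `cortex.yaml` with a Python dict. One practical upshot, sketched below with the dict values copied from the diff: the spec is plain data, so it can be built, templated, and sanity-checked in code before being handed to the client:

```python
import json

# api_spec as the new README defines it (values copied from the diff)
api_spec = {
    "name": "text-generator",
    "kind": "RealtimeAPI",
    "predictor": {"type": "python", "path": "predictor.py"},
    "compute": {"gpu": 1, "mem": "8Gi"},
    "autoscaling": {"min_replicas": 1, "max_replicas": 10},
    "networking": {"api_gateway": "public"},
}

# being a plain dict, the spec can be validated programmatically
assert api_spec["autoscaling"]["min_replicas"] <= api_spec["autoscaling"]["max_replicas"]
print(json.dumps(api_spec, indent=2))
```

This kind of pre-deploy check was not possible with a static YAML file without an extra parsing step.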
````diff
@@ -111,32 +97,38 @@

-## API management
+## Scalable machine learning APIs

-* Monitor API performance
-* Aggregate and stream logs
-* Customize prediction tracking
-* Update APIs without downtime
+* Scale to handle production workloads with request-based autoscaling.
+* Stream performance metrics and logs to any monitoring tool.
+* Serve many models efficiently with multi model caching.
+* Configure traffic splitting for A/B testing.
+* Update APIs without downtime.

-#### Manage your APIs:
+#### Deploy to your cluster

-```text
-$ cortex get
+```python
+import cortex

-realtime api      status   replicas   last update   latency   requests
+cx = cortex.client("aws")
+cx.deploy(api_spec, project_dir=".")

-text-generator    live     34         9h            247ms     71828
-object-detector   live     13         15h           23ms      828459
+# creating https://example.com/text-generator
+```

+#### Consume your API

-batch api          running jobs   last update
+```python
+import requests

-image-classifier   5              10h
+endpoint = "https://example.com/text-generator"
+payload = {"text": "hello world"}
+prediction = requests.post(endpoint, payload)
 ```

 <br>

 ## Get started

-```text
-$ pip install cortex
+```bash
+pip install cortex
 ```

 See the [installation guide](https://docs.cortex.dev/install) for next steps.
````
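A caveat on the new "Consume your API" snippet: `requests.post(endpoint, payload)` passes the dict positionally as `data=`, which form-encodes it, while the predictor reads `payload["text"]` from a JSON body (the old curl example sent `Content-Type: application/json`). A stdlib sketch of building the JSON request correctly, without sending it (the endpoint is the placeholder from the diff):

```python
import json
import urllib.request

endpoint = "https://example.com/text-generator"  # placeholder from the diff
payload = {"text": "hello world"}

# serialize the payload as JSON, as the old curl example did;
# with the requests library, requests.post(endpoint, json=payload)
# has the same effect
body = json.dumps(payload).encode()
req = urllib.request.Request(
    endpoint, data=body, headers={"Content-Type": "application/json"}
)
print(req.get_header("Content-type"))  # application/json
```

The request object is only constructed here, not sent, so the sketch runs without a live cluster.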
