<!-- Delete on release branches -->
<!-- CORTEX_VERSION_README_MINOR -->

[install](https://docs.cortex.dev/install) • [documentation](https://docs.cortex.dev) • [examples](https://github.com/cortexlabs/cortex/tree/0.23/examples) • [community](https://gitter.im/cortexlabs/cortex)

# Deploy machine learning models to production

Cortex is an open source platform for deploying, managing, and scaling machine learning in production.

## Model serving infrastructure

* Supports deploying TensorFlow, PyTorch, sklearn and other models as realtime or batch APIs.
* Ensures high availability with availability zones and automated instance restarts.
* Runs inference on spot instances with on-demand backups.
* Autoscales to handle production workloads.

#### Configure Cortex

```yaml
# cluster.yaml

region: us-east-1
instance_type: g4dn.xlarge
min_instances: 10
max_instances: 100
spot: true
```

#### Spin up Cortex on your AWS account

```text
$ cortex cluster up --config cluster.yaml

○ configuring autoscaling ✓
○ configuring networking ✓
○ configuring logging ✓

cortex is ready!
```

<br>

## Reproducible deployments

* Package dependencies, code, and configuration for reproducible deployments.
* Configure compute, autoscaling, and networking for each API.
* Integrate with your data science platform or CI/CD system.
* Test locally before deploying to your cluster.

#### Implement a predictor

```python
# predictor.py

class PythonPredictor:
    ...

    def predict(self, payload):
        return self.model(payload["text"])[0]
```
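
The excerpt above elides most of the class. For illustration, here is a minimal, self-contained predictor with the same shape: an `__init__` that receives the API's configuration and a `predict` that receives the parsed request payload. The echo logic is a placeholder, not a real model:

```python
class PythonPredictor:
    def __init__(self, config):
        # config holds the values from the predictor section of the API spec
        self.config = config

    def predict(self, payload):
        # payload is the parsed JSON body of the request
        return payload["text"].upper()


# the class can be exercised locally, without a cluster
predictor = PythonPredictor(config={})
print(predictor.predict({"text": "hello"}))  # prints "HELLO"
```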

#### Configure an API

```python
api_spec = {
    "name": "text-generator",
    "kind": "RealtimeAPI",
    "predictor": {
        "type": "python",
        "path": "predictor.py"
    },
    "compute": {
        "gpu": 1,
        "mem": "8Gi"
    },
    "autoscaling": {
        "min_replicas": 1,
        "max_replicas": 10
    },
    "networking": {
        "api_gateway": "public"
    }
}
```
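
Because the spec is an ordinary Python dict, it can be generated or sanity-checked in code before deploying. The helper below is purely illustrative and not part of the Cortex API:

```python
def validate_spec(spec):
    """Illustrative sanity check for an API spec; not part of the Cortex API."""
    required = {"name", "kind", "predictor"}
    missing = required - spec.keys()
    if missing:
        raise ValueError(f"API spec is missing fields: {sorted(missing)}")
    return True


# works with the api_spec defined above; shown here with a minimal example
assert validate_spec({
    "name": "text-generator",
    "kind": "RealtimeAPI",
    "predictor": {"type": "python", "path": "predictor.py"},
})
```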

<br>

## Scalable machine learning APIs

* Scale to handle production workloads with request-based autoscaling.
* Stream performance metrics and logs to any monitoring tool.
* Serve many models efficiently with multi-model caching.
* Configure traffic splitting for A/B testing.
* Update APIs without downtime.
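
As a sketch of the traffic-splitting bullet above, a splitter can be expressed in the same dict style as the RealtimeAPI spec; the field names below are assumptions for illustration, not copied from the Cortex docs:

```python
# Hypothetical traffic-splitting spec for A/B testing two versions of an API;
# the field names mirror the RealtimeAPI spec style and are assumptions.
splitter_spec = {
    "name": "text-generator",
    "kind": "TrafficSplitter",
    "apis": [
        {"name": "text-generator-a", "weight": 90},
        {"name": "text-generator-b", "weight": 10},
    ],
}

# weights are percentages and should total 100
assert sum(api["weight"] for api in splitter_spec["apis"]) == 100
```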

#### Deploy to your cluster

```python
import cortex

cx = cortex.client("aws")
cx.deploy(api_spec, project_dir=".")

# creating https://example.com/text-generator
```

#### Consume your API

```python
import requests

endpoint = "https://example.com/text-generator"
payload = {"text": "hello world"}
response = requests.post(endpoint, json=payload)
prediction = response.json()
```

<br>

## Get started

```bash
pip install cortex
```

See the [installation guide](https://docs.cortex.dev/install) for next steps.