Initial port of TD Tutorial #1061

Merged
merged 19 commits into from
May 22, 2019
224 changes: 224 additions & 0 deletions docs/tutorials/td-tutorial.md
@@ -0,0 +1,224 @@
# Send data securely to Arm Treasure Data

There are two ways to send data securely from Mbed OS to Treasure Data:

- HTTPS library - Send data directly to the Treasure Data REST API.
- fluentd using fluent logger library - Send data to a hosted fluentd instance that aggregates and forwards the data on to your Treasure Data account.

ah ha this is how this works---I think Fluentd should be init capped

Contributor Author

@BlackstoneEngineering BlackstoneEngineering May 11, 2019

I was told by the inventor that its always lowercase, see the website : https://www.fluentd.org/ .

I defer to your expertise.


Both libraries are secured with Arm Mbed TLS in transit and are equally secure. We recommend the HTTPS library for development and the fluentd library for production. The tradeoff between the two is size of code on chip, size of data in transit and setup complexity:

- Code size on chip - The HTTPS library uses ~50KB of ROM space on chip because of the HTTP stack. Both libraries use Mbed TLS to secure their connections, which costs ~7KB of stack per connection.

on chip, this is due ....

- Data size in transit - The HTTPS library sends data as an ASCII JSON string. The fluentd library uses MessagePack (binary-encoded JSON) across a TLS connection. This means that, on average, the fluentd library uses less bandwidth to send an equivalent message. This matters when you pay for every byte transmitted, both in your power budget and in your data plan.

Data size in transit - The HTTPS library sends data as an ASCII JSON string. The Fluend library uses MessagePack (binary encoded JSON) across a TLS connection. This means that on average the Fluentd library uses less bandwidth to send an equivalent message. When you pay per byte transmitted from both your power budget and data plan it matters.

- Maintenance - Initially, it may be simpler to set up the HTTPS library on a device and have it send data directly to Treasure Data, but what if you want to change what the device is doing or how its data is reported? If you are using the HTTPS library, you need to issue a firmware update to every device to change how it formats its data. If you are using a fluentd server, you can modify the fluentd config file on the server to change how data is formatted and processed.

Maintenance - Initially, setting up the HTTPS library on a device and having it send data directly to Treasure Data is easier, but what if you want to change what the device is doing or how its data is reported? If you are using the HTTPS library you must issue a firmware update to every device to change how it formats its data. However, if you are using a Fluend server, you can modify the Fluentd config file on the server to change how data is formatted and processed.
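
To make the data-size tradeoff concrete, here is a minimal Python sketch that encodes the CPU statistics sample from the HTTPS example output both as compact JSON and as MessagePack. The `pack` helper is a hypothetical, hand-written encoder that covers only the MessagePack types this sample needs (non-negative integers, short strings, small maps). It is not the encoder the fluent logger library uses.

```python
import json
import struct


def pack(value):
    """Encode a small MessagePack subset: non-negative ints, short strings, small maps."""
    if isinstance(value, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("negative integers are outside this sketch")
        if value <= 0x7F:
            return struct.pack("B", value)          # positive fixint
        if value <= 0xFF:
            return struct.pack(">BB", 0xCC, value)  # uint 8
        if value <= 0xFFFF:
            return struct.pack(">BH", 0xCD, value)  # uint 16
        if value <= 0xFFFFFFFF:
            return struct.pack(">BI", 0xCE, value)  # uint 32
        return struct.pack(">BQ", 0xCF, value)      # uint 64
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) > 31:
            raise ValueError("only fixstr (up to 31 bytes) is supported")
        return bytes([0xA0 | len(raw)]) + raw       # fixstr
    if isinstance(value, dict):
        if len(value) > 15:
            raise ValueError("only fixmap (up to 15 entries) is supported")
        out = bytes([0x80 | len(value)])            # fixmap
        for key, item in value.items():
            out += pack(key) + pack(item)
        return out
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Sample payload taken from the serial output of the HTTPS example.
cpu_sample = {"uptime": 6918609, "idle_time": 0, "sleep_time": 509277, "deep_sleep_time": 0}

as_json = json.dumps(cpu_sample, separators=(",", ":")).encode("ascii")
as_msgpack = pack(cpu_sample)

print(f"JSON: {len(as_json)} bytes, MessagePack: {len(as_msgpack)} bytes")
```

The saving depends on the payload. Field names are sent in full by both encodings, so short keys help in either case.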


The following steps show how to send data using first the HTTPS library and then using fluentd.

## HTTPS library

To use the HTTPS library, use this program: https://github.com/blackstoneengineering/mbed-os-example-treasuredata-rest. This program turns on Mbed OS device statistics by enabling the `MBED_ALL_STATS_ENABLED` macro and then sends heap, CPU, stack and system information to Treasure Data.

Contributor

Please move this example into an Arm repository.

Contributor Author

No one has committed to long term maintenance of this library, so for the moment I would like to keep it in my personal repo space.

@janjongboom - what do you think?

Contributor

Please move it to an Arm account. You can maintain it just as well from there.

...and then sends heap.....

Contributor Author

'and then sending heap....'


https://www.youtube.com/watch?v=_tqD6GLMHQA

### Import code

You can compile the program through any of our three development tools:

You can compile the program using any of the following development tools:


- [Arm Mbed Studio](https://os.mbed.com/studio/).
- Arm Online Compiler - `ide.mbed.com/compiler?import=https://github.com/blackstoneengineering/mbed-os-example-treasuredata-rest`
- Arm Mbed CLI (offline) - `mbed import https://github.com/blackstoneengineering/mbed-os-example-treasuredata-rest`

### Setup variables

1. Configure the Treasure Data API key in `mbed_app.json` by changing the `api-key` variable:

```
"api-key":{
    "help": "REST API Key for Treasure Data",
    "value": "\"REPLACE_WITH_YOUR_KEY\""
},
```

1. Wi-Fi credentials (optional): If you're using Wi-Fi, add your SSID/password. If you are using ethernet, you do not need to add Wi-Fi credentials.

1. Create a database called `test_database` in Treasure Data.
<span class="notes">**Note:** The tables are created automatically.</span>
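
The escaped quotes in the `api-key` value are easy to get wrong. Mbed OS turns configuration values into C preprocessor macros, so a string value has to carry its own quotes. The Python sketch below parses a fragment like the one in step 1 and prints the macro it roughly corresponds to. The macro name follows the usual `MBED_CONF_APP_<NAME>` convention, and the `config` wrapper object is an assumption about where the fragment sits in `mbed_app.json`. Check the example's own file for the exact layout.

```python
import json

# Assumed layout: the "api-key" entry sits under a top-level "config" object.
fragment = r'''
{
    "config": {
        "api-key": {
            "help": "REST API Key for Treasure Data",
            "value": "\"REPLACE_WITH_YOUR_KEY\""
        }
    }
}
'''


def macro_for(name, value):
    """Render a config entry as the C macro it roughly corresponds to."""
    macro_name = "MBED_CONF_APP_" + name.upper().replace("-", "_")
    return f"#define {macro_name} {value}"


entry = json.loads(fragment)["config"]["api-key"]
define_line = macro_for("api-key", entry["value"])
print(define_line)
```

Without the inner `\"` pair, the macro would expand to a bare token instead of a C string literal, and the firmware would most likely fail to compile.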

### Compile and load

Next, you can compile and load your code onto your board. If you are unfamiliar with how to compile and load code, please look at the Mbed OS quick start tutorial.

Once you have compiled your code and loaded it onto your board, open a serial terminal, and connect it to the board. View the output:

After you have...

Contributor Author

?? not sure what this means?

Contributor

I changed it to "After you have compiled your code ...."


```
--- Terminal on /dev/tty.usbmodem146103 - 9600,8,N,1 ---
Treasure Data REST API Demo
Connecting to the network using the default network interface...
Connected to the network successfully. IP address: 192.168.43.202
Success

MAC: C4:7F:51:02:D9:5D
IP: 192.168.43.202
Netmask: 255.255.255.0
Gateway: 192.168.43.249

Sending CPU Data: '{"uptime":6918609,"idle_time":0,"sleep_time":509277,"deep_sleep_time":0}'

Sending Heap Data: '{"current_size":15260,"max_size":75334,"total_size":747954,"reserved_size":307232,"alloc_cnt":12,"alloc_fail_cnt":0}'

Sending Stack Data: '{"thread_id":0,"max_size":4820,"reserved_size":12632,"stack_cnt":4}'

Sending System Data: '{"os_version":51104,"cpu_id":1091551809,"compiler_id":2,"compiler_version":60300}'

```
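
If you capture the serial output to a file, you can check the payloads before looking for them in Treasure Data. The Python sketch below pulls the JSON payloads out of lines shaped like the sample output above. The `parse_payloads` helper and its regular expression are illustrative assumptions based only on that sample. They are not part of the example program.

```python
import json
import re

# Matches lines such as: Sending CPU Data: '{"uptime":6918609,...}'
PAYLOAD_LINE = re.compile(r"^Sending (\w+) Data: '(\{.*\})'\s*$")


def parse_payloads(log_text):
    """Return {kind: payload dict} for every 'Sending <kind> Data' line in the log."""
    payloads = {}
    for line in log_text.splitlines():
        match = PAYLOAD_LINE.match(line.strip())
        if match:
            payloads[match.group(1).lower()] = json.loads(match.group(2))
    return payloads


sample_log = """
Sending CPU Data: '{"uptime":6918609,"idle_time":0,"sleep_time":509277,"deep_sleep_time":0}'

Sending Stack Data: '{"thread_id":0,"max_size":4820,"reserved_size":12632,"stack_cnt":4}'
"""

for kind, payload in parse_payloads(sample_log).items():
    print(kind, payload)
```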

### Verify data in Treasure Data

Go to the [Database list in Treasure Data](https://console.treasuredata.com/app/databases), and open the `test_database` you created earlier. You can see the data from the board in the database. There is a 3- to 5-minute delay between when the data is sent to the database and when the visualization system lets you see it, so wait for it to arrive and refresh the page.

...Treasure Data ...

Contributor Author

again, not sure what this comment means

Contributor

I capitalized "Treasure Data" as requested by megmiranda.


<span class="notes">**Note:** The database tab shows how much data you have in the database and gives a few samples, but it does not show all your data. For that, you need to run queries.</span>

### Run queries

Now that you have data in Treasure Data, it's time to analyze and use the data.

1. Go to the [Queries tab](https://console.treasuredata.com/app/queries/editor).
2. Select the `test_database`, and run some queries. To learn more about how to run queries, please read the [Treasure Data documentation](https://support.treasuredata.com/hc/en-us/articles/360001457427-Presto-Query-Engine-Introduction).

How about this article as an intro to running queries in TD . https://support.treasuredata.com/hc/en-us/articles/360007995693

Contributor Author

Sure, add that too. I took the article that was recommended to me by the TD folks


#### Select all fields

Run `select * from cpu_info` to get a full list of all fields in the table.

#### Select certain fields, order by time

This query selects only certain columns from the table and orders them by the time field in ascending order. You can also replace `asc` with `desc` to reverse the order.

....and orders them.... value. You can also ....


```
select time, current_size, total_size, alloc_cnt, max_size, reserved_size, alloc_fail_cnt from heap_info
order by time asc;
```
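
To try the shape of these queries without a Treasure Data account, the sketch below runs them against an in-memory SQLite table from Python. SQLite stands in for the Treasure Data query engine here, and the two rows and their `time` values are made up for illustration. Only the column names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table heap_info (time integer, current_size integer, total_size integer, "
    "alloc_cnt integer, max_size integer, reserved_size integer, alloc_fail_cnt integer)"
)
# Made-up sample rows, inserted out of order on purpose.
conn.executemany(
    "insert into heap_info values (?, ?, ?, ?, ?, ?, ?)",
    [
        (1557600060, 15300, 748100, 13, 75334, 307232, 0),
        (1557600000, 15260, 747954, 12, 75334, 307232, 0),
    ],
)

query = (
    "select time, current_size, total_size, alloc_cnt, max_size, reserved_size, "
    "alloc_fail_cnt from heap_info order by time {direction}"
)
oldest_first = conn.execute(query.format(direction="asc")).fetchall()
newest_first = conn.execute(query.format(direction="desc")).fetchall()

print(oldest_first[0][0], newest_first[0][0])
```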

### Troubleshooting

If you experience issues, ensure you have at least 10KB of space left on your stack. You can also change the `TD_DEBUG` macro to `true` to turn on the Treasure Data debug printfs.

## fluentd

Contributor

Query: Should this be capitalized?

Contributor Author

should what be capitalized? The stuff in should be just as shown as they are copied directly from the config file. so TD_DEBUG should be exactly as it is.


For mass deployments, we recommend you use fluentd or Fluent Bit to aggregate and forward the data into Treasure Data. The setup instructions differ slightly depending on where you host your fluentd instance: on localhost on your machine with self-signed certificates, or at a public IP address in the cloud with Certificate Authority (CA) signed certificates. This example uses MessagePack (binary-encoded JSON) to encode the data.

<INSERT YOUTUBE VIDEO FOR FLUENTD HERE: COMING SOON>

Contributor

Query: When should we expect this?

Contributor Author

it has been completed.

Contributor Author

still waiting for y'all to review the videos and give approval before I upload them to the armmbed channel


### Set up fluentd

#### Install

First, install fluentd. Please see the [fluentd quick start](https://docs.fluentd.org/v1.0/articles/quickstart) for details.

Experienced users can use `gem install fluentd fluent-plugin-td`.

#### Download example code

Download the [example code](https://github.com/BlackstoneEngineering/mbed-os-example-fluentlogger). This repository contains both the embedded example code and the fluentd configuration files.

#### Set configuration file

Run fluentd using the provided configuration file `fluentd --config ./fluentd-setup/fluentd.conf -vv`. This file opens two ports, port 24227 for unencrypted TCP traffic and port 24228 for TLS encrypted traffic. The configuration is provided for reference. We strongly suggest using TLS encryption on port 24228 to secure your data in transit.

You can either run fluentd on a public IP address with CA-signed certificates (suggested for deployments), or locally on your machine using self-signed certificates (recommended for prototyping and testing).

##### Signed by CA, running in cloud

If you have valid certificates from a CA, replace the `fluentd.crt` and `fluentd.key` files with the CA certificates. Then uncomment the lines in the `fluentd.conf` file for CA-trusted certificates, comment out the lines for self-signed certificates and change the passphrase to match your certificate:

```
# cert_path ~/mbed-os-example-fluentlogger/fluentd-setup/fluentd.crt
# private_key_path ~/mbed-os-example-fluentlogger/fluentd-setup/fluentd.key
# private_key_passphrase YOUR_PASSPHRASE
```

##### Self-signed certificates on localhost

https://youtu.be/elB22i4y1yU

If you are running the fluentd server locally on your machine to develop a proof of concept (PoC), you need to generate a new self-signed certificate where the Common Name (CN) is the IP address of your machine and modify the fluentd.conf file with the IP address of your machine. Each time you restart the fluentd instance, it generates a new certificate that you need to copy and paste into your embedded code.

1. Change the `generate_cert_common_name` parameter in `fluentd.conf` to be the IP address of the computer running the fluentd server.
1. Run `openssl req -new -x509 -sha256 -days 1095 -newkey rsa:2048 -keyout fluentd.key -out fluentd.crt` to generate new certificates. When entering the prompted values, make sure to match the parameters in the `fluentd.conf` file (US, CA, Mountain View and so on). **Make sure the CN field is set to the IP address of the fluentd server**.

For example:

```
Country Name (2 letter code) []:US
State or Province Name (full name) []:CA
Locality Name (eg, city) []:Mountain View
Organization Name (eg, company) []:
Organizational Unit Name (eg, section) []:
Common Name (eg, fully qualified host name) []:192.168.1.85
Email Address []:
```

### Mbed OS setup

Run the example code on your device. You can either [import to the Mbed Online Compiler](http://os.mbed.com/compiler/?import=https%3A%2F%2Fgithub.com%2FBlackstoneEngineering%2Fmbed-os-example-fluentlogger) or use Mbed CLI to clone it locally, compile and load it to the board:

```shell
$ mbed import https://github.com/BlackstoneEngineering/mbed-os-example-fluentlogger
$ mbed compile --target auto --toolchain GCC_ARM --flash --sterm
```

#### Secure (TLS)

To send data to fluentd over TLS (securely):

1. Run `openssl s_client -connect localhost:24228 -showcerts`.
1. Copy the certificate to `fluentd-sslcert.h`. If you are running the fluentd server on localhost, this certificate changes every time you restart fluentd. You need to rerun this command and recompile your embedded code every time you restart fluentd.

...You need to rerun...

Contributor Author

make the change

Contributor

@BlackstoneEngineering , @megmiranda is our writer from Arm Treasure Data. I told her she could make the changes in comments, and I've already made these changes directly.

Contributor Author

Understood. Its hard for me to tell what is being referenced as such a wide swath is highlighted in the github comment tool. I'll leave it to y'all

1. Modify the call in `main.cpp` to the FluentLogger object.
1. Change the IP address to the IP address of the fluentd server, or if you are hosting it in the cloud, change it to the web address where it is hosted. **It is important that the IP address in the `main.cpp` file matches the IP address set in the CN field of the fluentd server certificate. Otherwise, the connection fails because Mbed TLS uses strict CN verification.**
1. Compile the code and load it onto your board.
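
Copying the certificate by hand is error-prone, so here is a minimal Python sketch of that step. It extracts the first PEM certificate from text in the format that `openssl s_client -showcerts` prints and formats it as a C string literal. The helper names and the `SSL_CA_PEM` variable name are hypothetical. Use whatever name `fluentd-sslcert.h` in the example repository actually declares. The certificate body in the sample is a placeholder, not a real certificate.

```python
import re

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def first_certificate(s_client_output):
    """Return the first PEM certificate block found in the given text."""
    match = PEM_BLOCK.search(s_client_output)
    if match is None:
        raise ValueError("no PEM certificate found")
    return match.group(0)


def to_c_literal(pem, name="SSL_CA_PEM"):
    """Format a PEM block as a C string constant, one quoted line per PEM line."""
    quoted = [f'    "{line}\\n"' for line in pem.splitlines()]
    return f"const char {name}[] =\n" + "\n".join(quoted) + ";\n"


# Placeholder text shaped like s_client output; the base64 lines are not a real certificate.
sample_output = """CONNECTED(00000003)
---
Certificate chain
-----BEGIN CERTIFICATE-----
PLACEHOLDERBASE64LINEONE
PLACEHOLDERBASE64LINETWO
-----END CERTIFICATE-----
---
"""

header_text = to_c_literal(first_certificate(sample_output))
print(header_text)
```

Because the self-signed certificate changes whenever fluentd restarts on localhost, a script like this would need to run again before each rebuild.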

### Success

Successful output on the fluentd terminal:

```sterm
-0500 debug.test: ["sint",0,1,-1,-128,-32768,-2147483648]
-0500 [trace]: #0 fluent/log.rb:281:trace: connected fluent socket addr="192.168.1.95" port=5522
-0500 [trace]: #0 fluent/log.rb:281:trace: accepted fluent socket addr="192.168.1.95" port=5522
-0500 debug.test: ["uint",0,1,128,255,65535,4294967295]
-0500 [trace]: #0 fluent/log.rb:281:trace: connected fluent socket addr="192.168.1.95" port=5523
-0500 [trace]: #0 fluent/log.rb:281:trace: accepted fluent socket addr="192.168.1.95" port=5523
-0500 [trace]: #0 fluent/log.rb:281:trace: enqueueing all chunks in buffer instance=70248976563020
-0500 debug.test: {"string":"Hi!","float":0.3333333432674408,"double":0.3333333333333333}
-0500 [trace]: #0 fluent/log.rb:281:trace: connected fluent socket addr="192.168.1.95" port=5524
-0500 [trace]: #0 fluent/log.rb:281:trace: accepted fluent socket addr="192.168.1.95" port=5524
-0500 debug.test: {"string":"Hi!","float":0.3333333432674408,"double":0.3333333333333333}
-0500 [trace]: #0 fluent/log.rb:281:trace: connected fluent socket addr="192.168.1.95" port=5525
-0500 [trace]: #0 fluent/log.rb:281:trace: accepted fluent socket addr="192.168.1.95" port=5525
-0500 [trace]: #0 fluent/log.rb:281:trace: adding metadata instance=70248976563020 metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=nil, tag="td.fluentd_database.test", variables=nil>
-0500 [trace]: #0 fluent/log.rb:281:trace: writing events into buffer instance=70248976563020 metadata_size=1
-0500 [debug]: #0 fluent/log.rb:302:debug: Created new chunk chunk_id="585c249fd2ebe20867267de2fde7c4bc" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=nil, tag="td.fluentd_database.test", variables=nil>
-0500 [trace]: #0 fluent/log.rb:281:trace: connected fluent socket addr="192.168.1.95" port=5526
-0500 [trace]: #0 fluent/log.rb:281:trace: accepted fluent socket addr="192.168.1.95" port=5526
-0500 debug.test: {"string":"Hi!","float":0.3333333432674408,"double":0.3333333333333333}

```

### Configure

Contributor

Are the two sections within this part of configuration, or should we change or remove this heading?

Contributor Author

I dont understand the question, mind chatting with me about this monday? I cant seem to get the context correct in my head

Contributor

Note to self: Remove h3 configure and replace with h3 setting Treasure Data databases and tables


#### Setting Treasure Data databases and tables

The second field in the tag of your embedded code determines the database. For example, sending data to a tag called `td.mydatabase.mytable` logs the data to the database called `mydatabase` in the table `mytable`. You can modify the example configuration file to see this.
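
As an illustration of the tag convention described above, the Python sketch below splits a tag into its database and table parts. The `split_td_tag` helper is hypothetical and only mirrors the rule stated here. The actual routing happens in the fluentd configuration, not in code like this.

```python
def split_td_tag(tag):
    """Split a tag such as 'td.mydatabase.mytable' into (database, table)."""
    parts = tag.split(".")
    if len(parts) != 3 or parts[0] != "td" or not all(parts):
        raise ValueError(f"expected a tag of the form td.<database>.<table>, got {tag!r}")
    return parts[1], parts[2]


# The tag that appears in the fluentd trace output shown earlier.
database, table = split_td_tag("td.fluentd_database.test")
print(database, table)
```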

Contributor

This is a little free of context unless you're already a TD user, I think

Contributor

This one I can't fix on my own, since I'm not a TD user. @megmiranda can you offer the readers more context?


### Debugging

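For more verbose debug messages, turn on the following flags in `mbed_app.json`. In a typical Mbed OS configuration, the macro goes in the `macros` list and the trace setting goes under `target_overrides`:

```json
{
    "macros": ["MBEDTLS_SSL_DEBUG_ALL=1"],
    "target_overrides": {
        "*": { "mbed-trace.enable": true }
    }
}
```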