(DOCSP-1271, DOCS-10405): Updated Production Notes. #3233
- A setting of ``100`` tells it to swap aggressively to disk.

If your host runs kernel versions ``3.5`` or later, or ``2.6.32-303``
or later, setting this value to ``0`` could disable swapping. Set this
I don't understand this qualification. Why not just say "kernel version 2.6.32-303 or later"? Are there some versions between 2.6.32-303 and 3.5 to which this doesn't apply?
I clarified.
If your host runs kernel versions ``3.5`` or later, or ``2.6.32-303``
or later, setting this value to ``0`` could disable swapping. Set this
to ``1``.
The section title says "at least 1", so maybe we should say that here as well.
Fixed.
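For reference, here is a minimal sketch of the version check implied by the note under review. It assumes that the ``2.6.32-303`` threshold applies only to the 2.6.32 series with a numeric build suffix, which is an interpretation of the wording and not something the diff states. The function name `swappiness_zero_may_disable_swap` is hypothetical.

```python
import re


def swappiness_zero_may_disable_swap(release: str) -> bool:
    """Return True if, per the note above, vm.swappiness=0 could disable swapping.

    `release` is a kernel release string such as "3.10.0-957.el7.x86_64".
    """
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?(?:-(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # Missing components (for example the patch level in "3.5") count as 0.
    major, minor, patch, build = (int(g) if g else 0 for g in m.groups())

    # Kernels 3.5 and later.
    if (major, minor) >= (3, 5):
        return True

    # ASSUMPTION: "2.6.32-303 or later" means the 2.6.32 series with a
    # build number of 303 or higher, not every kernel newer than 2.6.32.
    return (major, minor, patch) == (2, 6, 32) and build >= 303
```

On a Linux host the release string could come from `platform.release()`. Under this reading, a kernel such as 3.2 falls between the two thresholds and is not affected, which is why the note lists both versions.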
To learn more about Enhanced Networking, see to the
`AWS documentation <http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html#enabling_enhanced_networking>`_.

- Use provisioned IOPS for the storage, with separate devices for
As long as you're using the :abbr: directive for other abbreviations around here, might as well add it for IOPS too.
Fixed.
journal and data. Do not use the ephemeral storage available on some
instance types as their performance changes moment to moment.

- Disable hyperthreading,
The suggestions coming from the perf team here are specifically aimed at getting reproducible performance tests. They are not aimed at getting maximum performance, so they aren't for everybody. If a user cares about consistency of performance over raw performance (for benchmarking, or just predictability), they should do this. If they care about performance more than performance reproducibility, they should not pin to one socket and should not disable hyperthreading.
To some extent that applies to the ephemeral storage too. I don't know whether ephemeral storage is better or worse for someone concerned only with performance, but someone who wants reproducibility should use provisioned IOPS.
I clarified this.
instance types as their performance changes moment to moment.

- Disable hyperthreading,
:abbr:`DVFS (dynamic voltage and frequency scaling)`,
Are you using the DVFS abbreviation somewhere?
It is staged here. I was looking for "frequency scaling" and only came across this explanation, which was in sync with what was discussed. Did I miss something?
The staging view helped. I see now.
Do you want more details on some of these? There are Linux command-line options for some of this, although they are on the default Linux AMI now. We have notes on the DVFS settings as well.
For C-states, the Linux boot option is processor.max_cstate=1, or sometimes intel_idle.max_cstate=1, depending on kernel version. I found these in my notes, and they helped me at the time:
https://gist.github.com/wmealing/2dd2b543c4d3cff6cab7 and https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html
We tested with c3 and c4 instances. The C-state and P-state controls only worked on the c4 generation, but the c3 ended up being lower noise.
There's also an idle=poll setting with a similar effect. That is baked into the basic AMI image now and we're using it.
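To check which of the boot options mentioned above are active on a given host, something like the following could be used. This is a rough sketch, not the perf team's tooling: the function name `idle_state_boot_options` is hypothetical, and it only looks at the three option names from this comment.

```python
def idle_state_boot_options(cmdline: str) -> dict:
    """Pick the idle/C-state boot options named in this thread out of a
    kernel command line and return them as an {option: value} mapping."""
    wanted = ("processor.max_cstate", "intel_idle.max_cstate", "idle")
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Only keep key=value tokens whose key is one of the three options.
        if sep and key in wanted:
            found[key] = value
    return found


if __name__ == "__main__":
    # /proc/cmdline holds the running kernel's boot parameters on Linux.
    try:
        with open("/proc/cmdline") as f:
            print(idle_state_boot_options(f.read()))
    except OSError:
        print("could not read /proc/cmdline (not a Linux host?)")
```

An empty result only means none of these three options were passed at boot. It says nothing about whether the instance type honors them.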
Tony, I'm out next week, and back the following week. I added in a little more information. I can see if I have anything else later. We can get someone else looking at it next week if needed. Thanks.
Tony, here are some of our reports, which I should have included on the ticket. They are more for background than details: https://docs.google.com/presentation/d/1TBW6MP6N97vYfpKzCcY6jlluNs1ovyfGah9rkL8J83Y/edit?usp=sharing
Henrik's presentation on this work: http://henrikingo.github.io/presentations/Highload%202017%20-%20Measuring%20performance%20variability%20of%20EC2/index.html#/title
@steveren : Could you give this another look?

- Use provisioned :abbr:`IOPS (Input/Output Operations Per Second)` for
the storage, with separate devices for journal and data. Do not use
the ephemeral storage available on some instance types as their
"Some instance types" is vague.
While we didn't spend time going in that direction, for example the i-family of instance types has very good SSDs which should far outperform anything EBS can do. (It's also a very expensive instance type.)
I might move the part about ephemeral storage, and the next part about DVFS, below the "If you are concerned about reproducible performance" part. To put it weakly: I know they apply there, but I don't know that they apply to the performance case (they might).
BTW, I like the split between the two scenarios. That looks good.
@dalyd : Ready for another review.
LGTM
@kay-kim : RFM for |
@steveren : This includes two short adds: AWS EC2 settings and swappiness