-
-
Notifications
You must be signed in to change notification settings - Fork 223
Automatically reboot instance if hung #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
README.md
Outdated
@@ -8,7 +8,7 @@ Note: add `${var.ssh_key_pair}` private key to the `ssh agent`. | |||
|
|||
Include this repository as a module in your existing terraform code: | |||
|
|||
``` | |||
```terraform | |||
module "admin_tier" { | |||
source = "git::https://github.com/cloudposse/tf_instance.git?ref=tags/0.1.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace tags/0.1.0
with master
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add name
, namespace
and stage
variables.
README.md
Outdated
| `subnets` | [] | List of VPC Subnet IDs creating instance launched in | Yes | | ||
| `associate_public_ip_address`| `true` | Associate a public ip address with the creating instance. Boolean value | No | | ||
| Name | Default | Description | Required | | ||
|:----------------------------:|:--------------------------------------------:|:--------------------------------------------------------------------------------:|:--------:| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix table indentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, it's ok.
@@ -61,3 +61,46 @@ variable "ssh_user" { | |||
variable "welcome_message" { | |||
default = "" | |||
} | |||
|
|||
variable "comparison_operator" { | |||
description = "The arithmetic operation to use when comparing the specified Statistic and Threshold. Possible values are: GreaterThanOrEqualToThreshold, GreaterThanThreshold, LessThanThreshold, LessThanOrEqualToThreshold." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use these descriptions in README.md.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
main.tf
Outdated
|
||
resource "aws_cloudwatch_metric_alarm" "reboot" { | ||
alarm_name = "${module.label.id}-status-check-failed" | ||
comparison_operator = "${var.comparison_operator}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we hardcode these variables?
@osterman what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably would have just hard coded it, but since @SweetOps has already parameterized it, I'm okay with it so long as we rename reboot
to default
or something more general.
main.tf
Outdated
|
||
resource "null_resource" "check_alarm" { | ||
triggers = { | ||
action = "arn:aws:swf:${var.aws_region}:${var.aws_account_id}:${var.default_alarm_action}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use a data source to retrieve aws_region
and aws_account_id
. No need for additional inputs.
README.md
Outdated
| `applying_period` | `60` | Period in seconds over which the specified statistic is applied | Yes | | ||
| `statistic_level` | `Maximum` | Statistic to apply to the alarm's associated metric | Yes | | ||
| `metric_threshold` | `1` | Value against which the specified statistic is compared | Yes | | ||
| `default_alarm_action` |`action/actions/AWS_EC2.InstanceId.Reboot/1.0`| List of actions to execute when this alarm transitions into an ALARM state | Yes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
main.tf
Outdated
} | ||
|
||
resource "aws_cloudwatch_metric_alarm" "reboot" { | ||
alarm_name = "${module.label.id}-status-check-failed" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this alarm_name should be default
since everything is parameterized and it could mean anything.
README.md
Outdated
## Outputs | ||
|
||
| Name | Decription | | ||
|:-------------------:|:------------------------------------------------------------------:| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not center justify name and description.
main.tf
Outdated
|
||
# Restart dead or hung instance | ||
|
||
data "aws_region" "current" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should be called default
main.tf
Outdated
current = true | ||
} | ||
|
||
data "aws_caller_identity" "current" {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call it default
main.tf
Outdated
|
||
data "aws_caller_identity" "current" {} | ||
|
||
resource "null_resource" "check_alarm" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just call it default
main.tf
Outdated
|
||
resource "null_resource" "check_alarm" { | ||
triggers = { | ||
action = "arn:aws:swf:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:${var.default_alarm_action}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename action
to check_alarm_action
@@ -15,7 +15,7 @@ variable "associate_public_ip_address" { | |||
} | |||
|
|||
variable "ansible_arguments" { | |||
type = "list" | |||
type = "list" | |||
default = [] | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove default values for name
, stage
and namespace
variables.
What
Why