Amazon ECS Operations: Difference between revisions
Latest revision as of 18:00, 30 March 2019
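External
https://aws.amazon.com/getting-started/tutorials/deploy-docker-containers/
Internal
Amazon ECS
Amazon ECS Deployment with CloudFormation
Overview
Amazon ECS and CloudFormation
See: Amazon ECS Deployment with CloudFormation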
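Create a Cluster
Reference: Create a Cluster - https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create_cluster.html
Procedure
This procedure describes how to create a cluster with the AWS console. Clusters should preferably be created with CloudFormation (see Amazon ECS Deployment with CloudFormation - Create a Cluster).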
Amazon ECS -> Clusters -> Create Cluster
Networking only (Fargate)
Cluster Name
Networking
Create VPC: Although a cluster uses a VPC, it does not seem possible to create the VPC in advance and simply refer to it during cluster creation, at least when the cluster is created from the console. If no VPC is created during cluster creation, the cluster probably uses one of the existing VPCs, possibly the default VPC of the account; this has not been verified. For more details see: Relationship between a Cluster and a VPC (Amazon ECS Concepts).
CIDR block
10.7.0.0/16
Subnet 1:
10.7.1.0/24
Subnet 2:
10.7.2.0/24
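The networking values above can also be applied from the command line. The following is a minimal sketch, not the console's exact behavior: the function name `create_cluster_network` is hypothetical, and the sketch creates only the VPC, the two subnets and the cluster (the console additionally creates the gateway and routing resources listed in the next section). Note that `aws ecs create-cluster` itself takes no VPC argument.

```shell
# Sketch: create the VPC and subnets with the CIDR blocks used above,
# then create the cluster. The cluster name is supplied by the caller.
create_cluster_network() {
  local cluster_name="$1"
  local vpc_id

  # Capture the new VPC ID so the subnets can be attached to it.
  vpc_id=$(aws ec2 create-vpc --cidr-block 10.7.0.0/16 \
    --query 'Vpc.VpcId' --output text) || return 1

  aws ec2 create-subnet --vpc-id "$vpc_id" --cidr-block 10.7.1.0/24 || return 1
  aws ec2 create-subnet --vpc-id "$vpc_id" --cidr-block 10.7.2.0/24 || return 1

  # The cluster is only a logical grouping; it is not bound to the VPC here.
  aws ecs create-cluster --cluster-name "$cluster_name"
}
```

Usage, assuming the cluster name used later on this page: `create_cluster_network playground`.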
Result and Next Steps
The procedure will create the cluster and the following associated resources:
A CloudFormation stack. The stack automatically gets a name (EC2ContainerService-<cluster-name>).
A VPC. The VPC spans several availability zones. It is probably a good idea to navigate to the VPC console by following the link, and update the name to something relevant.
Subnets. It is probably a good idea to navigate to the VPC console by following the links, and update the name of the subnets to something relevant.
An Internet gateway. It is probably a good idea to navigate to the VPC console by following the link, and update the name to something relevant.
A route table. It is probably a good idea to navigate to the VPC console by following the link, and update the name to something relevant. The route table will be associated automatically with the subnets created by the process. The routes will include the subnets for the relevant IP address ranges, and the internet gateway for everything else.
An Amazon EC2 route.
A virtual private gateway attachment.
Configure security group to allow access
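Opening a port on the security group can be scripted as well. This is a sketch under assumptions: the function name `allow_tcp_ingress` is hypothetical, and the security group ID and source CIDR must be supplied by the caller because the page does not record them.

```shell
# Sketch: allow inbound TCP traffic on one port of an existing security group.
allow_tcp_ingress() {
  local group_id="$1" port="$2" source_cidr="$3"
  aws ec2 authorize-security-group-ingress \
    --group-id "$group_id" \
    --protocol tcp \
    --port "$port" \
    --cidr "$source_cidr"
}
```

For example, `allow_tcp_ingress "$SECURITY_GROUP_ID" 10001 10.7.0.0/16` would admit traffic from inside the VPC range to the container port used later on this page; `SECURITY_GROUP_ID` is a placeholder variable.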
Create a Task Definition
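Reference: Create a Task Definition - https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-task-definition.html
This is the procedure to create a task definition: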
Amazon ECS -> Task Definitions -> Create a New Task Definition -> FARGATE -> Next Step
Task Definition Name: themyscira
Requires Compatibilities: FARGATE
Task Role: If the task only needs generic permissions, which should be the case, it is a good idea to create a generic Task Role, shared across clusters, and use it here. This is how roles can be created: Create an IAM Task Role (AWS Security Operations).
After the task role is correctly created, it should show up in the "Task Role" drop-down box.
Network Mode: awsvpc
Task execution IAM role: this is the role that authorizes Amazon ECS to pull private images and publish logs for the task. It takes the place of the EC2 instance role when running tasks. See: Create an IAM Task Execution Role (AWS Security Operations).
After the task execution role is correctly created, it should show up in the "Task execution role" drop-down box. If it does not show up, refresh the page.
Task size:
Task memory (GB): 4GB
Task CPU (vCPU): 2 vCPU
Container Definitions: Add Container
Container name: themyscira
Image: 673499572719.dkr.ecr.us-west-2.amazonaws.com/com.uplift/playground/themyscira:latest
If the repository does not exist, create it: Amazon ECR Operations - Create Repository.
No Private repository authentication.
Memory Limits (MiB): Hard Limit 4096
Port Mappings: 10001 (tcp). Port mappings allow containers to access ports on the host container instance to send or receive traffic.
Warning: Host port mappings are not valid when the network mode for a task definition is host or awsvpc. To specify different host and container port mappings, choose the Bridge network mode.
Advanced container configuration
Healthcheck
Environment
CPU Units: 2048
Essential: If the essential parameter of a container is marked as true, the failure of that container will stop the task.
Entry point:
Command:
Working directory:
Environment Variables
This is where we can configure Spring applications' behavior:
Key | Type | Value
SPRING_PROFILES_ACTIVE | Value | playground
SERVER_PORT | Value | 10001
Network Settings
Storage and Logging
Read only root file system
Mount points:
Volumes from:
Log configuration: Unselect "Auto-configure CloudWatch Logs"
Log driver: awslogs
Values:
awslogs-group: /playground
Note that the log group must be created if it does not exist, otherwise the container launch will fail: Create a Log Group in CloudWatch (Amazon CloudWatch Operations).
awslogs-region: us-west-2
awslogs-stream-prefix: themyscira
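Because a missing log group makes the container launch fail, it can help to create the group before registering the task definition. The following is a sketch, with the hypothetical function name `ensure_log_group`; it checks for an exact name match and creates the group only when none is found.

```shell
# Sketch: create a CloudWatch log group only if it does not exist yet.
ensure_log_group() {
  local group="$1" region="$2"
  local existing

  # The prefix filter narrows the listing; the JMESPath filter keeps exact matches only.
  existing=$(aws logs describe-log-groups --region "$region" \
    --log-group-name-prefix "$group" \
    --query "logGroups[?logGroupName=='$group'].logGroupName" \
    --output text) || return 1

  if [ "$existing" = "$group" ]; then
    echo "log group $group already exists"
  else
    aws logs create-log-group --region "$region" --log-group-name "$group"
  fi
}
```

With the values above: `ensure_log_group /playground us-west-2`.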
Resource Limits
Docker Labels
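The console choices in this section can be captured as a task definition JSON file and registered from the command line. This is a sketch assembled from the values listed above, not an export of the actual definition: the function names are hypothetical, and the two role ARNs are parameters because the page does not record them. The task-level `cpu` of 2048 and `memory` of 4096 correspond to 2 vCPU and 4 GB.

```shell
# Sketch: write a Fargate task definition using the values from this section.
# Arguments: output file, task execution role ARN, task role ARN (caller-supplied).
write_task_definition() {
  local out_file="$1" execution_role_arn="$2" task_role_arn="$3"
  cat > "$out_file" <<EOF
{
  "family": "themyscira",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "2048",
  "memory": "4096",
  "executionRoleArn": "${execution_role_arn}",
  "taskRoleArn": "${task_role_arn}",
  "containerDefinitions": [
    {
      "name": "themyscira",
      "image": "673499572719.dkr.ecr.us-west-2.amazonaws.com/com.uplift/playground/themyscira:latest",
      "essential": true,
      "cpu": 2048,
      "memory": 4096,
      "portMappings": [{ "containerPort": 10001, "protocol": "tcp" }],
      "environment": [
        { "name": "SPRING_PROFILES_ACTIVE", "value": "playground" },
        { "name": "SERVER_PORT", "value": "10001" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/playground",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "themyscira"
        }
      }
    }
  ]
}
EOF
}

# Register the definition stored in the given file.
register_task_definition() {
  aws ecs register-task-definition --cli-input-json "file://$1"
}
```

A possible invocation, with placeholder variables for the role ARNs: `write_task_definition themyscira.json "$EXECUTION_ROLE_ARN" "$TASK_ROLE_ARN" && register_task_definition themyscira.json`.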
Create a Service
At least one task definition must exist before the service is created. See: Create a Task Definition.
If you plan to expose this service through a load balancer, the load balancer must be created first. See below for details: Load Balancing.
Clusters -> <Cluster Name> -> Services tab -> Create:
Service Configuration
Launch Type
FARGATE
Task Definition
- Family: themyscira
- Revision: latest
Platform version
LATEST
Cluster
playground
Service name
themyscira
Service type
REPLICA.
More details: Service Type.
Number of Tasks
1
Minimum healthy percent
100
Maximum percent
200
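The two percentages bound the number of running tasks during a rolling update. As a worked example, the sketch below computes those bounds; it assumes the rounding described in the ECS deployment configuration documentation (the lower bound is rounded up, the upper bound is rounded down), and the function name `deployment_bounds` is hypothetical.

```shell
# Sketch: task-count bounds during a rolling update.
# Prints "<minimum running tasks> <maximum running tasks>".
deployment_bounds() {
  local desired="$1" min_pct="$2" max_pct="$3"
  # Integer ceiling of desired * min_pct / 100.
  local min_tasks=$(( (desired * min_pct + 99) / 100 ))
  # Integer floor of desired * max_pct / 100.
  local max_tasks=$(( desired * max_pct / 100 ))
  echo "$min_tasks $max_tasks"
}

deployment_bounds 1 100 200   # prints "1 2"
```

With the settings above (1 task, 100 and 200 percent), the existing task keeps running while one additional task is started, and the old task is stopped afterwards.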
Deployments
Deployment type
The options are Rolling update and "Blue/green deployment (powered by AWS CodeDeploy)".
For simple cases, "Rolling update" is sufficient and a redeployment can be triggered by deleting a task.
If you intend to drive deployments with AWS CodeDeploy, choose "Blue/green deployment (powered by AWS CodeDeploy)". For more details, see ECS Concepts - Blue/Green Deployment. This configuration is required if an (additional) AWS CodeDeploy deployment group is created for this service. Blue/green deployment requires the selection of a Service role for CodeDeploy.
Warning: Changing between "Rolling update" and "Blue/green deployment" requires recreating the service.
Task tagging configuration
Enable ECS managed tags.
Configure network
VPC and security groups
Cluster VPC: vpc-*
Subnets: ...
Security groups: Edit the proposed security group, ensure the desired ports are properly exposed, and change its name. If a suitable security group already exists, it can be used instead.
Auto-assign public IP: ENABLED
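The service configuration and network settings chosen so far map onto a single `aws ecs create-service` call. The following is a sketch: the function name `create_fargate_service` is hypothetical, and the subnet and security group IDs are caller-supplied, comma-separated lists because the page elides them. Load balancer and service discovery settings are left out here; they would be passed through `--load-balancers` and `--service-registries` in the same call.

```shell
# Sketch: create the Fargate service with the values chosen in the console.
# Arguments: comma-separated subnet IDs, comma-separated security group IDs.
create_fargate_service() {
  local subnets="$1" security_groups="$2"
  aws ecs create-service \
    --cluster playground \
    --service-name themyscira \
    --task-definition themyscira \
    --launch-type FARGATE \
    --platform-version LATEST \
    --scheduling-strategy REPLICA \
    --desired-count 1 \
    --deployment-configuration "minimumHealthyPercent=100,maximumPercent=200" \
    --enable-ecs-managed-tags \
    --network-configuration "awsvpcConfiguration={subnets=[$subnets],securityGroups=[$security_groups],assignPublicIp=ENABLED}"
}
```

Passing the task definition family without a revision selects the latest active revision, which matches the "latest" choice above.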
Health check grace period
Load Balancing
Warning: Load balancing settings can only be configured on service creation. If the service is to be exposed as an integration endpoint by the API Gateway, it needs a network load balancer.
If none exists, a network load balancer has to be created in advance as shown here: Create a Network Load Balancer (AWS Elastic Load Balancing Operations).
Load balancer type: Network Load Balancer
Service IAM role: Return here.
Container to load balance
Container name : port: themyscira:10001:10001 -> Add to load balancer.
This does not actually add anything to the load balancer yet, but it enables the target configuration:
themyscira:10001
Production listener port: You can use the existing listener (80:TCP) or create a new one. Using "80:TCP" does not seem to work: the console queries the load balancer configuration for an "ip" target group associated with this listener, and displays an error message if it does not find one. It might work if an appropriately named "ip" target group is created first; try next time.
Production listener port: create new: 10001. Creating "new one" refers to creating a new listener.
Production listener protocol: TCP
Target group name: create new: themyscira
Target group protocol: TCP
Target type: "ip"
Health check protocol: TCP
After the service creation procedure completes, a new listener and target group pair is created with the load balancer (screenshot: NewLBListener.png).
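The listener and target group that the console creates can also be created explicitly with the `elbv2` commands. This is a sketch using the values listed above; the function name `create_nlb_target` is hypothetical, and the load balancer ARN and VPC ID are caller-supplied.

```shell
# Sketch: create an "ip" target group and a TCP listener on an existing NLB.
# Prints the ARN of the new target group.
create_nlb_target() {
  local load_balancer_arn="$1" vpc_id="$2"
  local target_group_arn

  target_group_arn=$(aws elbv2 create-target-group \
    --name themyscira \
    --protocol TCP \
    --port 10001 \
    --vpc-id "$vpc_id" \
    --target-type ip \
    --health-check-protocol TCP \
    --query 'TargetGroups[0].TargetGroupArn' --output text) || return 1

  # Forward everything arriving on the new listener to the target group.
  aws elbv2 create-listener \
    --load-balancer-arn "$load_balancer_arn" \
    --protocol TCP \
    --port 10001 \
    --default-actions "Type=forward,TargetGroupArn=$target_group_arn" > /dev/null || return 1

  echo "$target_group_arn"
}
```

The printed target group ARN would then be given to `aws ecs create-service` as `--load-balancers "targetGroupArn=<arn>,containerName=themyscira,containerPort=10001"`.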
Service Discovery
Note that this is optional and it should only be completed if we want to access the ECS service endpoint with a DNS name. If a load balancer was previously configured this is probably not necessary.
Updating existing services to configure service discovery for the first time or change the current configuration is not supported. Service discovery should be configured when the service is created.
Enable service discovery integration: Check.
Namespace name: Even though I had created a hosted zone in advance in the Route 53 console, I was not able to select it here, so I chose "create new private namespace", with the same name as the existing hosted zone: "playground".
Nothing showed up in the Cluster VPC dropdown; I assume the VPC previously specified is used.
Configure service discovery service: "Create a new service discovery service".
Service discovery name: themyscira
Enable ECS task health propagation: check.
Docker health checks.
Enable public DNS health check.
DNS records for service discovery.
DNS record type: A
TTL: 60 seconds.
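The same service discovery setup can be approximated with the `servicediscovery` commands. This is a sketch: both function names are hypothetical, the VPC ID and namespace ID are caller-supplied, and the custom health check configuration is an assumption standing in for "ECS task health propagation".

```shell
# Sketch: create a private DNS namespace in the given VPC.
# The call is asynchronous and returns an operation ID, not the namespace ID.
create_private_namespace() {
  local name="$1" vpc_id="$2"
  aws servicediscovery create-private-dns-namespace --name "$name" --vpc "$vpc_id"
}

# Sketch: create the service discovery service with an A record and a 60 second TTL.
create_discovery_service() {
  local namespace_id="$1"
  aws servicediscovery create-service \
    --name themyscira \
    --dns-config "NamespaceId=$namespace_id,DnsRecords=[{Type=A,TTL=60}]" \
    --health-check-custom-config FailureThreshold=1
}
```

The ARN of the resulting service discovery service would be passed to `aws ecs create-service` through `--service-registries "registryArn=<arn>"`.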
Set Auto Scaling
Optional.
Do not adjust the service's desired count.
Create Service
Force Deployment with AWS CLI
aws ecs update-service --cluster ${ECS_CLUSTER} --service ${ECS_NAME} --force-new-deployment
Note that if the deployment is managed by a CODE_DEPLOY deployment controller (see blue/green deployments), the attempt to redeploy from command line will fail with:
An error occurred (InvalidParameterException) when calling the UpdateService operation: Cannot force a new deployment on services with a CODE_DEPLOY deployment controller. Please use Code Deploy to trigger a new deployment.
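To avoid running into this error, a script can check the deployment controller before forcing a deployment. The following is a sketch, with the hypothetical function name `force_new_deployment`; it returns status 2 without calling `update-service` when the service is managed by CodeDeploy.

```shell
# Sketch: force a new deployment unless the service uses the CODE_DEPLOY controller.
force_new_deployment() {
  local cluster="$1" service="$2"
  local controller

  controller=$(aws ecs describe-services \
    --cluster "$cluster" --services "$service" \
    --query 'services[0].deploymentController.type' --output text) || return 1

  if [ "$controller" = "CODE_DEPLOY" ]; then
    echo "service $service uses CODE_DEPLOY; trigger the deployment from CodeDeploy instead" >&2
    return 2
  fi

  aws ecs update-service --cluster "$cluster" --service "$service" --force-new-deployment
}
```

Usage with the variables from the command above: `force_new_deployment "${ECS_CLUSTER}" "${ECS_NAME}"`.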
Troubleshooting
Reference: ECS Troubleshooting - https://docs.aws.amazon.com/AmazonECS/latest/developerguide/troubleshooting.html
Troubleshooting Stopped Tasks
If tasks go through the PROVISIONING and then PENDING statuses and then disappear, the cause can be investigated after the fact: go to Cluster -> Services -> <service-name> -> Tasks -> Stopped and click one of the stopped task IDs. The exit status and the stop reason should be available there.
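The same information is available from the command line. The following is a sketch, with the hypothetical function name `show_stopped_tasks`; it lists the stopped tasks of a service and prints the stop reason and the per-container exit codes. Stopped tasks are only retained for a limited time, so the listing may be empty for older failures.

```shell
# Sketch: print stop reasons and container exit codes for a service's stopped tasks.
show_stopped_tasks() {
  local cluster="$1" service="$2"
  local task_arns

  task_arns=$(aws ecs list-tasks --cluster "$cluster" --service-name "$service" \
    --desired-status STOPPED --query 'taskArns' --output text) || return 1

  if [ -z "$task_arns" ] || [ "$task_arns" = "None" ]; then
    echo "no stopped tasks found for $service"
    return 0
  fi

  # Word splitting on $task_arns is intentional: one argument per task ARN.
  aws ecs describe-tasks --cluster "$cluster" --tasks $task_arns \
    --query 'tasks[].{task:taskArn,stoppedReason:stoppedReason,containers:containers[].{name:name,exitCode:exitCode,reason:reason}}' \
    --output json
}
```

Usage with the names from this page: `show_stopped_tasks playground themyscira`.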