Amazon ECS Operations

From NovaOrdis Knowledge Base
=External=
* https://aws.amazon.com/getting-started/tutorials/deploy-docker-containers/


=Internal=


* [[Amazon ECS#Subjects|Amazon ECS]]
* [[Amazon ECS Deployment with CloudFormation|Amazon ECS Deployment with CloudFormation]]


=Overview=
=Amazon ECS and CloudFormation=
{{Internal|Amazon ECS Deployment with CloudFormation|Amazon ECS Deployment with CloudFormation}}


=Create a Cluster=


{{External|[https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create_cluster.html Create a Cluster - Reference]}}
==Procedure==
This procedure describes how to create a cluster from the AWS console. Preferably, the cluster should be [[Amazon_ECS_Deployment_with_CloudFormation#Create_a_Cluster|created with CloudFormation]].


Amazon ECS -> Clusters -> Create Cluster

Networking only (Fargate)

Cluster Name


===Networking===
 
Create VPC: Even though a cluster uses a [[Amazon_VPC_Concepts#Virtual_Private_Cloud_.28VPC.29|VPC]], it does not seem possible to create the VPC in advance and simply refer to it during cluster creation, at least when the cluster is created from the console. If no VPC is created during the cluster creation process, <font color=darkgray>the cluster probably uses one of the existing VPCs. Which one? Maybe the [[Amazon_VPC_Concepts#Default_VPC|default VPC of the account]]?</font> For more details see: {{Internal|Amazon_ECS_Concepts#Relationship_between_a_Cluster_and_a_VPC|Relationship between a Cluster and a VPC}}
 
CIDR block
 
10.7.0.0/16
 
Subnet 1:
 
10.7.1.0/24
 
Subnet 2:
 
10.7.2.0/24
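
The address plan above can be sanity-checked before it is typed into the console. This is a minimal sketch using only the Python standard library; the helper name <code>check_layout</code> is hypothetical and is not part of any AWS tooling.

```python
import ipaddress

# Values taken from the procedure above.
vpc = ipaddress.ip_network("10.7.0.0/16")
subnets = [
    ipaddress.ip_network("10.7.1.0/24"),
    ipaddress.ip_network("10.7.2.0/24"),
]


def check_layout(vpc, subnets):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    # Every subnet must fall inside the VPC CIDR block.
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")
    # No two subnets may share addresses.
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")
    return problems


print(check_layout(vpc, subnets))
```

An empty list means both subnets sit inside the VPC block and do not overlap each other.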
 
===Result and Next Steps===
 
The procedure will create the cluster and the following associated resources:
 
A [[AWS_CloudFormation_Concepts#Stack|CloudFormation stack]]. The stack automatically gets a name (EC2ContainerService-<''cluster-name''>).
 
A [[Amazon VPC Concepts#VPC|VPC]]. The VPC spans several availability zones. It is probably a good idea to follow the link to the VPC console and change the name to something relevant; the same applies to the subnets, the internet gateway and the route table listed below.
 
Subnets.
 
An Internet gateway.
 
A route table. The route table is automatically associated with the subnets created by the process. Its routes cover the relevant subnet IP address ranges and send everything else to the internet gateway.
 
An Amazon EC2 route.
 
A [[Amazon_VPC_Concepts#Virtual_Private_Gateway|virtual private gateway]] attachment.
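
The route table behavior described above (local ranges stay local, everything else goes to the internet gateway) can be illustrated with a longest-prefix lookup. This is only a sketch: the two routes below are assumed for illustration (a local route for the VPC CIDR block and a default route), and the target labels are placeholders rather than real resource IDs.

```python
import ipaddress

# Assumed route table: a "local" route for the VPC range and a default route
# through the internet gateway. Target names are placeholders.
ROUTES = [
    (ipaddress.ip_network("10.7.0.0/16"), "local"),
    (ipaddress.ip_network("0.0.0.0/0"), "internet-gateway"),
]


def next_hop(destination, routes=ROUTES):
    """Return the target of the most specific route containing the address."""
    address = ipaddress.ip_address(destination)
    matches = [(network, target) for network, target in routes if address in network]
    if not matches:
        return None
    # The route with the longest prefix is the most specific one.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(next_hop("10.7.1.25"))
print(next_hop("93.184.216.34"))
```

An address inside one of the subnets resolves to the local route, while any other address falls through to the default route.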
 
<font color=darkgray>Configure security group to allow access</font>
 
=Create a Task Definition=
 
{{External|[https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-task-definition.html Create a Task Definition - Reference]}}
 
This is the procedure to create a [[Amazon ECS Concepts#Task_Definition|task definition]]:


Amazon ECS -> Task Definitions -> Create a New Task Definition -> FARGATE -> Next Step


Task Definition Name: themyscira


Requires Compatibilities: FARGATE
 
[[Amazon_ECS_Concepts#Task_Role|Task Role]]: If the task only needs generic permissions, which should be the case, it is a good idea to create a generic Task Role, shared across clusters, and use it here. This is how roles can be created:
 
{{Internal|AWS_Security_Operations#Create_an_ECS_Task_Role|Create an IAM Task Role}}
 
After the task role is correctly created, it should show up in the "Task Role" drop-down box.
 
Network Mode: awsvpc
 
Task execution IAM role - this is the role that authorizes Amazon ECS to pull private images and publish logs for the task. This takes the place of the EC2 Instance role when running tasks:
 
{{Internal|AWS_Security_Operations#Create_an_ECS_Task_Execution_Role|Create an IAM Task Execution Role}}
 
After the task execution role is correctly created, it should show up in the "Task execution role" drop-down box. If it does not show up, refresh the page.
 
Task size:
 
Task memory (GB): 4GB
 
Task CPU (vCPU): 2 vCPU
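
Fargate only accepts certain CPU and memory pairs, so the task size chosen above can be checked ahead of time. The table in this sketch is an assumption: it reflects the combinations AWS documented around the time this page was written and may have been extended since, so verify it against the current AWS documentation. The function name is hypothetical.

```python
# CPU units -> allowed task memory values in MiB (1 vCPU = 1024 CPU units).
# Assumed from the AWS documentation; not authoritative.
FARGATE_SIZES = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_task_size(cpu_units, memory_mib):
    """Return True if the pair appears in the assumed Fargate size table."""
    return memory_mib in FARGATE_SIZES.get(cpu_units, [])


# The values used above: 2 vCPU (2048 units) and 4 GB (4096 MiB).
print(is_valid_task_size(2048, 4096))
```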
 
Container Definitions: Add Container
 
Container name: themyscira
 
Image: 673499572719.dkr.ecr.us-west-2.amazonaws.com/com.uplift/playground/themyscira:latest
 
If the repository does not exist, create it:
 
{{Internal|Amazon ECR Operations#Create_Repository|Amazon ECR Operations - Create Repository}}
 
No Private repository authentication.
 
Memory Limits (MiB): Hard Limit 4096
 
Port Mappings: 10001 (tcp). Port mappings allow containers to access ports on the host container instance to send or receive traffic.
 
{{Warn|Host port mappings are not valid when the network mode for a task definition is host or awsvpc. To specify different host and container port mappings, choose the Bridge network mode.}}
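
The restriction in the warning above can be expressed as a small check over the <code>portMappings</code> entries of a container definition. This is a sketch, not an AWS-provided validator; <code>port_mapping_errors</code> is a hypothetical helper. It relies on the documented rule that in awsvpc or host mode the host port must be left out or equal the container port.

```python
def port_mapping_errors(network_mode, port_mappings):
    """Return messages for mappings the awsvpc and host modes would reject."""
    errors = []
    if network_mode not in ("awsvpc", "host"):
        # Bridge mode allows host and container ports to differ.
        return errors
    for mapping in port_mappings:
        host_port = mapping.get("hostPort")
        container_port = mapping["containerPort"]
        if host_port is not None and host_port != container_port:
            errors.append(
                f"hostPort {host_port} must be omitted or equal to "
                f"containerPort {container_port} in {network_mode} mode"
            )
    return errors


# The mapping used in this procedure: container port 10001, no host port.
print(port_mapping_errors("awsvpc", [{"containerPort": 10001, "protocol": "tcp"}]))
```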
 
Advanced container configuration
 
Healthcheck
 
===Environment===
 
CPU Units: 2048
 
Essential: If the essential parameter of a container is marked as true, the failure of that container will stop the task.
 
Entry point:
 
Command:
 
Working directory:
 
====Environment Variables====
 
This is where we can configure Spring applications' behavior:
{|
|SPRING_PROFILES_ACTIVE || Value || playground
|-
|SERVER_PORT ||Value ||10001
|}
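
The reason these two variables influence a Spring application is that Spring Boot maps environment variable names onto property names. The sketch below shows a simplified version of that mapping (lowercase the name, turn underscores into dots); it deliberately ignores list indices and other corner cases of Spring's relaxed binding, and the function name is hypothetical.

```python
def spring_property_name(env_var_name):
    """Approximate the property name Spring Boot derives from an env variable."""
    return env_var_name.lower().replace("_", ".")


# The variables from the table above.
for name, value in [("SPRING_PROFILES_ACTIVE", "playground"), ("SERVER_PORT", "10001")]:
    print(f"{name} -> {spring_property_name(name)}={value}")
```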
 
===Network Settings===
 
===Storage and Logging===
 
Read only root file system
 
Mount points:
 
Volumes from:
 
Log configuration: Unselect "Auto-configure CloudWatch Logs"
 
Log driver: awslogs
 
Values:
 
awslogs-group: /playground
 
Note that the log group must be created if it does not exist, otherwise the container launch will fail:
 
{{Internal|Amazon_CloudWatch_Operations#Create_a_Log_Group|Create a Log Group in CloudWatch}}
 
awslogs-region: us-west-2
 
awslogs-stream-prefix: themyscira
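
For reference, the console values chosen so far correspond roughly to the following task definition document. This is a sketch under stated assumptions, not an export from the console: the two role ARNs are supplied by the caller, and the ARNs in the example call are made-up placeholders.

```python
import json


def task_definition(execution_role_arn, task_role_arn):
    """Assemble a task definition document from the values used above."""
    container = {
        "name": "themyscira",
        "image": "673499572719.dkr.ecr.us-west-2.amazonaws.com/com.uplift/playground/themyscira:latest",
        "essential": True,
        "cpu": 2048,     # CPU units
        "memory": 4096,  # hard limit, MiB
        "portMappings": [{"containerPort": 10001, "protocol": "tcp"}],
        "environment": [
            {"name": "SPRING_PROFILES_ACTIVE", "value": "playground"},
            {"name": "SERVER_PORT", "value": "10001"},
        ],
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": "/playground",  # must already exist
                "awslogs-region": "us-west-2",
                "awslogs-stream-prefix": "themyscira",
            },
        },
    }
    return {
        "family": "themyscira",
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "cpu": "2048",     # task-level sizes are strings in the API
        "memory": "4096",
        "executionRoleArn": execution_role_arn,
        "taskRoleArn": task_role_arn,
        "containerDefinitions": [container],
    }


# Placeholder ARNs, for illustration only.
document = task_definition(
    "arn:aws:iam::123456789012:role/example-task-execution-role",
    "arn:aws:iam::123456789012:role/example-task-role",
)
print(json.dumps(document, indent=2))
```

Saved to a file, a document like this can be passed to <code>aws ecs register-task-definition --cli-input-json file://taskdef.json</code> (the file name is arbitrary).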
 
===Resource Limits===
 
===Docker Labels===


=Create a Service=


Before creating the [[Amazon_ECS_Concepts#Service|service]], at least one task definition must already exist. See: {{Internal|#Create_a_Task_Definition|Create a Task Definition}}
 
If you plan to expose this service through a load balancer, the load balancer must be created first. See below for details: {{Internal|#Load_Balancing|Load Balancing}}


Clusters -> <''Cluster Name''> -> Services tab -> Create:


==Service Configuration==
 
====Launch Type====
 
FARGATE
 
====Task Definition====
 
* Family themyscira
* Revision: latest
 
====Platform version====
 
LATEST
 
====Cluster====
 
playground
 
====Service name====
 
themyscira
 
====Service type====
 
REPLICA.
 
More details: [[Amazon_ECS_Concepts#Service_Type|Service Type]].
 
====Number of Tasks====
 
1
 
====Minimum healthy percent====
 
100
 
====Maximum percent====
 
200
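
The two percentages above bound how many tasks may run while a rolling update is in progress. The sketch below works through the arithmetic, assuming the rounding rules described in the AWS documentation (the minimum is rounded up, the maximum is rounded down). The function name is hypothetical.

```python
def rolling_update_bounds(desired_count, minimum_healthy_percent, maximum_percent):
    """Return (min_running, max_running) task counts during a deployment."""
    # Ceiling division: the lower bound is rounded up.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    # Floor division: the upper bound is rounded down.
    max_running = desired_count * maximum_percent // 100
    return min_running, max_running


# The values used above: 1 task, 100% minimum healthy, 200% maximum.
print(rolling_update_bounds(1, 100, 200))
```

With one desired task, the existing task has to stay up (minimum 1) while a second one is started (maximum 2), and the old task is stopped only after the new one is running.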
 
==Deployments==
 
====Deployment type====
 
The options are [[Amazon_ECS_Concepts#Rolling_Update|Rolling update]] and "[[Amazon_ECS_Concepts#Blue.2FGreen_Deployment|Blue/green deployment]] (powered by AWS CodeDeploy)".
 
For simple cases, "Rolling update" is sufficient and a redeployment can be triggered by deleting a task.
 
If you intend to drive deployments with AWS CodeDeploy, choose "Blue/green deployment (powered by AWS CodeDeploy)". For more details, see [[Amazon_ECS_Concepts#Blue.2FGreen_Deployment|ECS Concepts - Blue/Green Deployment]]. This configuration is required if an [[AWS_CodeDeploy_Operations#Prerequisites|(additional) AWS CodeDeploy deployment group is created for this service]]. Blue/green deployment requires the selection of a [[AWS_CodeDeploy_Concepts#Service_Role|Service role for CodeDeploy]].
 
{{Warn|Changing between "Rolling update" and "Blue/green deployment" requires recreation of the service.}}
 
====Task tagging configuration====
 
Enable ECS managed tags.
 
==Configure network==
 
===VPC and security groups===
 
Cluster VPC: vpc-*
 
Subnets: ...
 
Security groups: Edit the security group, ensure the desired ports are properly exposed, and change its name. If a suitable security group already exists, it can be used instead.
 
Auto-assign public IP: ENABLED
 
===Health check grace period===
 
===Load Balancing===
 
{{Warn|Load balancing settings can only be configured on service creation. If the service is to be exposed as an integration endpoint by the API Gateway, it needs a [[AWS_Elastic_Load_Balancing_Concepts#Network_Load_Balancer|network load balancer]].}}
 
If none exists, a network load balancer has to be created in advance as shown here: {{Internal|AWS_Elastic_Load_Balancing_Operations#Create_a_Network_Load_Balancer|Create a Network Load Balancer}}
 
Load balancer type: [[AWS_Elastic_Load_Balancing_Concepts#Network_Load_Balancer|Network Load Balancer]]
 
Service IAM role: <font color=darkgray>Return here.</font>
 
====Container to load balance====
 
Container name : port: themyscira:10001:10001 -> Add to load balancer.
 
This does not actually add anything to the load balancer just yet, but allows target configuration:
 
'''themyscira:10001'''
 
Production listener port: You can use the existing listener (80:TCP) or create a new one. It seems that using "80:TCP" does not work, as the console queries the load balancer configuration in an attempt to find an "ip" target group associated with this listener and it displays an error message if it does not find it. <font color=darkgray>It might work if I create an "ip" target group, appropriately named. Try next time.</font>
 
Production listener port: create new: 10001. Creating "new one" refers to [[AWS_Elastic_Load_Balancing_Operations#Listeners|creating a new listener]].
 
Production listener protocol: TCP
 
Target group name: create new: themyscira
 
Target group protocol: TCP
 
Target type: "ip"
 
Health check protocol: TCP
 
After the service creation procedure completes, a new listener and target group pair is created with the load balancer:
 
:[[Image:NewLBListener.png]]
 
===Service Discovery===
 
Note that this is optional and it should only be completed if we want to access the ECS service endpoint with a DNS name. If a [[#Load_Balancing|load balancer]] was previously configured this is probably not necessary.
 
{{Warn|Updating existing services to configure service discovery for the first time or change the current configuration is not supported. Service discovery should be configured when the service is created.}}
 
{{Internal|Amazon_ECS_Service_Discovery_Concepts|Service Discovery Concepts}}
 
Enable service discovery integration: Check.
 
[[Amazon_ECS_Service_Discovery_Concepts#Namespace|Namespace]]:
 
Namespace name: <font color=darkgray>Even though I had created a hosted zone in advance in the Route53 console, I was not able to select it here, so I chose "create new private namespace", with the same name as the existing hosted zone: "playground".
 
Nothing showed up in the Cluster VPC dropdown; I assume the VPC previously specified is used.
</font>
 
Configure [[Amazon_ECS_Service_Discovery_Concepts#Service_Discovery_Service|service discovery service]]: "Create a new service discovery service".
 
Service discovery name: themyscira
 
Enable ECS task health propagation: check.
 
Docker health checks.
 
Enable public DNS health check.
 
DNS records for service discovery.
 
DNS record type: A
 
TTL: 60 seconds.
 
===Set Auto Scaling===
 
Optional.
 
Do not adjust the service's desired count.
 
Create Service
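
The console choices above map approximately onto the input of the ECS CreateService call. The sketch below assembles such a document. It makes several assumptions: the subnet IDs, security group IDs and target group ARN are supplied by the caller (the values in the example call are placeholders), and, unlike the console wizard, this path expects the target group and listener to exist already. Service discovery is left out.

```python
import json


def service_definition(subnets, security_groups, target_group_arn):
    """Assemble a CreateService input document from the values used above."""
    return {
        "cluster": "playground",
        "serviceName": "themyscira",
        # A family name without a revision selects the latest ACTIVE revision.
        "taskDefinition": "themyscira",
        "desiredCount": 1,
        "launchType": "FARGATE",
        "platformVersion": "LATEST",
        "schedulingStrategy": "REPLICA",
        "deploymentConfiguration": {
            "minimumHealthyPercent": 100,
            "maximumPercent": 200,
        },
        "enableECSManagedTags": True,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                "assignPublicIp": "ENABLED",
            }
        },
        "loadBalancers": [
            {
                "targetGroupArn": target_group_arn,
                "containerName": "themyscira",
                "containerPort": 10001,
            }
        ],
    }


# Placeholder identifiers, for illustration only.
service = service_definition(
    ["subnet-EXAMPLE1", "subnet-EXAMPLE2"],
    ["sg-EXAMPLE"],
    "arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/example/0123456789abcdef",
)
print(json.dumps(service, indent=2))
```

Saved to a file, the document can be passed to <code>aws ecs create-service --cli-input-json file://service.json</code> (the file name is arbitrary).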
 
=Force Deployment with AWS CLI=
 
aws ecs update-service --cluster ${ECS_CLUSTER} --service ${ECS_NAME} --force-new-deployment
 
Note that if the deployment is managed by a CODE_DEPLOY deployment controller (see [[Amazon_ECS_Concepts#Blue.2FGreen_Deployment|blue/green deployments]]), the attempt to redeploy from command line will fail with:
 
An error occurred (InvalidParameterException) when calling the UpdateService operation: Cannot force a new deployment on services with a CODE_DEPLOY deployment controller. Please use Code Deploy to trigger a new deployment.
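
A deployment script can avoid the error above by checking the deployment controller type before building the command. This is a minimal sketch: the function name is hypothetical, and the controller type is assumed to have been looked up beforehand (for example from the output of <code>aws ecs describe-services</code>).

```python
import shlex


def force_deployment_command(cluster, service, deployment_controller="ECS"):
    """Return the argv for a forced redeployment, or raise for CODE_DEPLOY."""
    if deployment_controller == "CODE_DEPLOY":
        raise ValueError(
            "service uses a CODE_DEPLOY deployment controller; "
            "trigger the deployment from CodeDeploy instead"
        )
    return [
        "aws", "ecs", "update-service",
        "--cluster", cluster,
        "--service", service,
        "--force-new-deployment",
    ]


# Prints the command line; pass the list to subprocess.run() to execute it.
print(shlex.join(force_deployment_command("playground", "themyscira")))
```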
 
=Troubleshooting=
 
{{External|[https://docs.aws.amazon.com/AmazonECS/latest/developerguide/troubleshooting.html ECS Troubleshooting]}}
 
==Troubleshooting Stopped Tasks==


If tasks go through the PROVISIONING and PENDING states and then disappear, the cause can be investigated post-mortem: go to Cluster -> Services -> <''service-name''> -> Tasks -> Stopped and click on one of the stopped task IDs. The exit status and the stop reason should be available there.
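
The same information is available outside the console: <code>aws ecs describe-tasks</code> returns JSON that includes a <code>stoppedReason</code> per task and an <code>exitCode</code> and <code>reason</code> per container. The sketch below summarizes such a response. The helper name is hypothetical, and the sample response is invented for illustration rather than captured from a real cluster.

```python
def stopped_task_summary(describe_tasks_response):
    """Return (task ARN, stopped reason, [(container, exit code, reason)])."""
    summary = []
    for task in describe_tasks_response.get("tasks", []):
        containers = [
            (c.get("name"), c.get("exitCode"), c.get("reason"))
            for c in task.get("containers", [])
        ]
        summary.append((task.get("taskArn"), task.get("stoppedReason"), containers))
    return summary


# Invented sample with the shape of a describe-tasks response.
sample = {
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-west-2:123456789012:task/example",
            "lastStatus": "STOPPED",
            "stoppedReason": "Essential container in task exited",
            "containers": [{"name": "themyscira", "exitCode": 1}],
        }
    ]
}
for arn, reason, containers in stopped_task_summary(sample):
    print(arn, reason, containers)
```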

Latest revision as of 18:00, 30 March 2019
