ECS Platform

Prerequisites

  1. AWS account

  2. Install the AWS CLI and configure CLI credentials

  3. Install the Attini CLI

  4. Set up the Attini Framework with the following command:

attini setup --accept-license-agreement --create-init-deploy-default-role --give-admin-access --contact-email [my email]

Note

This configuration gives Attini a lot of access, so only do this in a sandbox account.

Note

This solution generates some AWS cost, and it takes about 15 minutes to clean up.


Description

This is an example of a medium-sized container platform that should scale comfortably to 40-60 microservices and approximately 20 to 30 IT personnel*. It is a “basic” cloud platform that allows us to create different roles (SysAdmin, DBA, developer, etc.) with limited access.

The platform contains one VPC, one RDS database (shared between the services), one EFS (shared between the services), and one NLB for the public-facing traffic on port 80 (HTTP); the services run on ECS Fargate.

The platform uses these technologies:

  • AWS ECS Fargate for container orchestration

  • AWS AppMesh for internal communication

  • AWS AccessPoint for EFS access

  • AWS IAM authentication for database access

  • AWS CloudWatch for logging and monitoring

Note

This is not a production-ready platform and it has not been tested properly; it only serves as an example of how to use Attini.

* These estimates are just a “best guess” based on personal experience, and they can vary greatly depending on your organization and applications.


Application architecture

EcsContainerPlatform-AwsInfra

CloudFormation architecture

EcsContainerPlatform-CfnArchitecture

Deployment plan

EcsContainerPlatform-DeploymentPlan

Architectural considerations

Security setup pattern

When you build a cloud platform using CloudFormation, you often want different people to update different CloudFormation stacks. For example, you want the DBA to update the database stacks, the developers to update the service stacks, and so on. This is easily achieved with tag-based access to the “cloudformation:UpdateStack” API.
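
For example, a policy like the following (a hypothetical sketch; the role name and the “team” tag key/value are assumptions, not something defined in this example) could let the DBA update only stacks tagged for the database team:

# Hypothetical: allow "cloudformation:UpdateStack" only on stacks tagged team=dba.
# Adapt the role name and the tag key/value to your own tagging scheme.
aws iam put-role-policy \
    --role-name dba-role \
    --policy-name update-dba-stacks \
    --policy-document '{
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Action": "cloudformation:UpdateStack",
        "Resource": "*",
        "Condition": {
          "StringEquals": {"aws:ResourceTag/team": "dba"}
        }
      }]
    }'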

The problem is that if you create security resources (AWS IAM roles, AWS KMS keys, secrets in AWS Secrets Manager, etc.) in these stacks, anyone with “cloudformation:UpdateStack” permission can escalate their own privileges.

If, for example, you give a database stack permission to create and update IAM roles, the DBA can use their “cloudformation:UpdateStack” permission to create an administrator IAM role that can be assumed by anyone.

For this reason, it is good to break out security resources into separate stacks.

Another advantage of this design is that it often offers a workaround for circular dependencies.


Security groups

Creating security groups using CloudFormation is a very painful exercise.

Security groups have a habit of becoming very complex and full of circular dependencies, and in a growing platform it becomes very hard to figure out which CloudFormation stack a security group or its openings are defined in.

We have two recommended approaches to simplify this:

  1. Use as few security groups as possible and replace them with other security services (for example, AWS AppMesh can control access between services, and AWS SSM Session Manager can replace SysAdmins’ SSH access).

  2. Create all your security groups and their relationships in one CloudFormation stack at the beginning of the deployment plan (but after the VPC). The security group template also has a tendency to become very big and repetitive, so it is an excellent use case for the AWS CDK.


Service configuration

The platform provides a standard template for services. The problem with this is that we now have two teams maintaining the same stack: the central platform team creates the template, and the development teams provide the container and some operational configuration.

The Attini Framework solves this by using the fallback configuration. This allows the central platform team to tell the Attini Framework to always respect the current configuration for specific parameters. If, for example, the container tag for a service is configured with the fallback configuration, the developers can update that parameter in the AWS Console or via a script running on a build server, and the central platform team will never override it when they run the deployment plan. This way, the platform team can provide the other roles (DBAs or developers) with a collection of parameters that they can manage themselves.
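
For example, a build-server script could bump only the container tag on a service stack with a plain CloudFormation call. This is a sketch; the stack and parameter names below are hypothetical, so use whatever your service template actually defines:

# Hypothetical stack and parameter names. Update only the image tag;
# every other parameter must be passed with UsePreviousValue=true.
aws cloudformation update-stack \
    --stack-name stage-billing-service \
    --use-previous-template \
    --parameters ParameterKey=ImageTag,ParameterValue=1.4.2 \
                 ParameterKey=Cpu,UsePreviousValue=true \
                 ParameterKey=Memory,UsePreviousValue=true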

The next problem is keeping the service stacks up to date. For this we use the AttiniMap function, which is based on the AWS Step Functions Map state. AttiniMap allows us to manage multiple stacks based on a list of configuration files. In this example, the configuration files are found under “/services/config/${environment}” and we use the “GetConfigLambda” to automatically list the files. However, these files can be fetched from anywhere, so they don’t need to originate from the Attini distribution.

To create a new service, simply create a new configuration file under “/services/config/${environment}” and it will automatically be picked up.
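
A minimal sketch of what that could look like (the service names and file extension are hypothetical; copy whichever service config already exists in your checkout):

# Copy an existing service configuration as a starting point
# (hypothetical file names and extension).
cp services/config/stage/billing-service.yaml \
   services/config/stage/invoice-service.yaml

# Edit the new file, then redeploy the distribution so the
# AttiniMap step picks up the new configuration file.
attini deploy run example-ecs-platform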

Note

The AttiniMap function is still under development and is missing documentation. We will try to fix this as soon as possible.

Because most of the service configuration is done automatically via the Attini deployment plan, and because many parameters need the same value for all services, we use a default config file (using the extends property) with all the boilerplate configuration.


Platform advantages

A platform like this is good because it allows a central cloud team (“Cloud Center of Excellence”, “DevOps team” or whatever name your organization gives it) to set up a cloud environment that is quite similar to an on-site data center. This makes it a good design for a “first migration step” to the cloud.

It also limits the AWS knowledge required of the developers, so that they can continue working with their applications as they did on-site.

It can also support a minimum-access model, which is important when multiple teams with different roles and knowledge are working in the same AWS account and the same network.

Platform limitations

A platform like this is designed for a central cloud team to “serve” the rest of the organization with a cloud platform. This often results in the central cloud team becoming an organizational bottleneck.

It also limits the developers’ ability to freely work with all available AWS resources, which is a very big limitation if you have a cloud-mature organization.

Hard to configure certain parts of the platform

This setup makes it quite hard to configure the services (especially the AWS AppMesh configuration) in a flexible manner. Therefore, this is an excellent use case for the AWS CDK. We might add that to this example in the future.

Noisy neighbor

This type of architecture often results in “noisy neighbor” issues. Even if the services have their own directory in EFS and their own database in the Postgres cluster, they still share resources at the hardware level. This means that one service can affect the performance/availability of other services if it uses too much of the database’s RAM or CPU, or hits any EFS limits.

Another problem is that you cannot roll back the database for one service; you have to roll back the whole cluster if any data is corrupted.

To avoid this tight coupling, you can create a separate database and EFS for every service, but this will often result in higher cost and more management overhead.


Deployment guide

  1. Clone the example repository:

    git clone git@github.com:attini-cloud-solutions/example-ecs-platform.git
    
  2. Deploy the distribution using the Attini CLI:

    attini environment create stage
    attini deploy run example-ecs-platform
    

    Note

    Here we name the environment “stage”; you can change this to any name you want. Keep in mind that it will affect the names of the CloudFormation stacks.


Test the platform

  1. Go to the EC2 Load Balancer console and get the public DNS name; it should show the default Ubuntu site.
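
    If you prefer the CLI, you can fetch the DNS name like this (a sketch, assuming the NLB created by this example is the only load balancer in the account):

    # List the DNS names of all load balancers in the account
    aws elbv2 describe-load-balancers \
        --query "LoadBalancers[].DNSName" \
        --output text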

  2. Open a shell in one of the containers using ECS Exec:

    aws ecs execute-command --cluster cluster-name \
        --task task-id \
        --container container-name \
        --interactive \
        --command "/bin/sh"
    

    Get the values you need by looking at one of the ECS tasks in the AWS Console, or list them with the CLI as shown below.
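
    For example (the cluster name placeholder is yours to fill in):

    # List the ECS clusters, then the tasks in the cluster you want to exec into
    aws ecs list-clusters
    aws ecs list-tasks --cluster <cluster-name>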

    Note

    You need to have the Session Manager plugin for the AWS CLI installed on your computer.

  3. From the service container, you can access the database with the command psql -h <DNS> -U postgres. You need to get the DNS name from the RDS console and the password from AWS Secrets Manager, as sketched below.
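
    You can also look both up from your workstation with the CLI (a sketch; the secret name placeholder depends on what the stacks actually created):

    # List the database endpoints in the account
    aws rds describe-db-instances \
        --query "DBInstances[].Endpoint.Address" --output text

    # List the secrets, then fetch the password from the right one
    aws secretsmanager list-secrets --query "SecretList[].Name" --output text
    aws secretsmanager get-secret-value --secret-id <secret-name> \
        --query SecretString --output text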

  4. You can also use curl to make HTTP requests to the other services; find the DNS entries in Route 53. An example command is curl billing-service.sd.stage.attini.internal, but this can change depending on your config.


Clean up

Delete all the CloudFormation stacks. Some of them have dependencies, so you will, for example, have to wait for the service stacks to be deleted before you can delete their security-setup stacks. The NLB can sometimes fail to delete the first time, but it will work on a retry.
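
A sketch of doing this from the CLI (the stack name is a placeholder; the real names depend on your environment name):

# List the deployed stacks to see their names
aws cloudformation list-stacks \
    --stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE

# Delete a stack and wait until it is gone before deleting stacks it depends on
aws cloudformation delete-stack --stack-name <stack-name>
aws cloudformation wait stack-delete-complete --stack-name <stack-name>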

If you want to remove the whole Attini Framework, see Deleting/Clean up the Attini Framework