AWS at Scale #Main 6: Building Apps at Scale on AWS Cloud Platform

In my previous post, 'AWS Cloud Platform in Action', I detailed what's included in an enterprise-grade AWS cloud platform in action (and at scale), including an example engagement model for consumability. In this post, we'll discuss leveraging an enterprise-grade cloud platform on AWS to build apps at scale.

Introduction

For context, here’s my previous post on the AWS cloud platform in action (and at scale), including an example engagement model for consumability.

Building on AWS at scale can be mind-boggling at times. There are so many experts out there offering their advice, opinions and best practices on the subject that it's easy to get lost in the noise and forget the core building blocks and decisions that will pay dividends in the future. With that said, here is my advice and opinion on building apps and services at scale, all on AWS as a platform, following the provider-to-consumer model that I have detailed in previous posts:

I’ve spent years doing AWS at scale. So here’s a blueprint for app/service/product hosting at scale - to support development at velocity, all on AWS.

How much of this lands will depend on where you are on your AWS/cloud architecture journey. I’m not expecting you to adopt this concept as-is; it’s a glimpse into how to host and develop products on AWS at an industrial scale.

Product development at scale on AWS (architecture diagram)

So what’s going on here?

The General Platform Tech Stack

  • Platform: AWS Cloud

    • AWS Organizations

    • AWS Control Tower

    • Amazon EventBridge

    • AWS Service Catalog API endpoint

    • Amazon VPC (provider deployed from reference architecture)

  • DevOps foundation building blocks from Gruntwork

    • Enhanced AWS account vending

    • Battle tested consumer IaC modules

    • Consumer and provider IaC repos

  • Enhanced IaC, pipelines and repo functionality: Terragrunt

  • Zero trust perimeter access: Cloudflare

  • Code hosting, actions, security & vulnerability scanning: GitHub

SDLC AWS Account Patterns

For higher-tiered workloads (stuff that's pretty important to the business for various reasons), dev, stage and prod accounts are segregated at the top level (like a ship's bulkheads).

Your workload and AWS account structure should be designed like a bulkhead

If Dev is compromised (more developers have access to Dev), the damage doesn’t bleed into Stage or Production. Cost codes are tagged at the account level, providing clear consumption costs across all environments for chargeback.

Key points:

  • Accounts and subsequent environments are all workload specific, no mixing of unrelated workloads or environments within the same account.

  • Prod is obviously prod: access is significantly restricted, and observability, security, threat modelling and other SRE tooling are deployed to support the criticality of the workload within this account.

  • Stage is a mirror of production, used for UAT/QA and load testing.

  • Dev is a place where new features can be tested against the code base.

  • Although Sandbox isn’t shown in this account pattern, it is a different use case from Dev and the two shouldn’t be confused.

  • Zero trust perimeter access is configured for external facing URIs in Dev & Stage.

Terragrunt makes this easy by providing a single repo with folders representing each account in the SDLC account structure.
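As a rough illustration of the folder-per-account idea, here is a minimal sketch of a root Terragrunt config that pins every deployment to the account that owns the folder. The file names (`root.hcl`, `account.hcl`), the `account_id` and `account_name` locals, the region and the placeholder account ID are all my assumptions for this example, not something prescribed by Gruntwork or by the diagram above.

```hcl
# root.hcl (hypothetical file at the top of the consumer repo)
#
# Assumed layout, one folder per AWS account:
#   dev/account.hcl, stage/account.hcl, prod/account.hcl
# Each account.hcl is assumed to contain:
#   locals {
#     account_name = "dev"
#     account_id   = "111111111111" # placeholder
#   }

locals {
  # Walk up from the unit being deployed to the nearest account.hcl.
  account = read_terragrunt_config(find_in_parent_folders("account.hcl"))
}

# Generate a provider that can only talk to the account that owns the folder.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region              = "eu-west-2" # assumed region
  allowed_account_ids = ["${local.account.locals.account_id}"]

  default_tags {
    tags = {
      Environment = "${local.account.locals.account_name}"
    }
  }
}
EOF
}
```

With `allowed_account_ids` set this way, a unit under `dev/` that is accidentally run with prod credentials should fail at plan time rather than deploy into the wrong account, which is the bulkhead idea expressed in config.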

The Provider IaC Module Directory

Hundreds of battle-tested commodity IaC modules are wrapped (via Gruntwork), documented and made available for consumption. There's no need for engineers to write their own IaC; all that’s needed to consume a module is a wrapper that references the remote module and provides the configuration parameters.

Key points:

  • Consumers browse the modules, select and scaffold the ones they need, then add them to their consumer repo to deploy AWS resources at scale across dev, stage or prod within minutes.

  • Centralised modules are version controlled and tagged.

  • Centralised modules are a trusted source (over external modules) for battle tested IaC.

  • Consumers are notified when consumed modules require updates (and are expected to update their code within a given timeframe).

  • Consumers do have the ability to bring their own IaC if needed.

  • Modules are available as OpenTofu.

  • The module catalogue is provided by Gruntwork.
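To make the "wrapper" idea concrete, here is a minimal sketch of what a consumer-side Terragrunt unit might look like. The repo URL, module path, version tag and input names are hypothetical placeholders; the real catalogue will have its own module names and parameters.

```hcl
# dev/eu-west-2/app-assets/terragrunt.hcl (hypothetical path in the consumer repo)

# Pull in the shared root config (provider, remote state, and so on).
include "root" {
  path = find_in_parent_folders("root.hcl")
}

# Reference the centrally published module at a pinned, tagged version.
# The org, repo, module path and tag below are placeholders.
terraform {
  source = "git::git@github.com:example-org/iac-modules.git//modules/s3-bucket?ref=v1.4.2"
}

# The only thing the consumer writes: configuration parameters.
# These input names are assumptions; use whatever the module documents.
inputs = {
  bucket_name       = "example-app-assets-dev"
  enable_versioning = true
}
```

Pinning `ref` to a tag is what makes the "consumers are notified when modules require updates" workflow practical: an update is a one-line change to the tag, reviewed and promoted like any other code.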

The Consumer Repo

A single ‘consumer’ repo spans dev, stage and prod accounts. Each AWS account is represented and segregated as a folder within the repo. This is achieved through the use of Terragrunt.

IaC Consumer Repo - AWS at Scale

OIDC connected GitHub Actions combined with a pre-defined CI/CD pipeline & Terragrunt integration provides promotional workflows to deploy AWS resources into the relevant account as code moves from dev to stage and finally, into prod folders, all within a single repo.

Key points:

  • Consumers create clean Terraform HCL wrappers that reference remote modules for immediate deployment of resources.

  • Access outside of the pipeline should be restricted to promote the standard of ‘everything as code’ over console access.
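For readers who haven't wired GitHub Actions to AWS over OIDC before, the trust relationship can be sketched roughly as below. This assumes the GitHub OIDC identity provider already exists in the account (for example, created at account vend); the role name, org and repo names are placeholders of mine, and a real pipeline would also need permissions policies attached to the role.

```hcl
# Look up the GitHub OIDC identity provider, assumed to exist in this account already.
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

# Trust policy: only workflows from one repo and branch may assume the role.
data "aws_iam_policy_document" "github_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Placeholder org/repo; tighten or loosen the pattern to match your promotion flow.
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/consumer-repo:ref:refs/heads/main"]
    }
  }
}

# Hypothetical deploy role for the consumer pipeline in this account.
resource "aws_iam_role" "consumer_pipeline" {
  name               = "consumer-pipeline-deploy"
  assume_role_policy = data.aws_iam_policy_document.github_trust.json
}
```

Because the pipeline assumes a short-lived role per account, there are no long-lived access keys stored in GitHub, and each account folder can map to its own role with its own blast radius.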

The Provider Repo

Centrally governed resources such as reference architecture based VPCs, AWS Config, account security baselines and mandatory tagging are deployed through an OIDC connected GitHub Actions based ‘provider’ pipeline.

IaC Provider Repo - AWS at Scale

  • The provider pipeline and repo are attached at account vend.

  • Centrally governed resources are deployed through automation once the accounts successfully vend.

  • Once provisioned, the VPC configuration can’t be changed (SCP governed). It also caters for all workload scenarios & requirements.

  • Mandatory tagging is deployed with values during the vending process and can only be updated centrally through automated approval workflows.

  • Consumers (workload builders) do not have access to this repo, they deploy AWS workload specific resources through their own consumer repos as outlined above.
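The "SCP governed" VPC point could look something like the sketch below: a service control policy that denies VPC changes to everyone except the provider pipeline's role. The role name, the OU variable and the exact list of denied actions are assumptions for illustration; a real policy would be tuned to the reference architecture.

```hcl
# ID of the organizational unit holding the workload accounts (hypothetical variable).
variable "workload_ou_id" {
  type = string
}

# Deny network changes unless the caller is the provider pipeline role.
data "aws_iam_policy_document" "deny_vpc_changes" {
  statement {
    sid    = "DenyVpcChangesOutsideProviderPipeline"
    effect = "Deny"

    # Illustrative subset of actions, not an exhaustive list.
    actions = [
      "ec2:CreateVpc",
      "ec2:DeleteVpc",
      "ec2:ModifyVpcAttribute",
      "ec2:CreateSubnet",
      "ec2:DeleteSubnet",
      "ec2:CreateRoute",
      "ec2:DeleteRoute",
    ]

    resources = ["*"]

    # Placeholder role name for the provider pipeline.
    condition {
      test     = "ArnNotLike"
      variable = "aws:PrincipalArn"
      values   = ["arn:aws:iam::*:role/provider-pipeline-deploy"]
    }
  }
}

resource "aws_organizations_policy" "deny_vpc_changes" {
  name    = "deny-vpc-changes"
  type    = "SERVICE_CONTROL_POLICY"
  content = data.aws_iam_policy_document.deny_vpc_changes.json
}

resource "aws_organizations_policy_attachment" "workload_ou" {
  policy_id = aws_organizations_policy.deny_vpc_changes.id
  target_id = var.workload_ou_id
}
```

The design choice here is that the guardrail lives at the organisation level, so even an administrator inside a workload account can't alter the network baseline; changes have to go through the provider repo.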

Blue Green Deployments

Container building, versioning and release to Dev, Stage and Production is done outside of the SDLC AWS account pattern, using a standalone account.

  • Further segregation and separation of concerns.

  • Images for Dev & Stage are not stored in Prod (or any other SDLC account).

  • Image building and orchestration is automated.

  • Images are scanned for vulnerabilities.

  • Allows for different permissions than Dev and Stage accounts.
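Here is a minimal sketch of how the standalone image account might expose a scanned repository to the SDLC accounts. The repository name and the `sdlc_account_ids` variable are placeholders I've made up; in practice you would likely scope the principals more tightly than the account root.

```hcl
# Account IDs of the dev, stage and prod accounts (hypothetical variable).
variable "sdlc_account_ids" {
  type = list(string)
}

# Repository lives in the standalone build account, not in any SDLC account.
resource "aws_ecr_repository" "app" {
  name                 = "example-app" # placeholder name
  image_tag_mutability = "IMMUTABLE"

  # Scan every image for vulnerabilities as it is pushed.
  image_scanning_configuration {
    scan_on_push = true
  }
}

# Pull-only access for the SDLC accounts.
data "aws_iam_policy_document" "pull" {
  statement {
    sid = "AllowSdlcAccountsToPull"

    actions = [
      "ecr:BatchCheckLayerAvailability",
      "ecr:BatchGetImage",
      "ecr:GetDownloadUrlForLayer",
    ]

    principals {
      type        = "AWS"
      identifiers = [for id in var.sdlc_account_ids : "arn:aws:iam::${id}:root"]
    }
  }
}

resource "aws_ecr_repository_policy" "pull" {
  repository = aws_ecr_repository.app.name
  policy     = data.aws_iam_policy_document.pull.json
}
```

Immutable tags mean the image that passed UAT in Stage is byte-for-byte the image that reaches Prod, which keeps promotion honest.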

Dev, Stage and Prod Users

Access to URIs in Dev and Stage is controlled by zero trust perimeter boundaries, providing a simple authentication strategy for UAT/QA and load testing.

I hope you’ve found this useful. If so, please share the link to the web version of our newsletter on your socials for anyone else to discover.

You can also find me on LinkedIn and on X.

All the best, Lee
