AWS at Scale Side Quest. Platform Concepts

To understand, design and operate AWS at Scale, you’ll need a good grasp of AWS platform concepts.

Reading time: 5 minutes 🫡

Introduction

If you’re coming from the startup world or an SME, AWS platform concepts can be difficult to grasp.

I also want to give you an understanding of the build sequence when designing and deploying AWS platform configuration at scale.

So let’s get started with your very first AWS account!

AWS Management Account

Let’s start with the ‘outer’ container, the outer shell so to speak…

If you’re a beginner to AWS and you’ve created your first account and started deploying resources into it, great, well done - you’re on your way 🎉

If you are building a product or platform that is growing and expanding and you are still deploying into your first and only AWS account, stop ✋.

Think about how you might want to manage your AWS platform at scale. Why? Because enabling AWS Organisations in an account that is already full of AWS resources is a no-go. Don’t do it: that account becomes your ‘management account’.

Q. What should I do instead?

A. Create another AWS account and treat it as your management account. A management account should be so clean that the mere thought of deploying even the tiniest workload into it makes you shiver with disgust.

Avoid deploying any centralised services into your management account. Even AWS is enabling its centralised services to be administered outside the management account (GuardDuty’s delegated administrator is one example).

Key takeaway… who do you want logging into your management account?

It is your crown jewel! Delegate admin features outside of it as much as possible. This is why in large enterprises you’ll see dedicated AWS accounts for security, core networking, centralised logging, auditing, and break glass (more on that later).

Now that’s clear: once you have a clean management account, you can enable AWS Organisations and invite your workload account (where you’ve been deploying resources) when the time is right.

AWS Organisations

Once you have a clean management account, you can enable AWS Organisations (within it) and eventually plan and build your organisational units (more on that later). Organisational units group your AWS accounts logically so that they better reflect the operational setup of your organisation.

Think of AWS Organisations as the first step to providing some compliance and structure to your AWS empire at scale. Here you can apply policies at the root level of the organisation as well as at organisational unit level.
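To make the root-versus-OU idea a bit more concrete, here is a deliberately simplified Python sketch of how policies stack up as you walk from an account back to the root. The `Node` class, the action names and the account names are all made up for illustration. Real service control policies use wildcards and conditions, and they only ever limit permissions rather than grant them, so treat this as a mental model and not how AWS evaluates policies internally.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A root, OU, or account in a hypothetical organisation tree."""
    name: str
    allowed: set                                 # actions allowed by policies attached here
    denied: set = field(default_factory=set)     # actions explicitly denied here
    parent: Optional["Node"] = None


def effective_actions(node: Node) -> set:
    """Walk from the node up to the root.

    An action survives only if every level allows it and no level denies it.
    """
    allowed = set(node.allowed)
    denied = set(node.denied)
    current = node.parent
    while current is not None:
        allowed &= current.allowed
        denied |= current.denied
        current = current.parent
    return allowed - denied


# Illustrative tree: Root -> Prod OU -> one workload account.
root = Node("Root", allowed={"s3", "ec2", "rds", "iam"})
prod_ou = Node("Prod", allowed={"s3", "ec2", "rds", "iam"}, denied={"iam"}, parent=root)
account = Node("payments-prod", allowed={"s3", "ec2"}, parent=prod_ou)

print(sorted(effective_actions(account)))
```

The point of the sketch: a restriction placed at the root or on an OU flows down to everything beneath it, which is why the OU design matters so much later on.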

Don’t go too far down this rabbit hole though without considering AWS Control Tower (which we’ll come to next).

Two important things to cover here if you are working at scale:

  • Nobody wants to convert an existing AWS account into a management account, but we’ve covered that already.

  • If you build out a significant deployment of AWS Organisations then you may struggle to integrate AWS Control Tower at a later date due to overlap of configuration, rework and general risk management.

If your AWS Organisations setup is somewhat complex, AWS ProServe may recommend a migration to a new AWS Landing Zone. This is primarily down to risk mitigation, as it is a bit of a one-way door.

AWS Control Tower

You’ve got AWS Organisations in place (set up from within your management account). Now it’s time to consider using AWS Control Tower (sooner rather than later for reasons discussed above).

AWS Control Tower is another service that you enable within your clean management account (you can’t delegate it out to another account, so be mindful of who needs to log in and what time-bound access, permissions and policies they need).

You can think of AWS Control Tower as an orchestration and management tool that provides more opinionated, granular governance and compliance over and above what AWS Organisations offers (under the hood, it leans heavily on AWS Config).

AWS Organisations and AWS Control Tower work together to provide a well structured and governed platform that helps you meet specific governance and compliance regulations. Control Tower has some good continuous compliance monitoring capabilities, pre-configured controls and automated remediation.

AWS Control Tower also includes Integrated Identity and Access Management (IAM/SSO) for your AWS Organisation and member accounts. There is also AWS Account Factory for vending new accounts (although I am dead against admins logging into the management account to use it) and AWS SSO/Identity Centre configuration (which can now be delegated to a different account).

Worth noting that AWS Control Tower does not mandate that all accounts be governed by it. AWS member accounts (more on them later) need to be enrolled in Control Tower or be part of an OU that is registered with it. Your management account, for example, isn’t governed by Control Tower. I can’t think of a use case where you wouldn’t bring all your member accounts in but hey ho, the flexibility is there.

You’ll hear the term ‘landing zone’ a lot when working with AWS Control Tower. It’s best to think of this in two ways:

  • A foundation landing zone:

    • AWS Organisations & AWS Control Tower provide a foundation landing zone, which some people also call a control plane. This is your core AWS Organisation, Control Tower, policies, roles, SSO, vending, and delegated core accounts (more on that later).

    • In some large organisations there is more than one of these:

      • A dev landing zone is a mirror of a production landing zone and is used to test high level global configurations that cannot be tested in isolation. This landing zone doesn’t host any live AWS member accounts for workloads. It is hosted in its own management account.

      • A production landing zone hosts all the member accounts (again, in some very large AWS deployments you’ll see more than one of these, and they may also be known as consumer landing zones). Each production landing zone is also hosted in its own management account.

  • Workload landing zones:

    • If you get account vending working correctly with good account baselines, controls, automated governance and compliance at scale, time-bound SSO access, VPC vending and environment segregation (at account level), then you are basically building workload landing zones every time you vend.

    • Workload landing zones can be considered as an opinionated template for your workloads where everything plumbs together to ensure that the development team are building within the right controls, segregation and guard rails.

AWS Organisational Units

I’ve dropped these in after AWS Control Tower rather than with AWS Organisations. In a perfect world you’ll want to design your AWS Organisational Units (your operational structure) along with AWS Organisations and AWS Control Tower, not before and not as an afterthought.

There’s a lot to consider here and it’s not as easy as you may think. For example:

  • Where are you going to put mergers and acquisitions?

  • Where are migrated accounts going (if you have any from a previous organisation)?

  • What does your standard operating environment OU structure look like for net new account vends?

    • How are you splitting up Prod and Non Prod accounts?

    • Where are you putting shared accounts for workload utilities or lower tier workloads?

  • Where are sandbox accounts going?

  • What about stale accounts?

  • What about accounts that are being divested?

  • What about accounts that are being audited for suspicious activity?

  • Where are the core accounts going?

  • Where are Control Tower vended accounts going?

  • Where are shared services accounts going?

Hence the need to plan them with AWS Organisations, AWS Control Tower and the future of your business in mind.
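One way to pressure-test an OU design before you build anything is to write it down as data and read the paths back. The Python sketch below does exactly that. Every OU name in it is a hypothetical answer to the questions above, not an AWS default and not a recommendation; swap in whatever structure fits your organisation.

```python
# Hypothetical OU layout; every name here is illustrative, not an AWS default.
OU_TREE = {
    "Core": {},
    "Shared Services": {},
    "Workloads": {"Prod": {}, "Non Prod": {}},
    "Sandbox": {},
    "Transitional": {"Acquisitions": {}, "Migrated": {}},
    "Exits": {"Divesting": {}, "Stale": {}},
    "Quarantine": {},
}


def ou_paths(tree, prefix="Root"):
    """Yield the full path of every OU, parents before children."""
    for name, children in tree.items():
        path = f"{prefix}/{name}"
        yield path
        yield from ou_paths(children, path)


for path in ou_paths(OU_TREE):
    print(path)
```

If a question from the list above doesn’t map to an obvious path in the output, that is usually a sign the design needs another pass.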

AWS Member Accounts

So far, in sequential order we have the following covered:

  1. Your first AWS account (management account).

  2. AWS Organisations.

  3. AWS Control Tower.

  4. AWS Organisational Units (which are logical account groupings within AWS Organisations).

Now we have a great baseline for any AWS member account. What’s an AWS member account? AWS member accounts are any accounts that are not your management account.

There are many different types of member accounts that will need to be deployed within your AWS Organisation, and they will ultimately fall under the management of AWS Control Tower. Here are a few:

  • Foundational Landing Zone core accounts

    • Core Network

    • Security

    • Auditing

    • Logging

    • Identity

    • Shared Services

  • Workload accounts (depending on tier)

    • Prod

    • Non Prod

      • Dev

      • Stage

    • Standalone

  • Developer accounts

    • Sandbox

  • Infrastructure accounts

    • Native Cloud intelligence dashboards

    • SaaS Integrations

  • Migrated accounts

    • Coming in from acquisitions

    • Coming in from legacy AWS Organisations

  • Departing accounts

    • Divestments

    • Time bound sandbox

    • Under audit

This should give you a good understanding of all the different account patterns you may need to cater for. The good thing is that if you’ve planned for this during your AWS Organisations and AWS Control Tower designs, you’ll be in a good place to cater for them. You also need to be very opinionated and decisive about what comes in through that front door!
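Being opinionated about the front door can be as simple as refusing any account request whose type you haven’t planned for. Here is a minimal Python sketch of that idea. The `VendRequest` shape, the account type keys and the OU paths are all assumptions for illustration; a real vending pipeline would sit behind whatever tooling you use for Account Factory.

```python
from dataclasses import dataclass

# Hypothetical mapping of approved account types to target OU paths.
ACCOUNT_TYPE_TO_OU = {
    "prod": "Workloads/Prod",
    "dev": "Workloads/Non Prod",
    "stage": "Workloads/Non Prod",
    "sandbox": "Sandbox",
    "migrated": "Transitional/Migrated",
}


@dataclass(frozen=True)
class VendRequest:
    """An illustrative account vending request."""
    name: str
    account_type: str
    owner_email: str


def target_ou(request: VendRequest) -> str:
    """Return the OU for a request, or raise if the type is not on the approved list."""
    try:
        return ACCOUNT_TYPE_TO_OU[request.account_type]
    except KeyError:
        raise ValueError(f"unsupported account type: {request.account_type!r}") from None


print(target_ou(VendRequest("payments-dev", "dev", "team@example.com")))
```

Anything that raises here becomes a conversation with the platform team rather than an account that quietly lands in the wrong place.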

Quick reminder that I’ll be covering all these accounts and other general stuff throughout my AWS at Scale series (don’t forget to subscribe!).

AWS Regions

Now we are getting into the overall workload architecture within an AWS member account.

It’s worth noting that AWS Control Tower allows you to design and configure for supported regions in your foundational landing zone which then rolls into the support for your workload landing zones.

It’s important to get a grasp of the regions you’ll be supporting during your initial AWS Organisations and AWS Control Tower design. You don’t need to support all of them, especially if you are deploying east-west traffic inspection and other expensive services to govern network and cross-VPC communications. The same applies if you have governance and compliance regulations that prohibit provisioning certain services and resources in a particular region; violating those will get your ass fined.

Typically you can consolidate regions into something more manageable and then raise exceptions when needed through good change and risk management.
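For a feel of what consolidating regions looks like as policy, the Python sketch below builds a region-restricting service control policy as a plain dictionary and prints it as JSON. It relies on the `aws:RequestedRegion` condition key. The two allowed regions and the short list of exempted global services are assumptions for illustration, and the exemption list is certainly not exhaustive, so treat this as a starting shape rather than something to attach as-is. It is a hand-rolled example, not what AWS Control Tower generates for you.

```python
import json

# Assumption: the regions your organisation has chosen to support.
ALLOWED_REGIONS = ["eu-west-1", "eu-west-2"]

# Assumption: a short, non-exhaustive set of global services to exempt.
GLOBAL_SERVICE_ACTIONS = ["iam:*", "organizations:*", "route53:*", "sts:*", "support:*"]


def region_deny_policy(allowed_regions, exempt_actions):
    """Build a policy that denies requests made outside the allowed regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                # Everything except the exempted global services is in scope.
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


policy = region_deny_policy(ALLOWED_REGIONS, GLOBAL_SERVICE_ACTIONS)
print(json.dumps(policy, indent=2))
```

Exceptions raised through change and risk management then become a reviewed edit to `ALLOWED_REGIONS` rather than an ad hoc console change.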

AWS VPCs

Designing and deploying VPCs at scale is really important. You’ll need a consistent reference architecture with strict limits on what (if anything) your workload teams can change. You’ll also need the ability to provision accounts with or without VPCs depending on the requirement.

VPC design is a significant component of AWS at Scale. In your reference design you’ll want the following features:

  • Consistent reference architecture based vending

  • Centralised logging

  • A good solution for IP availability for serverless and microservices (they are worse than Teams eating up RAM)

  • Eliminate NACLs and use blackhole routing (NACLs are fecking awful at scale)

  • Strong opinionated SCPs and Controls

  • Prohibit VPC peering

  • Well thought out subnet layers

  • IPAM management of internal routable IPs (if needed)

  • Integration with Transit Gateway and east west inspection (if you are deploying that)

  • A good solution for how you connect services together without VPC peering (see the below post)

AWS Availability Zones

This depends heavily on your tiering model and is tightly coupled with your VPC vending. I wouldn’t deploy a reference architecture to anything less than 3 AWS Availability Zones.

I’ve been caught out by this in the past: sometimes AWS reduces the AZs available to new AWS accounts (although you get plenty of notice).

Understanding availability zones is an essential part of the learning path towards AWS at Scale, especially if you are tiering as part of your overall architecture engagement. Understanding how AWS provides uptime through compute and storage is a solid place to start.

AWS VPC Subnets

And that brings us to our last platform concept, subnets (which we have pretty much covered in the VPC section above).

Obviously your subnets will cover all your availability zones and you need to design for this accordingly.
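As a rough illustration of designing subnets across availability zones, the Python sketch below carves a VPC CIDR into equal blocks, one per subnet layer per AZ, using the standard library `ipaddress` module. The CIDR, the three layer names and the /20 subnet size are assumptions picked for the example; your own reference architecture will have its own layers and sizing.

```python
import ipaddress

# Assumptions for illustration only.
VPC_CIDR = "10.0.0.0/16"
AZS = ["a", "b", "c"]
TIERS = ["public", "private", "data"]


def plan_subnets(vpc_cidr, tiers, azs, new_prefix=20):
    """Assign one equally sized subnet to every (tier, AZ) pair, in order."""
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)
    plan = {}
    for tier in tiers:
        for az in azs:
            # Raises StopIteration if the VPC is too small for the layout.
            plan[(tier, az)] = next(blocks)
    return plan


plan = plan_subnets(VPC_CIDR, TIERS, AZS)
for (tier, az), subnet in plan.items():
    print(f"{tier:<8} az-{az}  {subnet}  ({subnet.num_addresses} addresses)")
```

With these numbers the nine subnets use 9 of the 16 available /20 blocks, which leaves headroom for another layer later without re-addressing the VPC.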

Final Thoughts

People complain that AWS is a steep learning curve.

It is and it isn’t… it depends on the scale of the operation. Doing AWS at Scale is a steep learning curve, but it’s worth the effort. Cloud Platform roles are becoming more and more popular, and a solid understanding of AWS at Scale within the enterprise is critical to performing them successfully.

Thank you for reading this post. If you like it, please subscribe below and share with your network. This is a side quest post but I’ll be getting back to my main AWS at Scale series soon.
