AWS at Scale #3: Platform Concepts at Scale
To understand, design and operate AWS at Scale, you’ll need a good grasp of AWS platform concepts at scale.
Introduction
To understand, design and operate AWS at Scale, you’ll need a good grasp of AWS platform concepts. If you’re coming from the startup world or an SME, these concepts can be difficult to grasp.
I also want to give you an understanding of the build sequence when designing and deploying AWS platform configuration at scale.
So let’s get started with your very first AWS account!
An AWS Management Account
Let’s start with the ‘outer’ container, the outer shell so to speak…
If you’re a beginner to AWS and you’ve created your first account and started deploying resources into it, great, well done - you’re on your way 🎉
If you are building a product or platform that is growing and expanding and you are still deploying into your first and only AWS account, stop there and think ✋.
Think about how you might want to manage your AWS platform as it scales. Why? Because the account in which you enable AWS Organisations becomes your ‘management account’, and enabling it in an account that is already full of AWS resources is a no-go. Don’t do it.
Don’t deploy everything in a single account - AWS at Scale
Read this post for more information:
Q. What should I do instead?
Create an additional (clean) AWS account and treat it as your management account. A management account should be so clean that even the thought of deploying even the tiniest of workloads into it will make you shiver with disgust.
Avoid deploying any centralised services into your clean AWS management account. Even AWS is enabling the deployment of its centralised services outside the management account (Amazon GuardDuty is one example). AWS uses the term ‘delegated administrator account’ to describe this practice. By leveraging delegated administrator accounts (additional AWS accounts with a specific purpose), you remove the need for administrators to log into your management account for day-to-day admin tasks. Your goal here is simple: keep everything you can out of your AWS management account. As your AWS platform scales, the management account becomes the crown jewel of your AWS organisation.
I want to reiterate that last point, as I see this so often in organisations that have grown in lockstep with AWS. The key takeaway: think about who you want logging into your management account. Delegate core services and admin outside of it as much as possible. In large enterprises you’ll see dedicated AWS accounts for security, core networking, centralised logging, auditing, and break glass (more on that one later).
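To make the delegated administrator idea a bit more concrete, here is a minimal sketch. It assumes you pass in something that behaves like `boto3.client("organizations")` and uses its `register_delegated_administrator` call. The account IDs and the account-to-service mapping are placeholders I made up for illustration, and some services (GuardDuty among them) expose their own call for nominating an admin account, so treat this as a starting point rather than a complete rollout.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping of member account ID -> service principals it administers.
# The account IDs are placeholders, not real accounts.
DELEGATIONS: Dict[str, List[str]] = {
    "111111111111": ["config.amazonaws.com"],  # e.g. a security/audit account
    "222222222222": ["sso.amazonaws.com"],     # e.g. an identity account
}


def register_delegated_admins(
    org_client: Any, delegations: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """Register each member account as delegated administrator for its services.

    `org_client` is assumed to behave like boto3.client("organizations"),
    called with credentials from the management account.
    """
    registered = []
    for account_id, principals in delegations.items():
        for principal in principals:
            org_client.register_delegated_administrator(
                AccountId=account_id,
                ServicePrincipal=principal,
            )
            registered.append((account_id, principal))
    return registered
```

Run once from the management account, this is the kind of job that means your admins rarely need to go back in there afterwards.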
Now that’s super clear: once you have created a clean management account, you can enable AWS Organisations and invite your workload accounts (where you’ve been deploying workload resources) when the time is right to do so.
AWS Organisations
Once you have a clean management account, you can enable AWS Organisations (within it) and eventually plan and build your organisational units (more on that later). Organisational units group your AWS accounts logically so that they better reflect the operational setup of your organisation.
Once your OU structure is established, you can apply various policies and permission boundaries across your accounts depending on which OU they sit in. This is very useful for separating business units or divisions, as well as for segregating prod, non-prod and sandbox accounts.
AWS Organisations - AWS at Scale
Think of AWS Organisations as the first step to providing some compliance and structure to your AWS platform at scale. Here you can apply policies at the root level of the organisation as well as at organisational unit level.
Don’t go too far down this rabbit hole, though, without considering AWS Control Tower (which we’ll come to next).
Two important things to cover here if you are working at scale:
Nobody wants to convert an existing AWS account into a management account, but we’ve covered that already.
If you build out a significant deployment of AWS Organisations then you may struggle to integrate AWS Control Tower at a later date (due to the overlap of configuration, rework and general risk management).
If your AWS Organisations setup is somewhat complex, AWS ProServe may recommend a migration to a new AWS Landing Zone. This is primarily down to risk mitigation, as it is a bit of a one-way door.
AWS Control Tower
You’ve got AWS Organisations in place (set up from within your management account). Now it’s time to consider using AWS Control Tower (sooner rather than later for reasons discussed above).
AWS Control Tower is another service that you enable within your clean management account (you can’t delegate it out to another account, so be mindful of who needs to log in and what time-bound access, permissions and policies they need).
You can think of AWS Control Tower as an orchestration and management tool for more opinionated, granular governance and compliance over and above what AWS Organisations offers (under the hood, it leans heavily on AWS Config for detective controls and on SCPs for preventive ones).
AWS Control Tower - AWS at Scale
AWS Organisations and AWS Control Tower work together to provide a well-structured, governed platform that can meet specific governance and compliance regulations. Control Tower has good continuous compliance monitoring capabilities, automated remediation, pre-configured controls and preventive measures to ensure all accounts, and the workloads and resources within them, meet governance and compliance objectives.
AWS Control Tower also includes integrated identity and access management (IAM/SSO) for your AWS Organisation and member accounts. AWS Account Factory for vending new accounts is also included (although I’m against admins logging into the management account to use it), as is AWS SSO/Identity Centre configuration (which can be delegated out to a different account).
Worth noting that AWS Control Tower does not mandate that all accounts be governed by it. AWS member accounts (more on them later) need to be enrolled into Control Tower or be part of an OU that is registered and governed by AWS Control Tower. For example, your management account isn’t governed by Control Tower. I can’t think of a use case where you wouldn’t bring all your member accounts in but, hey ho, there is duality on offer here.
You’ll also frequently hear the term ‘Landing Zone’ when working with AWS Control Tower, best to think of this in 2 ways:
A Foundation Landing Zone:
AWS Organisations and AWS Control Tower together provide a foundation Landing Zone, which some people also call a ‘control plane’. This is your core AWS Organisation, Control Tower, policies, roles, SSO, vending, and delegated core accounts (more on that later).
In some large organisations there is more than one of these:
A dev LZ is a mirror of a production landing zone and is used to test high-level global configurations that cannot be tested in isolation. This landing zone doesn’t host any live AWS member accounts for workloads. It is hosted in its own management account.
A production LZ that hosts all the member accounts (again, in some very large AWS deployments you’ll see more than one of these, and they may also be known as consumer landing zones). Each production landing zone (aka consumer landing zone) is also hosted in its own management account.
Workload Landing Zones:
If you get account vending working correctly with good account baselines, controls, automated governance and compliance at scale, time-bound SSO access, VPC vending and environment segregation (at account level), then you are basically building workload landing zones every time you vend.
Workload landing zones can be considered as an opinionated template for your workloads where everything plumbs together to ensure that the development team are building within the right controls, segregation and guard rails.
AWS Organisation Units
I’ve dropped these in after AWS Control Tower rather than with AWS Organisations. In a perfect world you’ll want to design your AWS Organisation Units (your operational structure) along with AWS Organisations and AWS Control Tower, not before and not as an afterthought.
AWS Organisational Units - AWS at Scale
There’s a lot to consider here (it’s not as easy as you may think..) for example:
Where are you going to put mergers and acquisitions?
Where are migrated accounts going (if you have any from a previous organisation)?
What does your standard operating environment OU structure look like for net new and refactored account vends?
How are you splitting up Prod and Non-Prod accounts?
Where are you putting shared accounts for workload utilities or lower tier workloads?
Where are sandbox accounts going?
What about stale accounts?
What about accounts that are being divested?
What about accounts that are being audited for suspicious activity?
Where are the core accounts going (SSO, Networking, Shared Services)?
Where are the native Control Tower vended accounts going?
Where are shared services accounts going?
Hence the need to plan them with AWS Organisations, AWS Control Tower and the future of your business in mind and avoid any future, painful rework.
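One way to force yourself to answer those questions is to write the OU structure down as data before you build anything. The sketch below does that with a nested dictionary and then walks it to create the OUs. The layout and OU names are purely hypothetical (one possible set of answers, not a recommendation), and `org_client` is assumed to behave like `boto3.client("organizations")`, whose `create_organizational_unit` call returns the new OU's ID.

```python
from typing import Any, Dict, List

# Hypothetical OU layout: each key is an OU name, each value its child OUs.
OU_LAYOUT: Dict[str, dict] = {
    "Core": {},
    "Workloads": {"Prod": {}, "NonProd": {}},
    "Sandbox": {},
    "Migrated": {"Acquisitions": {}, "LegacyOrganisations": {}},
    "Departing": {"Stale": {}, "Divesting": {}, "UnderAudit": {}},
}


def ou_paths(tree: Dict[str, dict], prefix: str = "") -> List[str]:
    """Flatten the nested layout into paths such as '/Workloads/Prod'."""
    paths = []
    for name, children in tree.items():
        path = f"{prefix}/{name}"
        paths.append(path)
        paths.extend(ou_paths(children, path))
    return paths


def create_ou_tree(
    org_client: Any, parent_id: str, tree: Dict[str, dict], prefix: str = ""
) -> Dict[str, str]:
    """Create every OU under `parent_id`, parents first; return path -> OU ID."""
    created = {}
    for name, children in tree.items():
        response = org_client.create_organizational_unit(ParentId=parent_id, Name=name)
        ou_id = response["OrganizationalUnit"]["Id"]
        path = f"{prefix}/{name}"
        created[path] = ou_id
        created.update(create_ou_tree(org_client, ou_id, children, path))
    return created
```

Reviewing the output of `ou_paths(OU_LAYOUT)` with your stakeholders is a cheap way to spot a missing home for, say, divestments before anything exists in AWS.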
AWS Member Accounts
So far, in sequential order we have the following covered:
Your first AWS account (management account).
AWS Organisations.
AWS Control Tower.
AWS Organisational Units (which are logical account groupings of AWS Organisations)
Now we have a great baseline for any AWS member account that you might want to vend for specific projects. What’s an AWS member account?
AWS member accounts are any accounts that are not your management account.
AWS Member Accounts - AWS at Scale
There are many different types of member accounts that will need to be deployed within your AWS Organisation and which will ultimately fall under the management of AWS Control Tower. Here are a few:
Foundational Landing Zone core accounts: Core Network, Security, Auditing, Logging, Identity, Shared Services
Workload accounts (depending on tier): Prod, Non Prod, Dev, Stage, Standalone
Developer accounts: Sandbox
Infrastructure accounts: Native Cloud intelligence dashboards, SaaS Integrations
Migrated accounts: coming in from acquisitions, coming in from legacy AWS Organisations
Departing accounts: Divestments, Time bound sandbox, Under audit
This should give you a good understanding of all the different types of account patterns you may need to cater for. The good thing is that if you’ve planned for this during your AWS Organisations and AWS Control Tower designs, you’ll be in a good place to cater for them, but you also need to be very opinionated and decisive about what comes in through that front door!
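Being opinionated about the front door can be as simple as refusing any vend request that doesn't match a known account pattern. Here's a rough sketch of that idea. The pattern names, OU paths and the `VendRequest` fields are all hypothetical, invented for this example rather than taken from any AWS API.

```python
from dataclasses import dataclass

# Hypothetical mapping of account pattern -> target OU path.
OU_FOR_PATTERN = {
    "core": "/Core",
    "workload-prod": "/Workloads/Prod",
    "workload-nonprod": "/Workloads/NonProd",
    "sandbox": "/Sandbox",
    "migrated": "/Migrated",
    "departing": "/Departing",
}


@dataclass(frozen=True)
class VendRequest:
    """A request for a new or incoming member account (illustrative fields only)."""
    name: str
    pattern: str
    owner_email: str


def place_account(request: VendRequest) -> str:
    """Return the OU path for a request, or raise if it doesn't fit a known pattern."""
    if request.pattern not in OU_FOR_PATTERN:
        raise ValueError(f"Unknown account pattern: {request.pattern!r}")
    if "@" not in request.owner_email:
        raise ValueError("Every account needs a valid owner email")
    return OU_FOR_PATTERN[request.pattern]
```

Anything that raises here goes back to the requester for a conversation, which is exactly the point.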
Quick reminder that I’ll be covering all these accounts and other general stuff throughout my AWS at Scale series here (don’t forget to subscribe!)
AWS Regions
Now we are getting into the overall workload architecture within an AWS member account.
It’s worth noting that AWS Control Tower allows you to design and configure for supported regions in your foundational landing zone which then rolls into the support for your workload landing zones.
It’s important to get a grasp of the regions you’ll be supporting during your initial AWS Organisations and AWS Control Tower design. You don’t need to support all of them, especially if you are deploying east-west traffic inspection and other expensive services to govern network and cross-VPC communications. The same applies if you have governance and compliance regulations that prohibit provisioning certain services and resources in particular regions: violating them will get your ass fined.
Typically you can consolidate regions into something more manageable and then raise exceptions when needed through good change and risk management.
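A common way to enforce a consolidated region list is a service control policy that denies requests outside your approved regions, using the `aws:RequestedRegion` condition key. The sketch below just builds that policy document as JSON. The region list and the exempted global-service actions are example values I've assumed for illustration, so check them against the services you actually use before attaching anything.

```python
import json
from typing import Dict, List


def region_deny_scp(allowed_regions: List[str], exempt_actions: List[str]) -> Dict:
    """Build an SCP document that denies requests outside the approved regions.

    `exempt_actions` lists actions for global services that should stay usable
    regardless of region; they go into NotAction so the deny skips them.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                "NotAction": exempt_actions,
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": allowed_regions}
                },
            }
        ],
    }


# Example values only: two approved regions and a handful of global services.
policy = region_deny_scp(
    allowed_regions=["eu-west-1", "eu-west-2"],
    exempt_actions=["iam:*", "organizations:*", "route53:*", "cloudfront:*", "sts:*", "support:*"],
)
print(json.dumps(policy, indent=2))
```

Exceptions raised through change and risk management then become a reviewed edit to `allowed_regions`, not a one-off console change.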
AWS VPCs
Designing and deploying VPCs at scale is really important. You’ll need a reference architecture with consistency and strict limits on what your workload teams can change (if anything). You’ll also need the ability to provision accounts with or without VPCs depending on the requirement.
VPC design is a significant component of AWS at Scale. In your reference design you’ll want the following features:
Consistent reference architecture based vending
Centralised logging
A good solution for IP availability for serverless and microservices (they are worse than Teams eating up RAM)
Eliminate NACLs and use blackhole routing (NACLs are fecking awful at scale)
Strong opinionated SCPs and Controls
Prohibit VPC peering
Well thought out subnet layers
IPAM management of internal routable IPs (if needed)
Integration with Transit Gateway and east west inspection (if you are deploying that)
A good solution for how you connect services together without VPC peering (covered in a separate post)
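As an example of one of those opinionated SCPs, here is a sketch of a policy document that prohibits VPC peering by denying the two EC2 actions involved, `ec2:CreateVpcPeeringConnection` and `ec2:AcceptVpcPeeringConnection`. The exemption for a platform role is an assumption on my part, and the role name is hypothetical. Drop the condition if you want no exceptions at all.

```python
import json
from typing import Dict, Optional


def deny_vpc_peering_scp(exempt_role_arn_pattern: Optional[str] = None) -> Dict:
    """Build an SCP document that denies creating or accepting VPC peering.

    If `exempt_role_arn_pattern` is given, principals matching it are not
    subject to the deny (e.g. a tightly controlled platform role).
    """
    statement = {
        "Sid": "DenyVpcPeering",
        "Effect": "Deny",
        "Action": [
            "ec2:CreateVpcPeeringConnection",
            "ec2:AcceptVpcPeeringConnection",
        ],
        "Resource": "*",
    }
    if exempt_role_arn_pattern:
        statement["Condition"] = {
            "ArnNotLike": {"aws:PrincipalArn": exempt_role_arn_pattern}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


# Hypothetical role name, shown only to illustrate the exemption.
peering_policy = deny_vpc_peering_scp("arn:aws:iam::*:role/platform-network-admin")
print(json.dumps(peering_policy, indent=2))
```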
AWS Availability Zones
Heavily dependent on a tiering model and tightly coupled with your VPC vending. I wouldn’t deploy a reference architecture to anything less than 3 AWS Availability Zones.
AWS Availability Zones - AWS at Scale
I’ve been caught out by this in the past: sometimes AWS reduces the number of AZs available to new AWS accounts (although you get plenty of notice).
Understanding availability zones is an essential step towards AWS at Scale, especially if you are tiering as part of your overall architecture engagement. Understanding how AWS provides uptime through compute and storage is a solid place to start.
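Given that the AZs on offer can differ per account, it's worth checking before you vend a VPC. The following is a small sketch of a pre-flight check. It assumes `ec2_client` behaves like `boto3.client("ec2")` for the target account and region and relies on its `describe_availability_zones` response. The minimum of three simply mirrors my rule of thumb above.

```python
from typing import Any, List


def usable_zone_names(ec2_client: Any) -> List[str]:
    """Return the names of standard Availability Zones currently available."""
    response = ec2_client.describe_availability_zones()
    return sorted(
        zone["ZoneName"]
        for zone in response["AvailabilityZones"]
        if zone.get("State") == "available"
        # Skip Local Zones and Wavelength Zones if the account has opted in to any.
        and zone.get("ZoneType", "availability-zone") == "availability-zone"
    )


def require_min_zones(ec2_client: Any, minimum: int = 3) -> List[str]:
    """Raise if fewer than `minimum` AZs are usable; otherwise return them."""
    zones = usable_zone_names(ec2_client)
    if len(zones) < minimum:
        raise RuntimeError(
            f"Only {len(zones)} usable AZs found, need at least {minimum}: {zones}"
        )
    return zones
```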
AWS VPC Subnets
And that brings us to our last platform concept, subnets (which we have pretty much covered in the VPC section above).
Obviously your subnets will cover all your availability zones and you need to design for this accordingly.
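To show what designing for this can look like, here's a minimal sketch that carves a VPC CIDR into one subnet per layer per availability zone using Python's standard `ipaddress` module. The layer names, zone names, CIDR and prefix length are example values I've assumed. Your reference architecture will have its own.

```python
import ipaddress
from typing import Dict, List, Tuple


def plan_subnets(
    vpc_cidr: str, layers: List[str], zones: List[str], new_prefix: int
) -> Dict[Tuple[str, str], str]:
    """Allocate one equally sized subnet for every (layer, zone) pair.

    Blocks are handed out in order: all zones for the first layer, then the
    next layer, and so on. Raises ValueError if the VPC is too small.
    """
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = {}
    for layer in layers:
        for zone in zones:
            block = next(blocks, None)
            if block is None:
                raise ValueError(f"{vpc_cidr} has no /{new_prefix} left for {layer}/{zone}")
            plan[(layer, zone)] = str(block)
    return plan


# Example values only: three subnet layers across three AZs in a /16.
subnet_plan = plan_subnets(
    vpc_cidr="10.0.0.0/16",
    layers=["public", "private", "data"],
    zones=["eu-west-1a", "eu-west-1b", "eu-west-1c"],
    new_prefix=20,
)
for (layer, zone), cidr in subnet_plan.items():
    print(f"{layer:8} {zone}  {cidr}")
```

Three layers across three zones consumes nine of the sixteen /20 blocks in this example, which leaves headroom for a fourth layer later.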
AWS VPC Subnets - AWS at Scale
Final Thoughts
People complain that AWS is a steep learning curve.
It is and it isn’t… it depends on the scale of the operation. Doing AWS at Scale is a steep learning curve, but it’s worth the effort. Cloud Platform roles are becoming more and more popular, and to perform this role successfully, a solid understanding of AWS at Scale within the enterprise is critical.
Thank you for reading this post, if you like it please subscribe below and share with your network.