- Cost effective - Pay on demand
- Global - Launch anywhere in the world
- Secure - Cloud services can be secure by default
- Reliable - Data backup, disaster recovery, data replication, fault tolerance
- Scalable - Increase or decrease resources and services based on demand
- Elastic - Automate scaling during spikes and drop in demand
- Current - The underlying hardware and managed software is patched, upgraded and replaced by the cloud provider without interruption to you
The AWS Global Infrastructure is globally distributed hardware and datacenters that are physically networked together to act as one large resource for the end customer
When you choose a region there are four factors you need to consider:
- What Regulatory Compliance does this region meet?
- What is the cost of AWS services in this region?
- What AWS services are available in this region?
- What is the distance or latency to my end-users?
AWS scopes the AWS Management Console to a selected Region. This determines where an AWS service will be launched and what will be seen within an AWS service's console (you generally don't explicitly set the Region for a service at the time of creation).
Some AWS Services operate across multiple regions and the region will be set to "Global" - e.g. Amazon S3, CloudFront, Route53, IAM.
For these global services at the time of creation:
- There is no concept of region (IAM user)
- A single region must be explicitly chosen (S3 Bucket)
- A group of regions is chosen (CloudFront Distribution)
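As a minimal illustration (not from the original notes) of how this region scoping carries over to programmatic access, here is a boto3 sketch; the Region names are arbitrary placeholders:

```python
import boto3

# Regional service: the client is scoped to the Region you pass in,
# and the resources it creates live in that Region
ec2 = boto3.client("ec2", region_name="ca-central-1")

# Global service: IAM has a single global endpoint, so the Region
# setting does not change where an IAM user "lives"
iam = boto3.client("iam", region_name="us-east-1")
print([u["UserName"] for u in iam.list_users()["Users"]])
```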
- A region has multiple Availability Zones. An AZ is made up of 1 or more datacenters.
- All AZs in an AWS Region are interconnected with high-bandwidth, low-latency networking, over fully redundant, dedicated metro fiber, providing high-throughput, low-latency networking between AZs
- All traffic between AZs is encrypted
- AZs are within 100 km (60 miles) of each other
What is a fault domain? A fault domain is a section of a network that is vulnerable to damage if a critical device or system fails. The purpose of a fault domain is that if a failure occurs it will not cascade outside that domain, limiting the damage possible.
What is a fault level? A fault level is a collection of fault domains
The scope of a fault domain could be:
- Specific servers in a rack
- An entire rack in a datacenter
- An entire room in a datacenter
- The entire datacenter in a building
It's up to the Cloud Service Provider (CSP) to define the boundaries of the domain.
Each AWS Region is designed to be completely isolated from the other AWS Regions.
- This achieves the greatest possible fault tolerance and stability.
Each AZ is isolated, but the AZs in a Region are connected through low-latency links
Each AZ is designed as an independent failure zone
A "Failure Zone" is AWS describing a Fault Domain
Failure Zone:
- AZs are physically separated within a typical metropolitan region and are located in lower-risk flood plains
- Discrete uninterruptible power supply (UPS) and onsite backup generation facilities
- Data centers located in different AZs are designed to be supplied by independent substations to reduce the risk of an event on the power grid impacting more than one AZ
- AZs are all redundantly connected to multiple tier-1 transit providers
Multi-AZ for High Availability: If an application is partitioned across AZs, companies are better isolated and protected from issues such as power outages, lightning strikes, tornadoes, earthquakes, and more.
The AWS Global Network represents the interconnections between the AWS Global Infrastructure, commonly referred to as "the AWS Backbone". Think of it as a private expressway, where things can move very fast between datacenters.
- Edge Locations: can act as on and off ramps to the AWS Global Network
- AWS Global Accelerator, Amazon S3 Transfer Acceleration: use Edge Locations as an on-ramp to quickly reach AWS resources in other regions by traversing the fast AWS Global Network
- Amazon CloudFront (CDN): uses Edge Locations as an off-ramp, to provide storage and compute at the edge, near the end user
- VPC Endpoints: ensure your resources stay within the AWS Network and do not traverse the public internet
- A Point of Presence (PoP) is an intermediate location between an AWS Region and the end user, and this location could be a datacenter or a collection of hardware.
- For AWS, a Point of Presence is a data center owned by AWS or a trusted partner that is utilized by AWS services for content delivery or expedited upload.
PoP resources are: • Edge Locations • Regional Edge Caches
Edge Locations are datacenters that hold a cache (copy) of the most popular files (eg. web pages, images and videos) so that the delivery distance to end users is reduced
Regional Edge Caches are datacenters that hold much larger caches of less-popular files to reduce a full round trip and also to reduce the cost of transfer fees.
Scenario:
- S3 Bucket (Your Origin)
- Edge Location & Regional Edge Cache
- Edge Location (left and right path)
- End User (left and right path)
- PoPs live at the edge/intersection of two networks
- Tier 1 network is a network that can reach every other network on the Internet without purchasing IP transit or paying for peering.
- AWS Availability Zones are all redundantly connected to multiple tier-1 transit providers
The following AWS Services use PoPs for content delivery or expedited upload:
Amazon CloudFront is a Content Delivery Network (CDN) service that:
- You point your website to CloudFront so that it will route requests to the nearest Edge Location cache
- Allows you to choose an origin (such as a web server or storage) that will be the source of the cached content
- Caches the content the origin would return at various Edge Locations around the world
Amazon S3 Transfer Acceleration allows you to generate a special URL that can be used by end users to upload files to a nearby Edge Location.
- Once a file is uploaded to an Edge Location, it can move much faster within the AWS Network to reach S3.
AWS Global Accelerator can find the optimal path from the end user to your web-servers
- Global Accelerator endpoints are deployed within Edge Locations, so you send user traffic to an Edge Location instead of directly to your web application.
AWS Direct Connect is a private/dedicated connection between your datacenter, office, co-location and AWS.
Direct Connect has two very-fast network connection options:
- Lower Bandwidth: 50 Mbps - 500 Mbps
- Higher Bandwidth: 1 Gbps or 10 Gbps
A co-location (aka carrier hotel) is a datacenter where equipment, space, and bandwidth are available for rental to retail customers
- Helps reduce network costs and increase bandwidth throughput. (great for high traffic networks)
- Provides a more consistent network experience than a typical internet-based connection. (reliable and secure)
- Direct Connect Locations are trusted partnered datacenters where you can establish a dedicated, high-speed, low-latency connection from your on-premises environment to AWS.
- eg: You would use the AWS Direct Connect service to order and establish a connection to a company datacenter
- Local Zones are datacenters located very close to a densely populated area to provide single-digit millisecond low-latency performance (eg - 7ms) for that area. The purpose of Local Zones is to support highly demanding applications sensitive to latency:
- Media & Entertainment
- Electronic Design Automation
- AdTech
- Machine Learning
To use Local Zones you need to Opt-in
- AWS Wavelength Zones allows for edge-computing on 5G Networks. Applications will have ultra-low latency being as close as possible to the users
- AWS has partnered with various Telecom companies to utilise their 5G networks (Verizon, KDDI, Vodafone)
Scenario: You create a Subnet tied to a Wavelength Zone and then you can launch Virtual Machines (VMs) to the edge of the targeted 5G Networks.
What is Data Residency? The physical or geographic location of where an organization or cloud resources reside.
What are Compliance Boundaries? A regulatory compliance (legal requirement) by a government or organisation that describes where data and cloud resources are allowed to reside
What is Data Sovereignty? Data Sovereignty is the jurisdictional control or legal authority that can be asserted over data because its physical location is within jurisdictional boundaries
For workloads that need to meet compliance boundaries that strictly define the data residency of data and cloud resources in AWS, you can use:
- AWS Config is a Policy as Code service. You can create rules to continuously check AWS resource configurations. If they deviate from your expectations you are alerted, or AWS Config can in some cases auto-remediate.
- IAM Policies can be written to explicitly deny access to specific AWS Regions. Service Control Policies (SCPs) are permissions applied organisation-wide.
- AWS Outposts is a physical rack of servers that you can put in your data center. Your data will reside wherever the Outpost physically resides
What is GovCloud? A Cloud Service Provider (CSP) generally will offer an isolated region to run FedRAMP workloads.
- AWS GovCloud Regions allow customers to host sensitive Controlled Unclassified Information and other types of regulated workloads.
The Climate Pledge: Amazon co-founded the Climate Pledge to achieve net-zero carbon emissions by 2040 across all of Amazon's business (this includes AWS)
AWS Cloud's Sustainability goals are composed of three parts:
- Renewable Energy
- AWS is working towards having their AWS Global Infrastructure powered by 100% renewable energy by 2025.
- Cloud Efficiency
- AWS's infrastructure is 3.6 times more energy efficient than the median of U.S. enterprise data centers surveyed.
- Water Stewardship
- Direct evaporative technology to cool our data center
- Use of non-potable water for cooling purposes (recycled water)
- On-site water treatment allows us to remove scale-forming minerals and reuse water for more cycles
- Water efficiency metrics to determine and monitor optimal water use for each AWS Region
AWS purchases and retires environmental attributes to cover the non-renewable energy for AWS Global Infrastructure: • Renewable Energy Credits (RECs) • Guarantees of Origin (GOs)
AWS Ground Station is a fully managed service that lets you control satellite communications, process data, and scale your operations without having to worry about building or managing your own ground station infrastructure
Use cases for Ground Station:
- weather forecasting
- surface imaging
- communications
- video broadcasts
To use Ground Station:
- You schedule a Contact (select satellite, start and end time, and the ground location)
- Use the AWS Ground Station EC2 AMI to launch EC2 instances that will uplink and downlink data during the contact or receive downlinked data in an Amazon S3 bucket.
Scenario: A company reaches an agreement with a Satellite Imagery Provider to take satellite photos of a specific region. They use AWS Ground Station to communicate with that company's satellite and download the image data to S3.
- AWS Outposts is a fully managed service that offers the same AWS infrastructure, AWS services, APIs, and tools to virtually any datacenter, co- location space, or on-premises facility for a truly consistent hybrid experience.
- AWS Outposts is a rack of servers running AWS Infrastructure in your physical location
What is a Server Rack?
- A frame designed to hold and organise IT equipment.
Rack Heights
- U stands for "rack units" or "U spaces", which is equal to 1.75 inches. The industry-standard rack size is 48U (7-foot rack). A full-size rack cage is 42U high
- Equipment is typically: 1U, 2U, 3U, or 4U high
AWS Outposts comes in 3 form factors: 42U, 1U and 2U
- 42U: AWS delivers it to your preferred physical site fully assembled and ready to be rolled into final position. It is installed by AWS and the rack needs to be simply plugged into power and network.
- 1U: suitable for 19-inch wide, 24-inch deep cabinets. AWS Graviton2 (up to 64 vCPUs), 128 GiB memory, 4 TB of local NVMe storage
- 2U: suitable for 19-inch wide, 36-inch deep cabinets. Intel processor (up to 128 vCPUs), 256 GiB memory, 8 TB of local NVMe storage
What is a Solutions Architect?
A role in a technical organisation that architects a technical solution using multiple systems via research, documentation and experimentation.
What is a Cloud Architect?
A solutions architect that is focused solely on architecting technical solutions using cloud services
A cloud architect needs to understand the following terms and factor them into their designed architecture based on the business requirements:
- Availability - your ability to ensure a service remains available (HA)
- Scalability - your ability to grow rapidly or unimpeded
- Elasticity - your ability to shrink and grow to meet the demand
- Fault Tolerance - your ability to prevent a failure
- Disaster Recovery - your ability to recover from a failure (Highly Durable)
A Solution Architect needs to always consider the following business factors:
- (Security) How secure is this solution?
- (Cost) How much is this going to cost?
Scalability: your ability to increase your capacity based on the increasing demand for traffic, memory and computing power
- Vertical Scaling - scaling up (Upgrade to a bigger server)
- Horizontal Scaling - scaling out (Add more servers of the same size)
High Availability: your ability for your service to remain available by ensuring there is no single point of failure and/or ensuring a certain level of performance
Service usage:
- Elastic Load Balancer: A load balancer allows you to evenly distribute traffic to multiple servers in one or more datacenters. If a datacenter or server becomes unavailable (unhealthy) the load balancer will route the traffic to only available datacenters with servers.
Running your workload across multiple AZs ensures that if 1 or 2 AZs become unavailable your service/applications remain available.
Elasticity: your ability to automatically increase or decrease your capacity based on the current demand for traffic, memory and computing power
Horizontal Scaling:
- Scaling Out - Add more servers of the same size
- Scaling In - Removing underutilised servers of the same size
Service usage:
- Auto Scaling Groups (ASG): an AWS feature that will automatically add or remove servers based on scaling rules you define based on metrics
Vertical Scaling is generally hard for traditional architecture so you'll usually only see horizontal scaling described with Elasticity
Fault Tolerance: your ability for your service to ensure there is no single point of failure, preventing the chance of failure
A fail-over is when you have a plan to shift traffic to a redundant system in case the primary system fails
Service usage:
- RDS Multi-AZ: is when you run a duplicate standby database in another AZ in case your primary database fails.
A common example is having a copy (secondary) of your database where all ongoing changes are synced. The secondary system is not in-use until a fail-over occurs and it becomes the primary database.
A business continuity plan (BCP) is a document that outlines how a business will continue operating during an unplanned disruption in services
- Recovery Time Objective (RTO)
- Recovery Point Objective (RPO)
Recovery Time Objective (RTO) is the maximum delay between the interruption of service and restoration of service. This objective determines what is considered an acceptable time window when service is unavailable and is defined by the organisation.
- Recovery Point Objective (RPO) is the maximum acceptable amount of time since the last data recovery point.
- This objective determines what is considered an acceptable loss of data between the last recovery point and the interruption of service and is defined by the organisation
(covers what API endpoints there are to make requests to; not commonly dealt with)
- The AWS Management Console is a web-based unified console
- Build, manage, and monitor everything from simple web apps to complex cloud deployments
- Every AWS Account has a unique Account ID.
- Account ID can be easily found by dropping down the current user in the global navigation
The AWS Account ID is used:
- When logging in with a non-root user account
- Cross-account roles
- Support cases
- Generally good practice to keep your Account ID private as it is one of many components used to identify an account for attack by a malicious actor
- Amazon Resource Names (ARN) uniquely identify AWS resources.
ARNs have the following formats: arn:partition:service:region:account-id:resource-id
arn:partition:service:region:account-id:resource-type/resource-id
arn:partition:service:region:account-id:resource-type:resource-id
Partition:
- aws: AWS Regions
- aws-cn: AWS China Regions
- aws-us-gov: AWS GovCloud (US) Regions
Service - Identifies the service:
- ec2
- s3
- iam
Region - which AWS Region
- us-east-1
- ca-central-1
Account ID
- 121212121212
- 123456789012
Resource ID:
- user/Bob
- instance/i-1234567890abcdef0
In the AWS Management Console it's common to be able to copy the ARN to your clipboard
arn:aws:s3:::my-bucket
- Resource ARNs can include a path
- Paths can include a wildcard character, namely an asterisk (*)
IAM Policy ARN Path
arn:aws:iam::123456789012:user/Development/product_1234
S3 ARN Path
arn:aws:s3:::my_corporate_bucket/Development
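Because the format is colon-delimited, an ARN can be taken apart programmatically. A small Python sketch (illustrative only; the sample ARN mirrors the IAM example above):

```python
# Split an ARN into its documented components:
# arn:partition:service:region:account-id:resource
arn = "arn:aws:iam::123456789012:user/Development/product_1234"

# maxsplit=5 keeps any colons inside the resource part intact
_, partition, service, region, account_id, resource = arn.split(":", 5)

print(partition)   # aws
print(service)     # iam
print(region)      # '' (empty: IAM is a global service)
print(account_id)  # 123456789012
print(resource)    # user/Development/product_1234
```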
The AWS Command Line Interface (CLI) allows users to programmatically interact with the AWS API via entering single or multi-line commands into a shell or terminal
- The AWS CLI is a Python executable program
- Python is required to install the AWS CLI
- A SDK is a collection of software development tools in one installable package
- You can use an AWS SDK to programmatically create, modify, delete or interact with AWS resources
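As a hedged example of SDK usage, here is a minimal boto3 (the AWS SDK for Python) sketch that lists S3 buckets; credentials are assumed to come from your environment, shared config files, or an attached IAM role:

```python
import boto3

# Nothing is hard-coded here: boto3 resolves credentials from the
# environment, ~/.aws/config, or an instance role
s3 = boto3.client("s3", region_name="us-east-1")

# List all S3 buckets in the account
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])
```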
AWS CloudShell is a browser-based shell built into the AWS Management Console. (AWS CloudShell is scoped per region. Same credentials as the logged-in user. Free service)
Preinstalled Tools
- AWS CLI, Python, Node.js, git, make, pip, sudo, tar, tmux, vim + more
Storage included
- 1GB of storage free per AWS Region
Saved files and settings
- Files saved in your home directory are available in future sessions for the same AWS Region
Shell Environments
- Bash, PowerShell, Zsh
*AWS CloudShell is available in select regions
You write a configuration script to automate creating, updating or destroying cloud infrastructure
- IaC is a blueprint of your infrastructure
- IaC allows you to easily share, version or inventory your Cloud Infrastructure
AWS CloudFormation (CFN) - CFN is a Declarative IaC tool
- What you see is what you get (explicit)
- More verbose, but zero chance of mis-configuration
- Uses scripting languages (JSON, YAML)
AWS Cloud Development Kit (CDK) - CDK is an Imperative IaC tool
- You say what you want, and the rest is filled in (implicit)
- Less verbose, you could end up with mis-configuration
- Does more than Declarative
- Uses programming languages
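To make the contrast concrete, here is a minimal CDK v2 sketch in Python (illustrative; assumes `aws-cdk-lib` is installed, and the stack and bucket names are placeholders) — a few lines of imperative code that synthesise a full CloudFormation template:

```python
# app.py — run `cdk synth` / `cdk deploy` against this file
from aws_cdk import App, Stack
from aws_cdk import aws_s3 as s3
from constructs import Construct

class MyStack(Stack):
    def __init__(self, scope: Construct, id: str) -> None:
        super().__init__(scope, id)
        # One line of code; CDK generates the verbose
        # CloudFormation JSON/YAML for you
        s3.Bucket(self, "MyBucket", versioned=True)

app = App()
MyStack(app, "MyStack")
app.synth()
```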
Access Keys are a key and secret required for programmatic access to AWS resources when interacting with the AWS API outside of the AWS Management Console
The Shared Responsibility Model is a cloud security framework that defines the security obligations of the customer versus the Cloud Service Provider (CSP - e.g: AWS)
- Customers are responsible for Security IN the cloud
- AWS is responsible for Security OF the cloud
The Shared Responsibility Model is a simple visualisation that helps determine what the customer is responsible for and what the CSP is responsible for related to AWS.
- The customer is responsible for the data and the configuration of access controls that reside in AWS.
- The customer is responsible for the configuration of cloud services and granting access to users via permissions.
- CSP is generally responsible for the underlying Infrastructure
- Responsibility of in the cloud: If you can configure or store it then you (the customer) are responsible for it.
- Responsibility of the cloud: If you can not configure it then CSP is responsible for it.
Elastic Compute Cloud (EC2) allows you to launch Virtual Machines (VM)
What is a Virtual Machine?
A Virtual Machine (VM) is an emulation of a physical computer using software.
Server Virtualization allows you to easily: create, copy, resize or migrate your server.
Multiple VMs can run on the same physical server so you can share the cost with other customers
(Imagine if your server was an executable file on your computer)
EC2 is a highly configurable server, where you can choose an AMI and instance options that affect:
- The amount of CPUs
- The amount of Memory (RAM)
- The amount of Network Bandwidth
- The OS
- Attach multiple virtual hard-drives for storage (eg - Elastic Block Store)
(An Amazon Machine Image is a predefined configuration for a Virtual Machine)
EC2 is considered the backbone of AWS because the majority of AWS services use EC2 as their underlying servers (eg - S3, RDS, DynamoDB, Lambda)
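As a minimal sketch of launching a VM programmatically (assumptions: boto3 installed, credentials configured, and a placeholder AMI ID, since AMI IDs differ per Region):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch one virtual machine; the instance type picks the CPU/RAM
# combination, the AMI picks the OS and preinstalled software
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
)
print(resp["Instances"][0]["InstanceId"])
```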
Computing Services Diagram
The Nitro System - A combination of dedicated hardware and lightweight hypervisor enabling faster innovation and enhanced security. All new EC2 instances use Nitro System.
- Nitro Cards: specialised cards for VPC, EBS and Instance Storage, and a controller card
- Nitro Security Chips: integrated into the motherboard. Protects hardware resources
- Nitro Hypervisor: lightweight hypervisor for memory and CPU allocation, with bare metal-like performance
Bare Metal Instance: you can launch EC2 instance that have no hypervisor so you can run workloads directly on the hardware for maximum performance and control. The M5 and R5 EC2 instances run bare metal.
Bottlerocket: is a Linux-based open-source operating system that is purpose-built by AWS for running containers on Virtual Machines or bare metal hosts.
What is High Performance Computing (HPC)?: a cluster of hundreds of thousands of servers with fast connections between each of them with the purpose of boosting computing capacity.
AWS ParallelCluster: is an AWS-supported open source cluster management tool that makes it easy for you to deploy and manage High Performance Computing (HPC) clusters on AWS
What is Edge Computing?
When you push your computing workloads outside of your networks to run close to the destination location. (eg - Pushing computing to run on phones, IoT Devices, or external servers not within your cloud network)
What is Hybrid Computing?
When you're able to run workloads on both your on-premise datacenter and AWS Virtual Private Cloud (VPC)
AWS Outposts is a physical rack of servers that you can put in your datacenter. AWS Outposts allows you to use AWS API and Services such as EC2 right in your datacenter.
AWS Wavelength allows you to build and launch your applications in a telecom datacenter. By doing this your apps will have ultra-low latency since they will be pushed over a 5G network and be as close as possible to the end user.
VMWare Cloud on AWS allows you to manage on-premise virtual machines using VMWare as EC2 instances. The data-center must be using VMWare for Virtualisation.
AWS Local Zones are edge datacenters located outside of an AWS region so you can use AWS closer to end destination.
AWS Snow Family are storage and compute devices used to physically move data in or out of the cloud when moving data over the internet or a private connection is too slow, difficult or costly
- A database is a data-store that stores semi-structured and structured data
- A database is a more complex data store because it requires using formal design and modelling techniques
Databases can generally be categorised as either:
Relational Databases:
- Structured data that strongly represents tabular data (tables, rows & columns)
Non-Relational Databases:
- Semi-structured data that may or may not distantly resemble tabular data
Databases have a rich set of functionality:
- Specialised language to query (retrieve data)
- Specialised modelling strategies to optimise retrieval for different use cases
- More fine-tuned control over the transformation of the data into useful data structures or reports
Normally a database infers someone is using a relational row-oriented data store
A data warehouse is a relational datastore designed for analytical workloads, which is generally a column-oriented data-store
- Companies will have terabytes and millions of rows of data, and they need a fast way to be able to produce analytics reports.
Data warehouses generally perform aggregation:
- Data warehouses are optimised around columns since they need to quickly aggregate column data
Data warehouses are generally designed to be HOT:
- Hot means they can return queries very fast even though they have vast amounts of data
Data warehouses are infrequently accessed, meaning they aren't intended for real-time reporting but maybe once or twice a day or once a week to generate business and user reports
A data warehouse needs to consume data from a relational database on a regular basis
Virtual Private Cloud (VPC) is a logically isolated section of the AWS Network, where you launch your AWS resources. You choose a range of IPs using CIDR Range.
CIDR Range: A "CIDR range" in the cloud refers to a block of IP addresses defined using Classless Inter-Domain Routing (CIDR) notation
-
Subnets are a logical partition of an IP network into multiple smaller network segments. You are breaking up your IP range for VPC into smaller networks.
-
Subnets need to have a smaller CIDR range than the VPC that represent their portion (eg - Subnet CIDR Range 10.0.0.0/24 = 256 IP Addresses)
-
A Public Subnet is one that can reach the internet
-
A Private Subnet is one that cannot reach the internet
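Python's standard `ipaddress` module can sanity-check this VPC/subnet relationship; a small sketch using the example ranges above:

```python
import ipaddress

# A VPC CIDR range and a smaller subnet carved out of it
vpc = ipaddress.ip_network("10.0.0.0/16")
subnet = ipaddress.ip_network("10.0.0.0/24")

print(vpc.num_addresses)      # 65536
print(subnet.num_addresses)   # 256 (AWS reserves 5 of these per subnet)
print(subnet.subnet_of(vpc))  # True: the subnet fits inside the VPC
```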
Network Access Control Lists (NACLs):
- Acts as a virtual firewall at the subnet level
- You create Allow and Deny rules
- eg: Block a specific IP address known for abuse
Security Groups:
- Acts as a virtual firewall at the instance level
- Implicitly denies all traffic. You create only Allow rules
- eg: Allow an EC2 instance access on port 22 for SSH
- eg: You cannot block a single IP address
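A hedged boto3 sketch of adding a Security Group Allow rule (the group ID and CIDR below are placeholders):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Security Groups only have Allow rules: open port 22 (SSH)
# to a single CIDR range
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",  # placeholder group ID
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "IpRanges": [{"CidrIp": "203.0.113.0/24", "Description": "office"}],
    }],
)
```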
- Elastic Compute Cloud (EC2) is a highly configurable virtual server/machine.
- EC2 is resizable compute capacity. It takes minutes to launch new instances.
- Anything and everything on AWS uses EC2 instances underneath.
What are Instance Families?
- Instance families are different combinations of CPU, Memory, Storage and Networking capacity
- Instance families allow you to choose the appropriate combination of capacity to meet your application's unique requirements
- Different instance families are different because of the varying hardware used to give them their unique properties
- Commonly instance families are called Instance Types but an instance type is a combination of size and family
An instance type is a particular instance size and instance family
A common pattern for instance sizes:
- nano
- micro
- small
- medium
- large
- xlarge
- 2xlarge
- 4xlarge
- 8xlarge
- ...
There are many exceptions to this pattern for sizes:
- c6g.metal: is a bare metal machine
- c5.9xlarge: is not a power-of-2 or even-numbered size
EC2 Instance Sizes generally double in price and key attributes
Dedicated Hosts are single-tenant EC2 instances designed to let you Bring-Your-Own-License (BYOL) based on machine characteristics
EC2 has 3 levels of tenancy
- Dedicated Host: your server lives here and you have control of the physical attributes
- Dedicated Instance: your server always lives here
- Default: your instance lives here until reboot
There are 5 different ways to pay for EC2 (Virtual Machines)
- On-Demand (Least Commitment)
- Spot (Biggest Savings)
- Reserved (Best Long-term)
- Dedicated (Most Expensive)
AWS Savings Plan is another way to save but can be used for more than just EC2
On-Demand is a Pay-As-You-Go (PAYG) model, where you consume compute and then you pay
- When you launch an EC2 instance, it is by default using On-Demand pricing
- On-Demand has no up-front payment and no long-term commitment
- You are charged by the second (min of 60 seconds) of the hour
- When looking up pricing, EC2 pricing will always be shown as the hourly rate
- On-Demand is for applications where the workload is for short-term, spikey or unpredictable
- When you have a new app for development or you want to run an experiment
Reserved Instances (RIs) are designed for applications that have a steady-state, predictable usage, or require reserved capacity
Reduced Pricing is based on Term x Class Offering x RI Attributes x Payment Option
- You commit to a 1 year or 3 year contract
- When it expires, your instance will use On-Demand with no interruption to service
- Standard: up to 75% reduced pricing compared to On-Demand. You can modify RI Attributes
- Convertible: up to 54% reduced pricing compared to On-Demand. You can exchange an RI based on RI Attributes if greater or equal in value
AWS no longer offers Scheduled RIs
- All Upfront: full payment is made at the start of the term
- Partial Upfront: a portion of the cost must be paid upfront and the remaining hours in the term are billed at a discounted hourly rate
- No Upfront: you are billed a discounted hourly rate for every hour within the term, regardless of whether the Reserved Instance is being used
- RIs can be shared between multiple accounts within an AWS Organisation
- Unused RIs can be sold in the Reserved Instance Marketplace
RI Attributes (aka Instance Attributes) are limited based on Class Offering, and can affect the final price of an RI instance.
There are 4 RI Attributes:
- Instance Type: for example, m4.large. This is composed of the instance family (for example, m4) and the instance size (for example, large)
- Region: the region in which the Reserved Instance is purchased
- Tenancy: whether your instance runs on shared (default) or single-tenant (dedicated) hardware
- Platform: the operating system (for example, Windows or Linux/Unix)
When you purchase a RI, you determine the scope of the Reserved Instance
(The scope does not affect the price)
Regional RI:
- Does not reserve capacity
- RI discount applies to instance usage in any AZ in the Region
- RI discount applies to instance usage within the instance family, regardless of size. Only supported on Amazon Linux/Unix Reserved Instances with default tenancy
- You can queue purchases for Regional RIs
Zonal RI:
- Reserves capacity in the specified AZ
- RI discount applies to instance usage in the selected AZ (no AZ flexibility)
- No instance size flexibility. RI discount applies to instance usage for the specified instance type and size only
- You can't queue purchases for Zonal RIs
There is a limit to the number of Reserved Instances that you can purchase per month
Per month you can purchase:
- 20 Regional RIs per Region
- 20 Zonal RIs per AZ
Regional Limits
- You cannot exceed your running On-Demand Instance limit by purchasing Regional RIs. The default On-Demand Instance limit is 20
- Before purchasing RI ensure your On-Demand limit is equal to or greater than your RI you intend to purchase
Zonal Limits
- You can exceed your running On-Demand Instance limit by purchasing Zonal RIs
- If you already have 20 running On-Demand Instances and you purchase 20 Zonal RIs, you can launch a further 20 On-Demand Instances that match the specifications of your Zonal RIs
EC2 instances are backed by different kinds of hardware, and so there is a finite amount of servers available within an AZ per instance type or family
Capacity Reservation is a service of EC2 that allows you to request a reserve of EC2 instance type for a specific Region and AZ
- The reserved capacity is charged at the selected instance type's On-Demand rate whether an instance is running in it or not
- You can also use your Regional RIs with your Capacity Reservations to benefit from billing discounts
There are some key differences between Standard and Convertible
EC2 Reserved Instance Marketplace allows you to sell your unused Standard RIs to recoup your RI spend for RIs you do not intend to use or cannot use
- Reserved Instances can be sold after they have been active for at least 30 days and once AWS has received the upfront payment (if applicable).
- You must have a US bank account to sell Reserved Instances on the Reserved Instance Marketplace.
- There must be at least one month remaining in the term of the Reserved Instance you are listing.
- You will retain the pricing and capacity benefit of your reservation until it's sold and the transaction is complete.
- Your company name (and address upon request) will be shared with the buyer for tax purposes.
- A seller can set only the upfront price for a Reserved Instance. The usage price and other configuration (e.g., instance type, Availability Zone, platform) will remain the same as when the Reserved Instance was initially purchased.
- The term length will be rounded down to the nearest month. For example, a reservation with 9 months and 15 days remaining will appear as 9 months on the Reserved Instance Marketplace.
- You can sell up to $20,000 in Reserved Instances per year. If you need to sell more Reserved Instances, you must contact AWS.
- Reserved Instances in the GovCloud region cannot be sold on the Reserved Instance Marketplace.
AWS has unused compute capacity, and they want to maximise the utility of their idle servers
(It's like when a hotel offers booking discounts to fill vacant suites or planes offer discounts to fill vacant seats)
- Spot Instances provide a discount of up to 90% compared to On-Demand pricing
- Spot Instances can be terminated if the computing capacity is needed by other On-Demand customers
- Designed for applications that have flexible start and end times or applications that are only feasible at very low compute costs
AWS Batch is an easy and convenient way to use Spot Pricing
Termination Conditions
- Instances can be terminated by AWS at anytime
- If your instance is terminated by AWS, YOU DO NOT GET CHARGED for a partial hour of usage
- If you terminate an instance YOU WILL STILL BE CHARGED for any hour that it ran
Dedicated Instances are designed to meet regulatory requirements
- When you have strict server-bound licensing that won't support multi-tenancy or cloud deployments you use Dedicated Hosts.
Types:
Multi-tenant:
- when customers are running workloads on the same hardware
- virtual isolation is what separates customers
Single-tenant:
- when a single customer has dedicated hardware
- physical isolation is what separates customers
Dedicated can be offered for:
- On-demand
- Reserved (up to 60% savings)
- Spot (up to 90% savings)
You choose tenancy when you launch your EC2
Enterprises and Large Organisations may have security concerns or obligations about sharing the same hardware with other AWS Customers
Savings Plan offers you the similar discounts as Reserved Instances (RIs) but simplifies the purchasing process
There are 3 types of Savings Plans:
- Compute Savings Plans
- EC2 Instance Savings Plans
- SageMaker Savings Plans
You can choose 2 different terms:
- 1 year
- 3 year
You can choose the following Payment Options:
- All Upfront
- Partial Upfront
- No Upfront
(You can choose an hourly commitment)
AWS Savings Plan has 3 different savings types:
Compute
- Compute Savings Plans provide the most flexibility and help to reduce your costs by up to 66%
- These plans automatically apply to EC2 instance usage, AWS Fargate, and AWS Lambda service usage regardless of instance family, size, AZ, region, OS or tenancy
EC2 Instances
- Provide the lowest prices, offering savings up to 72% in exchange for commitment to usage of individual instance families in a region
- Auto reduces your cost on the selected instance family in that region regardless of AZ, size, OS or tenancy
- Give you the flexibility to change your usage between instances within a family in that region
SageMaker
- Helps you reduce SageMaker costs by up to 64%
- Auto applies to SageMaker usage regardless of instance family, size, component, or AWS region
The Zero Trust Model operates on the principle of "trust no one, verify everything"
Malicious actors being able to bypass conventional access controls demonstrates traditional security measures are no longer sufficient
In the Zero Trust Model Identity becomes the primary security perimeter
What is the Primary Security Perimeter?
- The primary or new security perimeter defines the first line of defense and its security controls that protect a company's cloud resources and assets
Network-Centric (Old Way):
- Traditional security focused on firewalls and VPNs since there were few employees or workstations outside the office or they were in specific remote offices
Identity-Centric (New Way):
- Bring-your-own-device and remote workstations are much more common
- Cannot trust if the employee is in a secure location (identity based security controls like MFA, or providing provisional access based on the level of risk from where, when and what a user wants to access)
Identity-Centric does not replace but *augments* Network-Centric security
Identity Security Controls you can implement on AWS to meet the Zero Trust Model
AWS Identity and Access Management (IAM)
- IAM Policies
- Permission Boundaries
- Service Control Policies (Organisation-wide Policies)
- IAM Policy Conditions
- aws:SourceIp -> restrict on IP Address
- aws:RequestedRegion -> restrict on region
- aws:MultiFactorAuthPresent -> restrict if MFA is turned off
- aws:CurrentTime -> restrict access based on time of day
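To make the condition keys concrete, here is a hypothetical identity-based policy built as a Python dict (the IP range is a placeholder; a sketch, not a recommended production policy):

```python
import json

# Deny every action when the request does NOT come from the
# corporate IP range — one of the condition keys listed above
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyOutsideCorporateIp",
        "Effect": "Deny",
        "Action": "*",
        "Resource": "*",
        "Condition": {"NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
    }],
}
print(json.dumps(policy, indent=2))
```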
AWS does not have ready-to-use identity controls that are intelligent, which is why AWS is considered to not have a true Zero Trust offering for customers, and 3rd-party services need to be used
A collection of AWS Services can be set up for intelligent-ish detection of identity concerns, but it requires expert knowledge:
1. **AWS CloudTrail**: tracks all API calls
2. **AWS GuardDuty**: detects suspicious or malicious activity based on CloudTrail and other logs
3. **AWS Detective**: used to analyse, investigate and quickly identify security issues (can ingest findings from GuardDuty)
AWS does technically implement a Zero Trust Model but does not allow for intelligent identity security controls
Third Party Identity solutions:
- Azure Active Directory (Azure AD)
- Google BeyondCorp
- JumpCloud (all have more intelligent security controls for real-time detection)
What is a directory service?
A service that maps the names of network resources to their network addresses
- A directory service is a shared information infrastructure for locating, managing, administering and organising resources
- Is a critical component of a network operating system
- A directory server (name server) is a server which provides a directory service
- Each resource on the network is considered an object by the directory server.
- Information about a particular resource is stored as a collection of attributes associated with that resource or object
To give organisations the ability to manage multiple on-premises infrastructure components and systems using a single identity per user
- A system that creates, maintains and manages identity information for principals and provides authentication services to applications within a federation or distributed network
- A trusted provider of your user identity that lets you use authentication to access other services
Federated identity is a method of linking a user's identity across multiple separate identity management systems
(examples - OpenID, OAuth2.0, SAML: Single-Sign-On via web browser)
Single sign-on is an authentication scheme that allows a user to log in with a single ID and password to different systems and software
SSO allows IT departments to administrate a single identity that can access many machines and cloud services
Login for SSO is seamless: once a user is logged into their primary directory, they are not presented with a login screen when they utilise other software
LDAP (Lightweight Directory Access Protocol) is an open, vendor-neutral, industry-standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network
- A common use of LDAP is to provide a central place to store usernames and passwords
- LDAP enables same sign-on. Same sign-on allows users to use a single ID and password, but they have to enter it in every time they want to log in
Why use LDAP when SSO is more convenient?
- Most SSO systems are using LDAP
- LDAP was not designed natively to work with web applications
- Some systems only support integration with LDAP and not SSO
What is a Security Key?
A secondary device used as a second step in the authentication process to gain access to a device, workstation or application (example - YubiKey)
With AWS Identity and Access Management (IAM) you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources
IAM Policies: JSON documents which grant permissions for a specific user, group, or role to access services. Policies are attached to IAM Identities
IAM Permission: The API actions that can or cannot be performed. They are represented in the IAM Policy document
IAM Users: End users who log into the console or interact with AWS resources programmatically or via clicking UI interfaces
IAM Groups: Group up your users so they all share permission levels of the group (Admins, Developers, etc.)
IAM Roles: Roles grant AWS resources permissions to specific AWS API actions. Associate policies to a role and then assign it to an AWS resource
Principle of Least Privilege (PoLP) is the computer security concept of providing a user, role, or application the least amount of permissions to perform an operation or action
Just-Enough-Access:
- Permitting only the exact actions for the identity to perform a task
Just-In-Time:
- Permitting the smallest length of duration an identity can use permissions
Risk-based adaptive policies:
- Each attempt to access a resource generates a risk score of how likely the request is to be from a compromised source
- The risk score could be based on many factors (eg - device, user location, IP address, what service is being accessed and when)
Administrative Tasks that only the Root User can perform:
- Change your account settings
- Close your AWS account
- Change or Cancel AWS Support Plan
AWS Single Sign-On (AWS SSO) is where you create, or connect, your workforce identities in AWS once and manage access centrally across your AWS organisation
What is Application Integration?
Application Integration is the process of letting two independent applications communicate and work with each other, commonly facilitated by an intermediate system
Cloud workloads encourage systems and services to be loosely coupled, so AWS has many services for the specific purpose of application integration
Common systems or design patterns utilised for Application Integration generally are:
- Queueing
- Streaming
- Pub/Sub
- API Gateways
- State Machine
- Event Bus
What is a Messaging System?
- Used to provide async communication and decouple processes via messages / events
- From a sender and receiver (producer and consumer)
What is a Queueing System?
- A messaging system that generally will delete messages once they are consumed
- Simple communication. Not Real-time
- Have to pull. Not reactive
Simple Queue Service (SQS): a fully managed queueing service that enables you to decouple and scale microservices, distributed systems, and serverless applications
Use Cases:
- You need to queue up transaction emails to be sent (eg - SignUp, Reset Password)
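A minimal boto3 sketch of that queueing pattern (the queue name and message body are placeholders; note the consumer pulls and then deletes):

```python
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = sqs.create_queue(QueueName="signup-emails")["QueueUrl"]

# Producer: queue up a transactional email job
sqs.send_message(QueueUrl=queue_url,
                 MessageBody="send-signup-email:user@example.com")

# Consumer: pull messages (not pushed), then delete once processed
resp = sqs.receive_message(QueueUrl=queue_url,
                           MaxNumberOfMessages=1, WaitTimeSeconds=5)
for msg in resp.get("Messages", []):
    print(msg["Body"])
    sqs.delete_message(QueueUrl=queue_url,
                       ReceiptHandle=msg["ReceiptHandle"])
```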
What is streaming?
- Multiple consumers can react to events (messages)
- Events live in the stream for long periods of time, so complex operations can be applied
- Real-time
Amazon Kinesis: is the fully managed solution for collecting, processing, and analysing streaming data in the cloud
What is Pub/Sub?
- Publish-subscribe pattern commonly implemented in messaging systems
- In a pub/sub system, the sender of messages (publishers) do not send their messages directly to receivers
- They send their messages to an event bus. The event bus categorises their messages into groups
- The receivers of messages (subscribers) subscribe to these groups
- Whenever new messages appear within their subscription the messages are immediately delivered to them
- Publishers have no knowledge of who their subscribers are
- Subscribers do not pull for messages
- Messages are instead automatically and immediately pushed to subscribers
- Messages and events are interchangeable terms in pub/sub
Use Case:
- a real-time chat system
- web-hook system
Simple Notification Service (SNS): is a highly available, secure, fully managed pub/sub messaging service that enables you to decouple microservices, distributed systems, and serverless applications
What is an API Gateway?
- A program that sits between a single-entry point and multiple backends
- Allows for throttling, logging, routing logic or formatting of the request and response
Amazon API Gateway: is a solution for creating secure APIs in your cloud environment at any scale. Create APIs that act as a front door for applications to access data, business logic, or functionality from back-end services
What is a State Machine?
An abstract model which decides how one state moves to another based on a series of conditions. Think of a state machine like a flow chart
What is AWS Step Functions?
- Coordinate multiple AWS Services into a serverless workflow
- A graphical console to visualise the components of your application as a series of steps
- Automatically triggers and tracks each step, and retries when there are errors, so your application executes in order as expected, every time
- Logs the state of each step, so when things go wrong, you can diagnose and debug problems quickly
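To illustrate, here is a hypothetical two-step state machine definition in Amazon States Language, built as a Python dict (the Lambda ARN is a placeholder):

```python
import json

# A workflow: run a Lambda task, retry on failure, then succeed
definition = {
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessOrder",
            "Retry": [{"ErrorEquals": ["States.ALL"], "MaxAttempts": 2}],
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}
print(json.dumps(definition, indent=2))
```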
What is an Event Bus?
Receives events from a source and routes events to a target based on rules
EventBridge is a serverless event bus service that is used for application integration by streaming real-time data to your applications (formerly called Amazon CloudWatch Events)
VMs:
- VMs do not make best use of space
- Apps are not isolated, which could cause config conflicts, security problems or resource hogging
Containers:
- Containers allow you to run multiple apps which are virtually isolated from each other
- Launch new containers and configure OS dependencies per container
AWS Organisations allows the creation of new AWS accounts. Centrally manage billing, control access, compliance, security, and share resources across your AWS accounts
Root Account User: is a single sign-in identity that has complete access to all AWS services and resources in an account. Each account has a Root Account User
Organisational Units: are a group of AWS accounts within an organisation which can also contain other organisational units - creating a hierarchy
Service Control Policies: give central control over the allowed permissions for all accounts in your organisation, helping to ensure your accounts stay within your organisation's guidelines
- AWS Organisations must be turned on, and once turned on it cannot be turned off
- You can create as many AWS Accounts as you like, one account will be the Master/Root Account
An AWS Account is not the same as a User Account
- AWS Control Tower helps enterprises quickly set up a secure AWS multi-account environment
- Provides you with a baseline environment to get started with a multi-account architecture
Landing Zone
- A landing zone is a baseline environment following well-architected and best practices to start launching production-ready workloads
- AWS SSO enabled, centralised logging for AWS CloudTrail, cross-account security auditing
Account Factory
- Automates provisioning of new accounts in your organisation
- Standardise the provisioning of new accounts with pre-approved network configuration and region selections
Guardrails
- Pre-packaged governance rules for security, operations, and compliance that customers can select and apply enterprise-wide or to specific groups of accounts
AWS Control Tower is the replacement for retired AWS Landing Zones
AWS Config is a Compliance-as-Code framework that allows you to manage change in your AWS accounts on a per-region basis
When should you use AWS Config?
- I want this resource to stay configured a specific way for compliance
- I want to keep track of configuration changes to resources
- I want a list of all resources within a region
- I want to analyse potential security weaknesses, you need detailed historical information
What is Change management?
Change management in the context of Cloud Infrastructure is when we have formal process to:
- Monitor changes
- Enforce changes
- Remediate changes
AWS Quick Starts are prebuilt templates by AWS and AWS partners to help deploy a wide range of stacks
Reduce hundreds of manual procedures into just a few steps
A Quick Start is composed of 3 parts:
- A reference architecture for the deployment
- AWS CloudFormation templates that automate and configure the deployment
- A deployment guide explaining the architecture and implementation in detail
A tag is a key and value pair that you can assign to AWS resources
Tags allow you to organise your resources by:
Resource management
- Specific workloads, environments (eg - Developer Environments)
Cost management and optimisation
- Cost tracking, Budgets, Alerts
Operations management
- Business commitments and SLA operations (eg - Mission Critical Services)
Security
- Classification of data and security impact
Governance and regulatory compliance
Automation
Workload optimisation
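A small boto3 sketch of tagging (the instance ID and tag values are placeholders):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Tag an EC2 instance so it can be grouped, cost-tracked and
# filtered later; tags are just key/value pairs
ec2.create_tags(
    Resources=["i-1234567890abcdef0"],  # placeholder instance ID
    Tags=[
        {"Key": "Environment", "Value": "Development"},
        {"Key": "CostCenter", "Value": "team-web"},
    ],
)
```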
Resource Groups are a collection of resources that share one or more tags
Helps you organise and consolidate information based on your project and the resources that you use
Resource Groups can display details about a group of resources based on:
- Metrics
- Alarms
- Configuration Settings
At any time you can modify the settings of your resource groups to change what resources appear
Resource Groups appears in the Global Console Header and under Systems Manager
What is provisioning?
- The allocation or creation of resources and services to a customer
- AWS Provisioning Services are responsible for setting up and then managing those AWS Services
Elastic Beanstalk: Platform as a Service (PaaS) to easily deploy web-applications
AWS OpsWorks: Configuration management service
CloudFormation: Infrastructure modeling and provisioning service
AWS QuickStarts: Pre-made packages that can launch and configure your AWS compute, network, storage and other services required to deploy a workload on AWS
AWS Marketplace: digital catalogue of thousands of software listings from independent software vendors you can use to find, buy, test and deploy software
AWS Amplify: Mobile and web-application framework for serverless computing
AWS App Runner: Fully managed service that makes it easy for devs to quickly deploy containerised web apps and APIs, at scale and with no prior infra experience required
AWS Copilot: CLI that enables customers to quickly launch and easily manage containerised apps on AWS
AWS CodeStar: Provides a unified UI, enabling you to easily manage your software development activities in one place
AWS CDK: An IaC (Infrastructure as Code) tool. Allows you to use your favourite programming language. Generates CloudFormation templates as the means for IaC
Elastic Beanstalk is powered by a CloudFormation template setup for you:
- Elastic Load Balancer
- Autoscaling Groups
- RDS Database
- EC2 Instance preconfigured (or custom) platforms
- Monitoring (CloudWatch, SNS)
- In-Place and Blue/Green deployment methodologies
- Security (Rotates passwords)
- Can run Dockerised environments
Serverless Services
- When the underlying servers, infrastructure and Operating System is taken care of by the Cloud Service Provider (CSP).
- Serverless is generally by default highly available, scalable and cost-effective. You pay for what you use
- DynamoDB: NoSQL key/value and document database
- Simple Storage Service (S3): serverless object storage service
- ECS Fargate: serverless orchestration container service
- AWS Lambda: serverless functions service
- Step Functions: state machine service
- Aurora Serverless: serverless on-demand version of Aurora
- Serverless architecture generally describes fully managed cloud services
- The classification of a cloud service being serverless is not a Boolean answer
A serverless service could have all or most of the following characteristics:
- Highly elastic and scalable
- Highly available
- Highly Durable
- Secure by default
- Abstracts away the underlying infrastructure and are billed based on the execution of your business task
- Serverless can Scale-to-Zero meaning when not in use the serverless resources cost nothing
Pay-for-Value (you don't pay for idle servers)
(Some services are more serverless than others)
AWS has multiple cloud services and tools that make it easy for you to run Windows workloads on AWS
- Windows Servers on EC2
- SQL Server on RDS
- AWS Directory Service
- AWS License Manager
- Amazon FSx for Windows File Server
- Amazon WorkSpaces
- AWS Lambda
- AWS Migration Acceleration Program (MAP)
What is Bring-Your-Own-License? (BYOL)
- The process of reusing an existing software license to run vendor software on a cloud vendor's computing service
- BYOL allows companies to save money since they may have purchased the license in bulk or at a time that provided a greater discount than if purchased again
AWS License Manager is a service that makes it easier for you to manage your software licenses from software vendors centrally across AWS and your on-premises environments
AWS License Manager works with software that is licensed based on virtual cores (vCPUs), physical cores, sockets, or number of machines
AWS License Manager works with:
- EC2 - Dedicated Instances, Dedicated Hosts, Spot Instances
- RDS - (Only for Oracle databases)
For Microsoft Windows Server and Microsoft SQL Server licenses you generally need to use a Dedicated Host
CloudTrail - logs all API calls (SDK, CLI) between AWS services (who can we blame)
- Detect developer misconfiguration
- Detect malicious actors
- Automate responses
CloudWatch is a collection of multiple services
- CloudWatch Logs - A centralised place to store your cloud services log data or application logs
- CloudWatch Metrics - Represents a time-ordered set of data points. A variable to monitor
- CloudWatch Events (EventBridge) - Trigger an event based on a condition (eg - Every hour take snapshot of server)
- CloudWatch Alarms - Triggers notifications based on metrics
- CloudWatch Dashboard - Create visualisations based on metrics
*AWS X-Ray
- Is a *distributed tracing system
- You can use it to pinpoint issues with your microservices
- See how data moves from one app to another, how long it took to move, and if it failed to move forward
AWS CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account
- AWS CloudTrail is used to monitor API calls and actions made on an AWS account
- Easily identify which users and accounts made the call to AWS:
  - Where - Source IP Address
  - When - EventTime
  - Who - User, UserAgent
  - What - Region, Resource, Action
- CloudTrail is already logging by default and will collect logs for the last 90 days via Event History
- If you need more than 90 days you need to create a Trail
- Trails are output to S3 and do not have a GUI like Event History
- To analyse a Trail you'd have to use Amazon Athena
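A hedged boto3 sketch of querying the 90-day Event History for one user's API calls (the username is a placeholder):

```python
import boto3

cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

# Look up recent API calls made by a specific user
resp = cloudtrail.lookup_events(
    LookupAttributes=[{"AttributeKey": "Username",
                       "AttributeValue": "Bob"}],  # placeholder
    MaxResults=10,
)
for event in resp["Events"]:
    print(event["EventTime"], event["EventName"])  # when + what
```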
A CloudWatch Alarm monitors a CloudWatch Metric based on a defined threshold
When alarm breaches (goes outside the defined threshold) then it changes state
Metric Alarm States:
- OK
- The metric expression is within the defined threshold
- ALARM
- The metric expression is outside of the defined threshold
- INSUFFICIENT_DATA
- The alarm just started
- The metric is not available
- Not enough data is available
When it changes state we can define what action it should trigger:
- Notification
- Auto Scaling Group
- EC2 Action
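As a sketch of how an alarm ties a metric, threshold and action together (the instance ID and SNS topic ARN are placeholders):

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Alarm when average CPU of one instance exceeds 80% for two
# consecutive 5-minute periods; notify an SNS topic
cloudwatch.put_metric_alarm(
    AlarmName="high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-1234567890abcdef0"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)
```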
Anatomy diagram of an Alarm
Log Streams
A log stream represents a sequence of events from an application or instance being monitored
Log Events
Represents a single event in a log file. Log events can be seen within a Log Stream
CloudWatch Logs Insights enables you to interactively search and analyse your CloudWatch log data and has the following advantages:
- More robust filtering than using the simple Filter events in a Log Stream
- Less burdensome than having to export logs to S3 and analyse them via Athena
- CloudWatch Logs Insights supports all types of logs
- CloudWatch Logs Insights is commonly used via the console to run ad-hoc queries against log groups
- CloudWatch Insights has its own language called: CloudWatch Logs Insights Query Syntax
- A single request can query up to 20 log groups
- Queries time out after 15 minutes, if they have not completed
- Query results are available for 7 days
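A minimal boto3 sketch of running an Insights query (the log group name is a placeholder):

```python
# Run a CloudWatch Logs Insights query over the last hour and poll
# for the results. The log group name is a placeholder.
import time
import boto3

logs = boto3.client('logs')

query = """
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20
"""

start = logs.start_query(
    logGroupName='/aws/lambda/my-function',
    startTime=int(time.time()) - 3600,
    endTime=int(time.time()),
    queryString=query,
)

# Poll until the query completes (queries time out after 15 minutes)
result = logs.get_query_results(queryId=start['queryId'])
while result['status'] in ('Scheduled', 'Running'):
    time.sleep(1)
    result = logs.get_query_results(queryId=start['queryId'])

for row in result['results']:
    print(row)
```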
- A CloudWatch Metric represents a time-ordered set of data points
- It's a variable that is monitored over time
CloudWatch comes with many predefined metrics that are generally namespaced by AWS Service
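You can also publish custom metrics under your own namespace. A minimal boto3 sketch (the namespace and metric name are illustrative):

```python
# Publish one data point for a custom metric. Custom metrics are
# namespaced just like the predefined AWS service metrics.
import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_data(
    Namespace='MyApp',
    MetricData=[{
        'MetricName': 'OrdersProcessed',
        'Value': 42,
        'Unit': 'Count',
        'Dimensions': [{'Name': 'Environment', 'Value': 'production'}],
    }],
)
```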
Amazon SageMaker is a fully managed service to build, train, and deploy machine learning models at scale
- Apache MXNet on AWS: open source deep learning framework
- TensorFlow on AWS: open source machine intelligence framework
- PyTorch on AWS: open source machine learning framework
Amazon SageMaker Ground Truth is a data-labeling service.
Amazon Augmented AI is a human-intervention review service.
Amazon CodeGuru: machine learning code analysis service
Amazon Lex: conversational interface service
Amazon Personalise: real-time recommendations
Amazon Polly: text-to-speech
Amazon Rekognition: image and video recognition service
Amazon Transcribe: speech-to-text service
Amazon Textract: OCR (extract text from scanned documents) service
Amazon Translate: neural machine translation service
Amazon Comprehend: natural language processing (NLP) service (see the sketch after this list)
Amazon Forecast: time-series forecasting service
AWS Deep Learning AMIs: EC2 machine images pre-installed with popular deep learning frameworks
AWS Deep Learning Containers: Docker images pre-installed with deep learning frameworks and interfaces
Amazon DeepComposer: machine learning enabled musical keyboard
Amazon DeepLens: video-camera that uses deep learning
Amazon DeepRacer: toy race car, autonomous driving
Amazon Elastic Inference: allows you to attach low-cost GPU-powered acceleration to EC2 instances to reduce the cost of running deep learning inference by up to 75%
Amazon Fraud Detector: fully managed fraud detection as a service
Amazon Kendra: enterprise machine learning search engine service
Amazon Bedrock: a managed service providing large language models (LLMs) to generate text and images (eg - similar to ChatGPT)
Amazon CodeWhisperer: an AI code generator that predicts code for your use case
Amazon DevOps Guru: uses ML to analyse your operational data and application metrics and events to detect operational abnormalities
Amazon Lookout (for Equipment / Metrics / Vision): uses ML models for quality control and to perform automated inspections
Amazon Monitron: uses ML models to predict unplanned equipment downtime. Monitron uses IoT sensors that capture vibration and other sensor data
AWS Neuron: an SDK used to run deep learning workloads on AWS Inferentia and AWS Trainium based instances
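As a rough illustration of how these services are consumed, a boto3 sketch calling two of them (Comprehend and Translate); the input text is arbitrary:

```python
# Call two of the ML services above through their boto3 clients.
import boto3

comprehend = boto3.client('comprehend')
translate = boto3.client('translate')

# Sentiment analysis with Comprehend (NLP)
sentiment = comprehend.detect_sentiment(
    Text='I love how fast this service is!',
    LanguageCode='en',
)
print(sentiment['Sentiment'])  # e.g. POSITIVE

# Neural machine translation with Translate
result = translate.translate_text(
    Text='Hello, world',
    SourceLanguageCode='en',
    TargetLanguageCode='es',
)
print(result['TranslatedText'])
```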
What is Big Data?
A term used to describe massive volumes of structured/unstructured data that is so large it is difficult to move and process using traditional database and software techniques
Amazon Athena: serverless interactive query service (see the sketch after this list)
Amazon CloudSearch: full-text search service
Amazon Elasticsearch Service (ES): managed Elasticsearch cluster
Amazon Elastic MapReduce (EMR): for data processing and analysis
Kinesis Data Streams: real-time streaming data service
Kinesis Firehose: serverless and simpler version of Data Streams
Amazon Kinesis Data Analytics: allows you to run queries against data that is flowing through your real-time stream so you can create reports and analysis on emerging data
Amazon Kinesis Video Streams: allows you to analyse or apply processing on real-time streaming video
Amazon Managed Streaming for Apache Kafka (MSK): fully managed Apache Kafka service
Amazon Redshift: a petabyte-scale data warehouse
Amazon QuickSight: is a business intelligence (BI) dashboard
AWS Data Pipeline: automates the movement of data
AWS Glue: extract, transform, load (ETL) service
AWS Lake Formation: centralised, curated, and secured repository that stores all your data
AWS Data Exchange: a catalogue of third-party datasets
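A minimal boto3 sketch of Athena's serverless query model (the database, table, and output bucket are placeholders):

```python
# Start a serverless SQL query with Athena; results land in S3.
import boto3

athena = boto3.client('athena')

execution = athena.start_query_execution(
    QueryString='SELECT eventname, count(*) FROM cloudtrail_logs GROUP BY eventname',
    QueryExecutionContext={'Database': 'my_database'},
    ResultConfiguration={'OutputLocation': 's3://my-athena-results/'},
)
print(execution['QueryExecutionId'])  # poll with get_query_execution
```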
Amazon QuickSight is a Business Intelligence (BI) Dashboard that allows you to ingest data from various AWS storage or database services to quickly visualise business data with minimal programming or data formula knowledge
QuickSight uses SPICE (super-fast, parallel, in-memory, calculation engine) to achieve blazing fast performance at scale
- Amazon QuickSight ML Insights: Detect Anomalies, Perform accurate forecasting, Generate Natural Language Narratives
- Amazon QuickSight Q: Ask question using natural language, on all your data, and receive answers in seconds
Apache MXNet: adopted by AWS, supports both imperative and symbolic programming
PyTorch: optimised tensor library for deep learning using GPUs and CPU (created by Facebook)
TensorFlow: low-level machine learning framework (created by Google)
Apache Spark: unified analytics engine for large-scale data processing
Chainer: powerful, flexible and intuitive deep learning framework, supports CUDA
Hugging Face: an AI community of ML models and datasets
- Intel is a multinational corporation and is one of the world's largest semiconductor chip manufacturers.
- Intel is the inventor of the x86 instruction set
There is another popular instruction set called ARM which uses fewer instructions and usually results in better power efficiency and lower costs
Intel Xeon Scalable Processor
- The Intel Xeon Scalable Processor is a high performant CPU designed for enterprise and server applications, commonly used in AWS instances
Intel Habana Gaudi
- AI training processor developed by Habana Labs, a company acquired by Intel
- Tailored for training deep learning models
- Often viewed as a competitor to NVIDIA's GPUs
- Offers a specialised alternative that's optimised specifically for AI training
The Well-Architected Framework pillars (business value):
- Operational Excellence Pillar - Run and monitor systems
- Security Pillar - Protect data and systems, mitigate risk
- Reliability Pillar - Mitigate and recover from disruptions
- Performance Efficiency Pillar - Use computing resources effectively
- Cost Optimisation Pillar - Get the lowest price
Component - Code, Configuration and AWS Resource against a requirement
Workload - A set of components that work together to deliver business value
Milestones - Key changes of your architecture through product life cycle
Architecture - How components work together in a workload
Technology Portfolio - A collection of workloads required for the business to operate
- The AWS Well-Architected Framework is designed around a different kind of team structure than a single centralised one
- AWS has a distributed team structure
- Distributed teams can come with new risks; AWS mitigates these with Practices, Mechanisms, and Leadership Principles
The Amazon Leadership Principles are a set of principles used during company decision-making, problem solving, brainstorming, and hiring
A Pillar of the Well-Architected Framework is structured as follows:
- Design Principles: A list of design principles that need to be considered during implementation
- Definition: Overview of the best practice categories
- Best Practices: Detailed information about each best practice with AWS Services
- Resources: Additional documentation, whitepapers and videos to implement this pillar
Perform operations as code
Apply the same engineering discipline you would apply to application code to your cloud infrastructure
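A minimal sketch of the idea, using boto3 to provision infrastructure from a reviewable template rather than console clicks (the stack name and template are illustrative):

```python
# "Operations as code": infrastructure is defined in a template that
# can be versioned, reviewed, and rolled back like application code.
import boto3

TEMPLATE = """
Resources:
  LogsBucket:
    Type: AWS::S3::Bucket
"""

cloudformation = boto3.client('cloudformation')
cloudformation.create_stack(StackName='ops-as-code-demo', TemplateBody=TEMPLATE)
```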
Make frequent, small, reversible changes
Design workloads to allow components to be updated regularly
Refine operations procedures frequently
Look for continuous opportunities to improve your operations
Anticipate failure
Perform post-mortems on system failures to improve, write test code, and kill production servers to test recovery
Learn from all operational failures
Share lessons learned in a knowledge base for operational events and failures across your entire organisation
Implement a strong identity foundation
Implement Principle of Least Privilege (PoLP).
Use Centralised identity.
Avoid long-lived credentials.
Enable traceability
Monitor, alert, and audit actions and changes to your environment in real-time
Integrate log and metric collection and automate investigation and remediation
Apply security at all layers
Take a Defense in Depth approach with multiple security controls at every layer (eg - Network, VPC, Load Balancer, Instances, OS, Application Code)
Automate security best practices
Protect data in transit and at rest
Keep people away from data
Prepare for security events
Incident management systems and investigation policy and processes.
Tools to detect, investigate, and recover from incidents.
Automatically recover from failure
Monitor Key Performance Indicators (KPIs) and trigger automation when threshold is breached
Test recovery procedures
Test how your workload fails and validate your recovery procedures
You can use automation to simulate different failures or to recreate scenarios that led to failures before
Scale horizontally to increase aggregate system availability
Replace one large resource with multiple small resources to reduce the impact of a single failure on the overall workload
Distribute requests across multiple, smaller resources to ensure that they don't share a common point of failure.
Stop guessing capacity
On-premise, it takes a lot of guesswork to determine the elasticity of your workload demands
With cloud, you don't need to guess how much you need because you can request right-sized resources on-demand
Manage changes in automation
Making changes via Infrastructure as Code allows for a formal process to track and review changes to infrastructure
Democratise advanced technologies
Go global in minutes
Use serverless architecture
Experiment more often
Consider mechanical sympathy
Implement Cloud Financial Management
Adopt a consumption model
Measure overall efficiency
Stop spending money on undifferentiated heavy lifting
Analyse and attribute expenditure
The Well-Architected Tool is an auditing tool used to assess your cloud workloads for alignment with the AWS Well-Architected Framework
(It's essentially a checklist, with nearby references, to help you assemble a report to share with executives and key stakeholders)
The AWS Architecture Center is a web-portal that contains best practices and reference architectures for a variety of different workloads
What is the Total Cost of Ownership? (TCO)
- TCO is a financial estimate intended to help buyers and owners determine the direct and indirect costs of a product or service
- Creating a TCO report is useful when your company is looking to migrate from on-premise to cloud
Capital Expenditure (CAPEX)
- Spending money upfront on physical infrastructure
- Deducting that expense from your tax bill over time
With Capital Expenses you have to guess upfront what you plan to spend
Operational Expenditure (OPEX)
- The costs associated with running a workload where the physical infrastructure cost has shifted to the service provider
- The customer only has to be concerned with non-physical costs
With Operational Expenses you can try a product or service without investing in equipment
The AWS Pricing Calculator is a free cost estimate tool that can be used within your web browser without the need for an AWS Account to estimate the cost of various AWS services
- The AWS Pricing Calculator contains 100+ services that you can configure for cost estimates
- To calculate TCO an organisation needs to compare their existing costs against AWS costs, and the AWS Pricing Calculator can be used to determine that cost (a worked sketch follows below)
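A hypothetical back-of-the-envelope comparison (all figures are made up for illustration):

```python
# Compare a 3-year on-premise TCO (CAPEX + ongoing OPEX) against a
# pay-as-you-go cloud equivalent. All numbers are hypothetical.
capex_servers = 120_000      # upfront hardware purchase
onprem_monthly_ops = 3_000   # power, cooling, datacenter staff
cloud_monthly = 5_500        # on-demand equivalent workload

months = 36
onprem_tco = capex_servers + onprem_monthly_ops * months  # 228,000
cloud_tco = cloud_monthly * months                        # 198,000

print(f'On-premise 3-year TCO: ${onprem_tco:,}')
print(f'Cloud 3-year TCO:      ${cloud_tco:,}')
```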
AWS Migration Evaluator is an estimate tool used to determine an organisation's existing on-premise costs so it can compare them against AWS costs for a planned cloud migration
- Migration Evaluator uses an Agentless Collector to collect data from your on-premise infrastructure to extract your on-premise costs
VM Import/Export allows users to import Virtual Machine images into EC2
- Prepare your Virtual Image for upload
- Upload your Virtual Image to S3
- Use the AWS CLI to import your image - It will generate an Amazon Machine Image (AMI)
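A minimal boto3 sketch of the import step (the bucket, key, and disk format are placeholders):

```python
# Import a virtual machine image that was uploaded to S3; the task
# produces an Amazon Machine Image (AMI) when it completes.
import boto3

ec2 = boto3.client('ec2')

task = ec2.import_image(
    Description='Imported web server',
    DiskContainers=[{
        'Format': 'vhd',  # OVA, VMDK, VHD, and RAW are supported
        'UserBucket': {'S3Bucket': 'my-vm-images', 'S3Key': 'webserver.vhd'},
    }],
)
print(task['ImportTaskId'])  # poll with describe_import_image_tasks
```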
- AWS Database Migration Service (DMS) allows you to quickly and securely migrate one database to another
- DMS can be used to migrate your on-premise database to AWS
AWS Schema Conversion Tool is used in many cases to automatically convert a source database schema to a target database schema
The AWS Cloud Adoption Framework is a whitepaper to help you plan your migration from on-premise to AWS
At the highest level, the AWS CAF organises guidance into six focus areas:
- Business Perspective
- People Perspective
- Governance Perspective
- Platform Perspective
- Security Perspective
- Operations Perspective
A Technical Account Manager provides both proactive guidance and reactive support to help you succeed with your AWS journey
(TAMs follow the Amazon Leadership Principles, especially about being Customer obsessed)
TAMs are only available at the Enterprise Support tier
- AWS Marketplace is a curated digital catalogue with thousands of software listings from independent software vendors
- Easily find, buy, test and deploy software that already runs on AWS
- The product can be free to use or can have an associated charge. The charge becomes part of your AWS bill, and once you pay, AWS Marketplace pays the provider
- The sales channel for ISVs and Consulting Partners allows you to sell your solutions to other AWS customers
- A feature of AWS Organisations that allows you to pay for multiple AWS accounts with one bill
- You can designate one master account that pays the charges of all the other member accounts
- Use Cost Explorer to visualise usage for consolidated billing
- You can combine the usage across all accounts in the organisation to share the volume pricing discounts
- AWS has Volume Discounts for many services
- The more you use, the more you save
- Consolidated Billing lets you take advantage of Volume Discounts
- Consolidated Billing is a feature of AWS Organisations
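A hypothetical illustration of how combining usage unlocks a cheaper pricing tier (the prices and tiers are made up, not actual AWS rates):

```python
# Tiered volume pricing: two accounts' usage is combined under
# consolidated billing so more usage falls into the cheaper tier.
TIER_1_LIMIT_GB = 50_000   # first 50 TB at the higher rate (hypothetical)
TIER_1_PRICE = 0.023       # $/GB (hypothetical)
TIER_2_PRICE = 0.021       # $/GB beyond the first tier (hypothetical)

def monthly_cost(usage_gb: int) -> float:
    tier1 = min(usage_gb, TIER_1_LIMIT_GB)
    tier2 = max(usage_gb - TIER_1_LIMIT_GB, 0)
    return tier1 * TIER_1_PRICE + tier2 * TIER_2_PRICE

# Billed separately: neither account reaches the cheaper tier
separate = monthly_cost(30_000) + monthly_cost(40_000)

# Consolidated: 70,000 GB combined, 20,000 GB gets the tier-2 price
combined = monthly_cost(70_000)

print(f'Separate bills: ${separate:,.2f}')   # $1,610.00
print(f'Consolidated:   ${combined:,.2f}')   # $1,570.00
```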
AWS Trusted Advisor is a recommendation tool which automatically and actively monitors your AWS account to provide actionable recommendations across a series of categories
Think of AWS Trusted Advisor like an automated checklist of best practices on AWS (a minimal API sketch follows the category examples below)
Cost Optimisation
- Idle Load Balancers
- Unassociated Elastic IP Addresses
Performance
- High Utilisation Amazon EC2 Instances
Security
- MFA on Root Account
- IAM Access Key Rotation
Fault Tolerance
- Amazon RDS Backups
Service Limits
- VPC
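A minimal boto3 sketch of listing the checks via the Support API (this requires a Business or Enterprise support plan; the Support API lives in us-east-1):

```python
# List Trusted Advisor checks grouped by category.
import boto3

support = boto3.client('support', region_name='us-east-1')

checks = support.describe_trusted_advisor_checks(language='en')
for check in checks['checks']:
    print(check['category'], '-', check['name'])
```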
What is a Service Level Agreement (SLA)?
A formal commitment to an expected level of service (with remedies such as financial or service credits if it is not met)
What is a Service Level Indicator (SLI)?
A metric/measurement
What is a Service Level Objective (SLO)?
A target percentage
- DynamoDB SLA
- Compute SLAs
- RDS SLA
The Service Health Dashboard shows the general status of AWS services
The AWS Personal Health Dashboard provides alerts and guidance for AWS events that might affect your environment
- All customers can access the Personal Health Dashboard
- Shows recent events to help you manage active events, and shows proactive notifications so that you can plan for scheduled activities
- Use these alerts to get notified about changes that can affect your AWS resources, and then follow the guidance to diagnose and resolve issues
AWS Trust & Safety is a team that specifically deals with abuses occurring on the AWS platform, for the following issues:
- Spam
- Port scanning
- Denial-of-service (DoS) attacks
- Intrusion attempts
- Hosting prohibited content
- Distributing malware
AWS Support does not deal with Abuse tickets. You need to contact abuse@amazonaws.com or fill out the Report Amazon AWS abuse form