Infrastructure Concepts
Revision as of 00:36, 1 January 2022
External
- http://www.infrastructures.org
- Infrastructure as Code: Dynamic Systems for the Cloud Age by Kief Morris
Internal
Overview
In a cloud environment, infrastructure, as viewed by the user, is no longer represented by hardware, but by virtual constructs like servers, subnets and block devices. The hardware still exists, but infrastructure elements accessible to users "float" across it, can be manipulated by the infrastructure platform APIs and can be created, duplicated, changed and destroyed at will. They are referred to as infrastructure resources. Infrastructure resources can be instantiated, changed and destroyed by infrastructure as code to provide the infrastructure foundation for application runtimes and applications.
Application
Applications and services provide domain-specific capabilities to organizations and users. They exist in the form of application packages, container instances or serverless code. The underlying layers (application runtime, infrastructure platform) exist to enable this layer. Applications can be directly offered to users as part of a cloud service delivery model, under the generic name of Software-as-a-Service (SaaS). The users do not need to manage anything, but they also do not control anything, including the design of the application. This works well, unless the customer needs functionality that is not available in the application. Examples: Intuit QuickBooks, salesforce.com and Spark-based batch services aimed at data scientists.
Application Runtime
The application runtime layer provides application runtime services and capabilities to the application layer. It consists of container clusters, serverless execution environments, application servers, messaging systems, databases and operating systems. Services like databases can be considered application runtime services, belonging to the application runtime layer, but at the same time, they can be seen as composite resources exposed by the infrastructure platform: an application runtime is laid upon the infrastructure platform layer and it is assembled from infrastructure resources. This layer is also referred to as Platform as a Service (PaaS) or "cloud application platform" and can be exposed directly to users, as is the case for EKS, AKS, OpenShift, GKE, etc.
Application Runtime Service
The line between application runtime services, living in the application runtime layer, and composite resources, living in the infrastructure platform layer, is blurred.
Infrastructure Platform
An infrastructure platform is a set of infrastructure resources and the tools and services that manage them. The infrastructure platform lets its users provision compute, storage, networking and other infrastructure resources on which they can run operating systems and applications. Services like databases can be categorized as application runtime services belonging to the application runtime layer, or as composite resources exposed by the infrastructure platform. The infrastructure resources are managed dynamically via an API, which is a defining characteristic of a cloud. This is also the essential element that allows expressing the resources as code. The infrastructure platform abstracts out the hardware and the virtualization layers. The service users do not control the underlying hardware resources, except for select networking configuration or perhaps the physical location of the resources at a gross geographical level. On the other hand, they control, but must also install and manage, the operating system, the middleware and the application code. Infrastructure platforms are also known as Infrastructure as a Service (IaaS). Infrastructure as a Service is targeted at teams that are building out infrastructure. Examples of public cloud platforms: AWS, Azure, GCP, Oracle Cloud. Examples of private cloud platforms: OpenStack, VMware vCloud.
Infrastructure Resources
There are three essential resources provided by an infrastructure platform: compute, network and storage. Sometimes these resources are referred to as "primitives". An infrastructure platform abstracts infrastructure resources from physical hardware. Infrastructure resources are assembled to provide application runtime instances. Infrastructure resources can be expressed as code and grouped in stacks.
Compute
A compute resource executes code. In essence, any compute resource is backed by physical server CPU cores, but the infrastructure platform exposes them in more useful ways: physical servers, virtual machines, server clusters, containers, application hosting clusters and serverless runtimes.
Physical Servers
The infrastructure platform may provision and expose physical servers on demand. These are also called "bare metal".
Virtual Machines
Virtual machines are exposed by hypervisors that run on top of a pool of physical servers, managed by the infrastructure platform.
Server Clusters
A server cluster is a pool of server instances, either virtual machines or physical servers, that the infrastructure platform provisions and manages as a group. These are called auto-scaling groups on AWS, virtual machine scale sets on Azure and managed instance groups on GCP.
Containers
Some infrastructure platforms offer Container as a Service infrastructure, which allows deploying and running container instances.
Container Clusters
A container cluster, also referred to as application hosting cluster, is a pool of servers onto which the infrastructure platform deploys and manages multiple applications. A container cluster should not be confused with a PaaS. The container cluster manages provisioning of compute resources for applications, which is one of the core functions of a PaaS, but a PaaS provides a variety of services beyond compute. Examples of container clusters are Amazon Elastic Container Service (ECS), Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS) and Google Kubernetes Engine (GKE). A container cluster can be deployed and managed on an on-premises infrastructure platform using a supported Kubernetes release such as OpenShift.
Application Servers
Application Server Clusters
Serverless Runtimes
Serverless runtimes, also known as Function as a Service (FaaS) runtimes, execute code on demand, in response to events or schedules. Examples: AWS Lambda, Azure Functions and Google Cloud Functions.
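The event-driven execution model can be illustrated with a minimal sketch. The function name handle_event, the event shape and the invoke helper are assumptions made for illustration only; they do not reproduce the API of any particular FaaS platform. The invoke helper is a local stand-in for the platform, which in a real runtime would call the function once per incoming event or schedule tick.

```python
def handle_event(event, context=None):
    """Hypothetical function entry point, invoked once per event."""
    # The "name" field is an assumed event attribute, not a platform convention.
    name = event.get("name", "world")
    return {"status": 200, "body": f"hello, {name}"}


def invoke(handler, events):
    """Local stand-in for the serverless runtime: one handler call per event."""
    return [handler(event) for event in events]


results = invoke(handle_event, [{"name": "ops"}, {}])
for result in results:
    print(result)
```

The function holds no state between invocations, which is what allows a platform to start and stop instances on demand.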
Network
An infrastructure platform provides the following network primitives: network address blocks, VLANs, routes, etc.
Network Address Blocks
A network address block is a fundamental structure for grouping resources to control routing of traffic between them and isolate them from other resources that do not belong to the group. Network address blocks are known as VPCs in AWS and as virtual networks in Azure and GCP. The top level block is divided into smaller blocks known as VLANs.
VLANs
VLANs are smaller sub-divisions of a network address block. They are known as subnets in AWS.
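The division of a top-level address block into smaller blocks can be sketched with Python's standard ipaddress module. The 10.0.0.0/16 range and the /24 prefix are arbitrary values chosen for illustration; real platforms impose their own limits on block sizes.

```python
import ipaddress

# Hypothetical top-level address block (the CIDR range is an assumption).
address_block = ipaddress.ip_network("10.0.0.0/16")

# Divide the top-level block into smaller blocks, each with a /24 prefix.
subnets = list(address_block.subnets(new_prefix=24))

print(len(subnets))  # how many /24 blocks fit into the /16 block
print(subnets[0], subnets[1])

# Find the subdivision that a given address belongs to.
address = ipaddress.ip_address("10.0.1.17")
owner = next(subnet for subnet in subnets if address in subnet)
print(owner)
```

Because the subdivisions do not overlap, every address in the top-level block belongs to exactly one of them, which is what makes them usable as units for routing and isolation.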
Names
This includes DNS names, which are mapped onto IP addresses.
Routes
A route configures what traffic is allowed between and within address blocks.
Gateways
A gateway directs traffic in and out of network address blocks.
Load Balancing Rules
Forward connections coming into a single address to a pool of resources.
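As an illustration, the forwarding behavior can be sketched with a round-robin rule. Round-robin is only one possible distribution strategy, chosen here as an assumption; the class name RoundRobinRule and the target addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinRule:
    """Forwards each incoming connection to the next target in a fixed pool."""

    def __init__(self, pool):
        targets = list(pool)
        if not targets:
            raise ValueError("the pool must contain at least one target")
        self._targets = cycle(targets)

    def forward(self, connection_id):
        # Return the connection paired with the target chosen for it.
        return connection_id, next(self._targets)


# Hypothetical pool of resources behind a single public address.
rule = RoundRobinRule(["10.0.1.10", "10.0.1.11"])
assignments = [rule.forward(n) for n in range(4)]
print(assignments)
```

Real load balancers typically add health checks and remove failing targets from the pool, which this sketch leaves out.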
Proxies
Accept connections and use rules to transform or route them.
API Gateways
Handle authentication and throttling.
VPNs
Connect different network address blocks across locations so they appear to be part of a single network.
Direct Connections
Dedicated network connections between a cloud network and a data center.
Network Access Rules (Firewall Rules)
Asynchronous Messaging
Queues for messages.
Caches
Storage
Block Storage
Object Storage
Networked File System Storage
Structured Data Storage
Composite Resources
Cloud platforms combine primitive infrastructure resources into composite resources. The line between a primitive resource and a composite resource is arbitrary, as is the line between a composite infrastructure resource and an application runtime service.
Database as a Service
Most infrastructure platforms provide managed Database as a Service (DBaaS) that can be defined and managed as code. They may be standard commercial or open source database applications such as MySQL or PostgreSQL, column stores, document databases, graph databases or distributed key-value stores.
Load Balancing
DNS
Identity Management
Secrets Management
Storage for security-sensitive configuration such as passwords and keys.
Infrastructure Services
Cloud
NIST cloud definition:
Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics (On-demand self-service, Broad network access, Resource pooling, Rapid elasticity, Measured Service); three service models (Cloud Software as a Service (SaaS), Cloud Platform as a Service (PaaS), Cloud Infrastructure as a Service (IaaS)); and, four deployment models (Private cloud, Community cloud, Public cloud, Hybrid cloud). Key enabling technologies include: (1) fast wide-area networks, (2) powerful, inexpensive server computers, and (3) high-performance virtualization for commodity hardware.
Hybrid Cloud
Hybrid cloud means hosting applications and services for a system across both private infrastructure and a public cloud service. This is required for legacy services that can't be migrated to public cloud or for legal reasons, where data must reside in countries the public cloud provider does not have a presence in.
Cloud Agnostic
Systems that can run on multiple public cloud platforms. This is done to avoid lock-in to one vendor, but in practice it often results in lock-in to the software that promises to hide the differences between clouds.
Polycloud
Running an application or service on more than one public cloud platform.
Environment
Containers
Cluster
Cluster as code.
Server
Serverless Execution Environment
Configuration
Configuration Drift
Configuration drift is variation that happens over time across systems that were once identical. Manually making changes in configuration (performance optimizations, permissions, fixes), even if the base was laid down by automation, causes configuration drift. Selectively using automation on some of the initially identical systems, but not on others, also causes configuration drift. This is how snowflake systems come into existence. Also see Minimize Variation. Once manually-introduced configuration drift occurs, the trust in automation goes down, because people are not sure how an automation will modify a manually-changed system. Interestingly, manual configuration creeps in because the automation is not run frequently and consistently, leading to a vicious circle. To avoid this spiral, make everything reproducible automatically and consistently run automation. Operational automation combined with good monitoring exposes configuration drift.
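One way monitoring can expose drift is by comparing the configuration declared in code against what is observed on a running system. The sketch below assumes both are available as flat dictionaries; the function name find_drift, the ABSENT marker and the configuration keys are hypothetical.

```python
# Marker for a key that is present on one side only (an illustrative convention).
ABSENT = "<absent>"


def find_drift(declared, observed):
    """Return {key: (declared_value, observed_value)} for every key that differs."""
    drift = {}
    for key in sorted(declared.keys() | observed.keys()):
        want = declared.get(key, ABSENT)
        have = observed.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical values: one setting was tuned by hand, one flag was added by hand.
declared = {"max_connections": 100, "tls_enabled": True}
observed = {"max_connections": 250, "tls_enabled": True, "debug": True}

drift = find_drift(declared, observed)
print(drift)
```

An empty result means the system still matches its definition. A non-empty result is a signal either to re-run the automation or to move the manual change into code.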
Configuration Registry
A configuration registry is a service that stores configuration values that may be used for many purposes, including service discovery, stack integration, etc. The configuration registry can be used to provide stack configuration values when stacks are being instantiated. Using a configuration registry separates configuration from implementation. Parameters in the registry can be set, used and viewed by other tools. The configuration registry can act as a Configuration Management Database (CMDB), a source of truth for the system configuration.
If the configuration registry properly protects security-sensitive information, this is a significant advantage, because it eliminates the need for another service specialized in protecting secrets.
On the downside, the configuration registry becomes a dependency for other components of the system and possibly a single point of failure. Since such a component is required in disaster recovery, it becomes part of the critical path.
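The set, use and view operations of a configuration registry can be sketched with an in-memory key-value store. This is a toy stand-in under stated assumptions, not the interface of any real registry product: the class name InMemoryConfigRegistry, the path-like key scheme and the stored values are all hypothetical.

```python
class InMemoryConfigRegistry:
    """Toy registry: a key-value store shared by several tools."""

    def __init__(self):
        self._values = {}

    def set(self, key, value):
        self._values[key] = value

    def get(self, key, default=None):
        return self._values.get(key, default)

    def view(self, prefix=""):
        # Return all entries under a key prefix, sorted by key.
        return {
            key: value
            for key, value in sorted(self._values.items())
            if key.startswith(prefix)
        }


registry = InMemoryConfigRegistry()

# A networking stack publishes its outputs (hypothetical keys and values).
registry.set("/prod/network/address_block", "10.0.0.0/16")
registry.set("/prod/network/subnet_count", 3)

# An application stack reads them when it is instantiated.
address_block = registry.get("/prod/network/address_block")
network_config = registry.view("/prod/network/")
print(address_block, network_config)
```

The two stacks never reference each other directly, only the registry, which illustrates both the decoupling and the dependency on the registry being available.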
Configuration Hierarchy
Most tools support a chain of configuration options with a predictable hierarchy of precedence (command line parameters take precedence over environment variables, which take precedence over configuration files).
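The precedence chain can be sketched as a lookup that tries each source in order. The option names, the APP_ environment variable prefix and the resolve_setting helper are assumptions made for illustration; real tools define their own naming rules.

```python
import argparse


def resolve_setting(name, cli_args, env, file_config, default=None):
    """Resolve one setting: command line, then environment, then config file."""
    cli_value = getattr(cli_args, name, None)
    if cli_value is not None:
        return cli_value
    env_key = "APP_" + name.upper()  # hypothetical naming convention
    if env_key in env:
        return env[env_key]
    return file_config.get(name, default)


parser = argparse.ArgumentParser()
parser.add_argument("--region")
parser.add_argument("--instance-count", dest="instance_count")

# Explicit inputs stand in for sys.argv, os.environ and a parsed config file.
cli_args = parser.parse_args(["--region", "eu-west-1"])
env = {"APP_REGION": "us-east-1", "APP_INSTANCE_COUNT": "3"}
file_config = {"region": "ap-south-1", "instance_count": "1", "tier": "small"}

region = resolve_setting("region", cli_args, env, file_config)
instance_count = resolve_setting("instance_count", cli_args, env, file_config)
tier = resolve_setting("tier", cli_args, env, file_config)
print(region, instance_count, tier)
```

Here region comes from the command line, instance_count from the environment and tier from the configuration file, because each value is taken from the highest-precedence source that defines it.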
Governance
Lightweight Architectural Governance
Lightweight architectural governance aims to balance autonomy and centralized control. More in EDGE: Value-Driven Digital Transformation by Jim Robert Highsmith, Linda Luu and David Robinson, and in the talk "The Goldilocks zone of lightweight architectural governance" by Jonny LeRoy.
Security
- IaC Chapter 3. Infrastructure Platform → Network Resources → Zero-trust security model with SDN.
Organizatorium
- Integration points