Terraform Concepts

From NovaOrdis Knowledge Base

Overview

Terraform is a tool for building, changing and managing infrastructure as code. It uses a configuration language named HashiCorp Configuration Language (HCL). Terraform is platform agnostic, which it achieves by driving different provider APIs for resource provisioning through plug-ins. A heterogeneous environment can therefore be managed with the same workflow.

Workflow

The typical Terraform workflow is:

  • Scope - define what resources are needed.
  • Author - create the configuration file in HCL.
  • Initialize - run terraform init in the project directory that contains the configuration files. This downloads the required provider plugins.
  • Plan and Apply - terraform plan (verification) and then terraform apply.

Configuration

The set of files used to describe infrastructure is known as Terraform configuration. Configuration files have a .tf extension.
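As an illustration, a minimal configuration could consist of a single file such as the one sketched below. The file name main.tf, the region and the AMI ID are assumptions made for the example, not values taken from this page.

```hcl
# main.tf - a minimal configuration sketch.
# The region and the AMI ID are placeholders chosen for illustration only.
provider "aws" {
  region = "us-east-1"
}

# One managed resource: type "aws_instance", local name "example".
resource "aws_instance" "example" {
  ami           = "ami-00000000000000000" # hypothetical AMI ID
  instance_type = "t2.micro"
}
```

Running terraform init, terraform plan and terraform apply in the directory that contains this file follows the workflow described above.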

"Configuration" is an important concept, and HashiCorp documentation refers to it repeatedly. A somewhat appropriate synonym for it would be "infrastructure project". Terraform was built to help manage and enact change. The configuration is changed locally and Terraform builds an execution plan that only modifies what is necessary to reach the desired state. Configuration and state can be version controlled (TODO: document how). Changes in configuration are also “applied” with terraform apply.

HashiCorp Configuration Language (HCL)

HCL is human-readable. Configuration can also be written in JSON, but JSON is only recommended when the configuration is generated by a machine. HCL is the declarative language that drives the provider APIs for resource provisioning. It contains support for input variables, output variables, etc. For more details, see:

Hashicorp Configuration Language
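A short sketch of input and output variables, assuming Terraform 0.12 or later syntax. The variable name, the output name and the AMI ID are hypothetical.

```hcl
# Input variable: a value supplied by the caller, with a default.
variable "instance_type" {
  description = "EC2 instance type to launch"
  type        = string
  default     = "t2.micro"
}

resource "aws_instance" "example" {
  ami           = "ami-00000000000000000" # hypothetical AMI ID
  instance_type = var.instance_type       # reads the input variable
}

# Output variable: a value exposed once the configuration is applied.
output "instance_id" {
  value = aws_instance.example.id
}
```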

Provider

https://www.terraform.io/docs/providers/index.html

A provider is responsible for creating and managing resources. Terraform uses provider plug-ins to translate its configuration into API instructions for the provider. A provider is specified in a "provider" block in a configuration file. Multiple provider blocks can exist in a Terraform configuration file.
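The sketch below shows one way multiple provider blocks might coexist, using the alias argument to distinguish two configurations of the same provider. The regions, the alias name and the AMI ID are assumptions for the example.

```hcl
# Default configuration of the AWS provider.
provider "aws" {
  region = "us-east-1"
}

# A second configuration of the same provider, told apart by its alias.
provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

# This resource selects the aliased provider configuration explicitly.
resource "aws_instance" "example_west" {
  provider      = aws.west
  ami           = "ami-00000000000000000" # hypothetical AMI ID
  instance_type = "t2.micro"
}
```

Resources that do not set the provider argument use the default (un-aliased) configuration.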

Provider Plug-In

Provider-specific resources are managed with provider plugins. Each provider plugin is an encapsulated binary, distributed separately from Terraform. Plugins are downloaded by terraform init and stored in a subdirectory of the current working directory.

Available Providers

AWS

Terraform AWS Provider

Kubernetes

https://www.terraform.io/docs/providers/kubernetes/index.html

Helm

https://www.terraform.io/docs/providers/helm/index.html

Resource

https://www.terraform.io/docs/configuration/resources.html

A Terraform resource represents an actual resource that exists in the infrastructure. A resource can be a physical component, such as an EC2 instance, or a logical resource, such as an application. A Terraform resource has a type and a name. In a configuration file, a resource is described in a "resource" block.

The primary kind of resource, declared by a resource block, is known as a managed resource. A managed resource is different from a data resource, which provides read-only data exposed as a data source. Both kinds of resources take arguments and export attributes for use in configuration, but while managed resources cause Terraform to create, update, and delete infrastructure objects, data resources cause Terraform only to read objects. For brevity, managed resources are often referred to just as "resources" when the meaning is clear from context.

resource "resource-type" "local-name" {
  ...
}

Resource Type

The resource type and name together serve as an identifier for a given resource and so must be unique within a module.

Resource Name

The resource name is used to refer to this resource from elsewhere in the same Terraform module, but has no significance outside of the scope of a module. The resource type and name together serve as an identifier for a given resource and so must be unique within a module.

Resource Syntax

Resource HCL Syntax Details

Resource Dependencies

https://learn.hashicorp.com/terraform/getting-started/dependencies

Resource parameters may use information from other resources. This relationship is expressed syntactically via an interpolation expression.

instance = aws_instance.example.id

If the resources are not dependent, they can be created in parallel, which will be done by Terraform whenever possible.

Implicit Dependency

Implicit dependencies via interpolation expressions are the primary way to inform Terraform about these relationships and should be used whenever possible.
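A minimal sketch of an implicit dependency, assuming Terraform 0.12 or later syntax and a hypothetical AMI ID. Depending on the AWS provider version, the aws_eip resource may need additional arguments.

```hcl
resource "aws_instance" "example" {
  ami           = "ami-00000000000000000" # hypothetical AMI ID
  instance_type = "t2.micro"
}

# The reference to aws_instance.example.id tells Terraform that the
# Elastic IP depends on the instance, so the instance is created first.
resource "aws_eip" "ip" {
  instance = aws_instance.example.id
}
```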

Explicit Dependency

Explicit dependencies are expressed with “depends_on”. They are needed when the dependency is not visible to Terraform, for example when it is configured inside the application code, and so has to be explicitly mirrored in the infrastructure configuration.

depends_on = [aws_s3_bucket.example]
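A sketch of an explicit dependency, assuming an application on the instance that expects the bucket to exist. The bucket name and the AMI ID are hypothetical.

```hcl
resource "aws_s3_bucket" "example" {
  bucket = "hypothetical-app-bucket-name" # placeholder bucket name
}

resource "aws_instance" "example" {
  ami           = "ami-00000000000000000" # hypothetical AMI ID
  instance_type = "t2.micro"

  # No argument of the instance refers to the bucket, so the
  # dependency has to be declared explicitly.
  depends_on = [aws_s3_bucket.example]
}
```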

Tainted Resource

When provisioning fails, resources are marked as "tainted". Resources can be manually tainted with the “taint” command. This command does not modify infrastructure, but it modifies the state file to mark the resource as tainted – the next plan will show that the resource will be destroyed and recreated.

Data Source

https://www.terraform.io/docs/configuration/data-sources.html

A data source allows data to be fetched or computed for use in Terraform configuration, in a read-only manner, from a data resource. The underlying resource is queried, but not created, updated or destroyed, unlike in the managed resource case. Use of data sources allows a Terraform configuration to make use of information defined outside of Terraform, or defined by another separate Terraform configuration. A data source is accessed via a special kind of resource known as a data resource, declared with a data block.

data "data-source-name" "local-name" {
  ...
}

The data block requests Terraform to read from a given data source (for example "aws_ami") and export the result under the given local name. The name is used to refer to this resource from elsewhere in the same Terraform module, but has no significance outside of the scope of a module. The data source and name together serve as an identifier for a given resource and so must be unique within a module.
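A sketch of a data block and of a managed resource that consumes it, assuming Terraform 0.12 or later syntax. The filter values and the local names are illustrative assumptions.

```hcl
# Read-only lookup: find the most recent matching AMI owned by Amazon.
data "aws_ami" "example" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"] # illustrative name pattern
  }
}

# The managed resource uses an attribute exported by the data resource.
resource "aws_instance" "web" {
  ami           = data.aws_ami.example.id
  instance_type = "t2.micro"
}
```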

Provisioning

In this context, provisioning means initialization of the resources created by the “apply” step by performing software provisioning. Another name for provisioning is instance initialization.

Provisioner

https://www.terraform.io/docs/provisioners/index.html

A provisioner uploads files, runs shell scripts, or installs and triggers other software such as configuration management tools. By default, a provisioner runs only when the resource is created. A provisioner is declared inside a resource block with the “provisioner” keyword.

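resource "aws_instance" "example" {
  ...
  # Write the instance's public IP to a local file after the instance is created.
  # "self" refers to the resource that encloses the provisioner.
  provisioner "local-exec" {
    command = "echo ${self.public_ip} > ip_address.txt"
  }
}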

Multiple provisioner blocks can be added.

Failed Provisioner

https://learn.hashicorp.com/terraform/getting-started/provision#failed-provisioners-and-tainted-resources

If a resource is successfully created but fails during provisioning, it is marked as “tainted”.

Available Provisioners

Module

A module is a self-contained package of Terraform configuration that is managed as a group. Modules are used to create reusable components and to treat pieces of infrastructure as a black box. Module semantics changed in Terraform 0.12. Modules can be nested to decompose complex systems into manageable components. A module may include automated tests, examples and documentation.

A good module should raise the level of abstraction by describing a new concept in your architecture that is constructed from resource types offered by providers. HashiCorp documentation recommends against writing modules that are just thin wrappers around existing resources. If you have trouble finding a name for your module that isn't the same as the main resource type inside it, that may be a sign that the module is not creating any new abstraction and is only adding unnecessary complexity. In that case, use the resource type directly in the calling module instead.

Root Module

When terraform apply is executed, all .tf files in the working directory from which terraform is run form the root module. The root module may call other modules and connect them together by passing output values from one to input values of another.

Using a Module

https://www.terraform.io/docs/configuration/modules.html

To call a module from a dependent module means to include the contents of that module in the configuration, with specific values for its input variables. The intention to call (or use) a module is declared in a module block within the dependent module. The block contains the source and a set of input values, which are listed in the module's "Inputs" documentation. The only required attribute is source, which tells Terraform where the module can be retrieved. It is also highly recommended to specify the module's version. Terraform automatically downloads and manages modules. It can retrieve them from a variety of sources, including the local filesystem, Terraform Registry, private module registries, Git and HTTP. For more details see Accessing a Remote Module below.

terraform {
  required_version = "0.11.11"
}

provider "aws" {
  ...
}

module "consul" {
  source      = "hashicorp/consul/aws"
  version     = "0.7.3"
  num_servers = "3"
}

Accessing a Remote Module

https://www.terraform.io/docs/modules/sources.html

Terraform can retrieve modules from a variety of remote sources, including Terraform Registry, private module registries, GitHub, Git and HTTP.

GitHub

source = "github.com/hashicorp/terraform-aws-consul/modules/consul-cluster?ref=v0.7.3"

TODO This did not work, more research is needed. The error message was:

Error: Failed to download module

Could not download module "test" (root.tf:2) source code from
"github.com/hashicorp/terraform-aws-consul/modules/consul-cluster?ref=v0.7.3":
subdir "modules/consul-cluster%253Fref=v0.7.3" not found
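The error above suggests that the whole "modules/consul-cluster?ref=v0.7.3" string was interpreted as a subdirectory name. Terraform's module source syntax separates a subdirectory inside a repository from the repository address with a double slash (//), so a likely fix is sketched below. This has not been verified against this particular module, and the input variables the module requires are omitted.

```hcl
module "test" {
  # "//" marks the start of the subdirectory within the repository;
  # "?ref=" selects the Git revision.
  source = "github.com/hashicorp/terraform-aws-consul//modules/consul-cluster?ref=v0.7.3"

  # Input variables required by the module are not shown here.
}
```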

Module Examples (from Terraform Registry)

https://registry.terraform.io/modules/hashicorp/consul/aws/0.7.3
https://github.com/hashicorp/terraform-aws-consul

Using a Module

TODO

Module Versioning

Writing a Terraform Module - Module Versioning

Module Syntax

Module HCL Syntax Details

Module Initialization

If a module is referred to in configuration, it is necessary to run, or re-run, terraform init, which obtains and installs the new module's source code.

Module Outputs

A module's outputs are values produced by the module, for example the ID of each resource it creates. An output is referenced as:

${module.module-name.output-name}
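A sketch of how an output might be declared in a child module and consumed by the calling module, assuming Terraform 0.12 or later, where the reference can be written without the ${...} wrapper. The module path, the names and the CIDR ranges are hypothetical.

```hcl
# --- modules/network/main.tf (hypothetical child module) ---
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Expose the VPC ID to callers of the module.
output "vpc_id" {
  value = aws_vpc.main.id
}

# --- main.tf in the root module (hypothetical) ---
module "network" {
  source = "./modules/network"
}

# module.<module-name>.<output-name> reads the child module's output.
resource "aws_subnet" "example" {
  vpc_id     = module.network.vpc_id
  cidr_block = "10.0.1.0/24"
}
```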

Module Destruction

All resources created by the module will be destroyed.

Writing a Module

Writing a Terraform Module

Terraform Registry

Terraform Registry includes ready-to-use modules for various common purposes - they can serve as larger building-blocks for the infrastructure.

https://registry.terraform.io/

State

https://www.terraform.io/docs/state/index.html

The normal Terraform workflow consists of reading configuration, which represents codified infrastructure, and enacting this specification by instantiating or changing managed resources. Terraform modifies the state of the platform it acts upon. Normally, there should be no need to represent that state, as it is reflected by the managed resources themselves. However, accessing resources to read their state every time that state is needed is impractical and inefficient, especially when the size of the problem is large.

The solution Terraform comes up with is to represent and cache the managed resources state. That representation is known as "the state". The state is used by Terraform to map the real world resources to configuration. This solution improves performance for large infrastructures.

By default, the state is stored as a JSON file named terraform.tfstate.


Relationship between state and configuration.

Backend

Local Backend

.terraform Directory

The .terraform directory is created by default in the root module by the terraform init command and has the following structure:

.
└── plugins
    └── darwin_amd64
        ├── lock.json
        └── terraform-provider-aws_v2.49.0_x4

.terraform contains the plugins subdirectory.

.terraform.tfstate.lock.info

The file is created by default in the root module while an operation that modifies state is underway.

{
  "ID":"121f201d-a963-35fd-5d93-9061a4168511",
  "Operation":"OperationTypeApply",
  "Info":"",
  "Who":"ovidiufeodorov@Ovidiu-Feodorov.local",
  "Version":"0.12.13",
  "Created":"2020-02-20T02:38:55.865036Z",
  "Path":"terraform.tfstate"
}

terraform.tfstate

The file is created by default in the root module. It is a JSON file.

terraform.tfstate.backup

The file is created by default in the root module.

Remote Backend

What is it?

Why is it needed?

Where it is maintained: locally or remotely.

Backends

Workspace.

Need for state.

terraform.tfstate State File

A state file is created when the project is first initialized. It is maintained in the root of the project as terraform.tfstate. State is used to create plans and manage changes to infrastructure. Prior to any operation, the state is refreshed from the real infrastructure – making the state the source of truth. The content of the file can be inspected with terraform show.

What about state.tfstate. Is this a standard file? What role does it play?

Backend

https://www.terraform.io/docs/backends/

S3 Backend

S3 Backend

Remote State

https://www.terraform.io/docs/state/remote.html

It is recommended to set up remote state.
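As an illustration, remote state could be configured with the S3 backend roughly as sketched below. The bucket name, the key and the region are hypothetical, and the bucket is assumed to exist already.

```hcl
terraform {
  # Store the state in an S3 bucket instead of a local terraform.tfstate file.
  backend "s3" {
    bucket = "hypothetical-terraform-state-bucket" # placeholder bucket name
    key    = "example/terraform.tfstate"           # path of the state object
    region = "us-east-1"
  }
}
```

After a backend block is added or changed, terraform init has to be run again so that Terraform can initialize the new backend.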

State Locking

https://www.terraform.io/docs/state/locking.html

Workspace

https://www.terraform.io/docs/state/workspaces.html