Amazon Elastic File System Concepts

From NovaOrdis Knowledge Base

Revision as of 01:53, 20 August 2020

Internal

Overview

Amazon Elastic File System (EFS) provides a scalable file system for Linux-based workloads, for use with AWS cloud services and on-premises resources. More details about the file system itself are available in the EFS File System section below.

EFS is built to scale on demand to petabytes without disrupting applications, growing and shrinking automatically as files are added and removed. It is designed to provide massively parallel shared access to thousands of Amazon EC2 instances, so applications can achieve high aggregate throughput and IOPS with consistently low latency. Amazon EFS is a fully managed service that exposes a standard file system interface, so existing applications and tools work without changes. It is a regional service that stores data within and across multiple Availability Zones. File systems can be accessed across Availability Zones and Regions, and from on-premises servers via AWS Direct Connect or AWS VPN.

Amazon EFS is well suited to a broad spectrum of use cases, from highly parallelized, scale-out workloads that require the highest possible throughput to single-threaded, latency-sensitive workloads. Typical use cases include lift-and-shift enterprise applications, big data analytics, web serving and content management, application development and testing, media and entertainment workflows, database backups, and container storage.

EFS and EC2

EFS File System

The EFS file system materializes as a POSIX-compliant file system mounted in an EC2 instance, a Docker container or an EKS pod. The most common use case is to mount EFS file systems on instances running in the AWS Cloud, but it is also possible to mount them on servers running in an on-premises data center, over AWS Direct Connect or AWS VPN.

The persistent storage backing the EFS file system is managed by the AWS infrastructure and is not directly accessible. Instead, an EFS file system is accessed from a VPC via a mount target. EFS file systems are mounted over NFSv4.0 or NFSv4.1, so an appropriate NFS client must be available in the environment attempting to mount the file system.

An EFS file system is a primary resource. It has an ID, a creation token, a creation time and a lifecycle state.

Each file system has a DNS name of the following form:

<file-system-id>.efs.<aws-region>.amazonaws.com
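
As a minimal sketch, the DNS name pattern above can be assembled from its two parts. The function name, the file system ID and the Region used here are hypothetical and only illustrate the format.

```python
def efs_dns_name(file_system_id: str, region: str) -> str:
    """Build the DNS name of an EFS file system from its ID and AWS Region."""
    if not file_system_id or not region:
        raise ValueError("file_system_id and region must be non-empty")
    return f"{file_system_id}.efs.{region}.amazonaws.com"


# Hypothetical file system ID and Region, for illustration only.
print(efs_dns_name("fs-12345678", "us-east-1"))
# prints: fs-12345678.efs.us-east-1.amazonaws.com
```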

Mount Target

A mount target can be logically thought of as an NFS server with a fixed IP address that exists in a particular subnet in a VPC. The mount target provides the IP address for the NFSv4 endpoint at which the EFS file system can be mounted. The mount target IP address is associated with a DNS name. Mount targets are designed to be highly available.
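
Because a mount target behaves like an NFS server endpoint, mounting comes down to a regular NFS mount against its IP address or DNS name. The sketch below only builds the command line and does not run it. The helper name, the IP address and the mount point are hypothetical, and the option list is the set commonly recommended for EFS over NFSv4.1, which should be treated as a starting point rather than a requirement.

```python
import shlex

# Mount options commonly recommended for EFS over NFSv4.1 (assumption: adjust as needed).
DEFAULT_OPTIONS = (
    "nfsvers=4.1",
    "rsize=1048576",
    "wsize=1048576",
    "hard",
    "timeo=600",
    "retrans=2",
    "noresvport",
)


def nfs_mount_command(endpoint, mount_point, options=DEFAULT_OPTIONS):
    """Return the argument list of an NFS mount against a mount target endpoint.

    `endpoint` is the mount target IP address or the file system DNS name.
    """
    return ["mount", "-t", "nfs4", "-o", ",".join(options), f"{endpoint}:/", mount_point]


# Hypothetical mount target IP address and local mount point.
print(shlex.join(nfs_mount_command("10.0.1.25", "/mnt/efs")))
```

The resulting command has to be executed with root privileges on a host that has an NFS client installed and network access to the mount target.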

A file system can have one mount target in each Availability Zone. If an Availability Zone contains multiple subnets, the mount target is created in one of those subnets, and all EC2 instances in that Availability Zone share it. An EFS file system can only have mount targets in one VPC at a time.

Each mount target has an ID, the ID of the subnet in which it was created, the ID of the file system for which it was created, the IP address at which the file system may be mounted, VPC security groups and a state. The mount target is a sub-resource of a file system: it can only be created within the context of an existing file system.
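
The attributes listed above can be modeled as a small record, which also makes the one-VPC rule easy to express. This is an illustrative sketch, not the AWS API: the class, field and function names are hypothetical, and the `vpc_id` field is an assumption added only so that the rule can be checked.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class MountTarget:
    mount_target_id: str
    file_system_id: str
    subnet_id: str
    ip_address: str
    vpc_id: str  # assumption: not in the attribute list above, added for the VPC check
    security_groups: Tuple[str, ...] = ()
    state: str = "available"


def check_single_vpc(targets: Iterable[MountTarget]) -> None:
    """Raise ValueError if the given mount targets span more than one VPC."""
    vpcs = {t.vpc_id for t in targets}
    if len(vpcs) > 1:
        raise ValueError(f"mount targets span multiple VPCs: {sorted(vpcs)}")


# Hypothetical identifiers, for illustration only.
a = MountTarget("fsmt-a", "fs-12345678", "subnet-a", "10.0.1.25", "vpc-1")
b = MountTarget("fsmt-b", "fs-12345678", "subnet-b", "10.0.2.25", "vpc-1")
check_single_vpc([a, b])  # passes: both mount targets are in vpc-1
```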

Efs.png

Access Point

https://docs.aws.amazon.com/efs/latest/ug/efs-access-points.html

An access point is an abstraction that applies an operating system user, group and file system path to any file system request made through it. The access point's operating system user and group override any identity information provided by the NFS client. An access point can also enforce a different root directory for the file system: the configured file system path is exposed to the client as the access point's root directory, so clients can only access data in that directory and its subdirectories. This ensures that each application always uses the correct operating system identity and the correct directory when accessing shared file-based datasets.
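
The two effects described above, identity override and root directory enforcement, can be illustrated with a toy model. This is not how EFS implements access points; the class and function names, the IDs and the paths are hypothetical.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPoint:
    uid: int
    gid: int
    root_directory: str  # path inside the file system exposed as "/" to clients


def apply_access_point(ap, client_uid, client_gid, requested_path):
    """Return the (uid, gid, path) that a request made through `ap` resolves to."""
    # The client-supplied identity is ignored: the access point's identity wins.
    # The client path is interpreted relative to the access point root; normalizing
    # it as an absolute path drops any leading ".." so it cannot climb above the root.
    relative = posixpath.normpath("/" + requested_path.lstrip("/"))
    effective_path = posixpath.normpath(
        posixpath.join(ap.root_directory, relative.lstrip("/"))
    )
    return ap.uid, ap.gid, effective_path


ap = AccessPoint(uid=1001, gid=1001, root_directory="/apps/billing")
print(apply_access_point(ap, 0, 0, "/reports/q1.csv"))
# prints: (1001, 1001, '/apps/billing/reports/q1.csv')
```

In this model a request from root (uid 0) is still served as uid 1001, and a path such as `../../etc/passwd` resolves to `/apps/billing/etc/passwd` rather than escaping the configured directory.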

Data Sharing and Consistency

An EFS file system can be accessed concurrently by multiple NFS clients deployed in multiple Availability Zones of the same AWS Region.

Amazon EFS provides the close-to-open consistency semantics that applications expect from NFS.

In Amazon EFS, write operations are durably stored across Availability Zones in these situations:

  1. An application performs a synchronous write operation (for example, by opening the file with the Linux open system call and the O_DIRECT flag, or by calling the Linux fsync system call).
  2. An application closes a file.
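
Both situations in the list above can be seen in a short sketch that writes a file, calls fsync and then closes it. The function name is hypothetical and the code is ordinary POSIX file I/O, not EFS-specific; it uses fsync rather than O_DIRECT because O_DIRECT imposes buffer alignment requirements.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write `data` to `path`, forcing it to stable storage before returning."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop until done.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Situation 1: a synchronous write, requested explicitly with fsync.
        os.fsync(fd)
    finally:
        # Situation 2: closing the file.
        os.close(fd)
```

A call such as `write_durably("/mnt/efs/state.json", payload)` would use a hypothetical path on a mounted EFS file system; the function behaves the same way on any local file system.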

Depending on the access pattern, Amazon EFS can provide stronger consistency guarantees than close-to-open semantics. Applications that perform synchronous data access and non-appending writes get read-after-write consistency for data access.

EFS Operations

  • Create a file system