GlusterFS Concepts

From NovaOrdis Knowledge Base

External

Internal

Overview

GlusterFS is a storage technology that uses volumes hosted on different servers to construct a distributed and/or replicated network file system. GlusterFS aggregates existing file systems into one or more large volumes so that data written into or read out of GlusterFS is distributed across multiple hosts simultaneously. The file system is fully POSIX compliant. It also supports other storage paradigms such as Block Storage and Object Storage. GlusterFS stores the data on stable kernel file systems like ext4 or XFS (recommended), and offers the storage for consumption as volumes. GlusterFS allows for rapid provisioning of additional storage. It incorporates automatic failover. All of this is implemented without the use of an additional metadata server; instead, GlusterFS uses a unique hash tag for each file, stored within the file system itself.

GlusterFS is free and open source software and can utilize common off-the-shelf hardware.

Volume

A volume is a logical share exposed to clients. A volume may consist of several bricks, generally hosted by different servers. The servers in question host kernel-space file systems. A volume is also known as a global namespace. A volume is also analogous to an /etc/exports entry for NFS.

Volume Types

The volume types below can be mixed.

Distributed Volume

A distributed volume spreads the data across the available bricks. If 100 files are written to a volume built from two bricks, on average 50 will end up on one brick and 50 on the other. The system is similar to RAID 0 for physical disks, except that whole files, not blocks, are distributed.
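The roughly even split across bricks follows from placing files by a hash of their names. The sketch below illustrates the idea under a simplifying assumption: the brick is chosen as the MD5 of the file name modulo the number of bricks. This is not GlusterFS's actual hashing algorithm, and the brick names are hypothetical.

```python
import hashlib
from collections import Counter

# Hypothetical bricks; real bricks are identified as server:/path.
BRICKS = ["server1:/bricks/b1", "server2:/bricks/b1"]


def pick_brick(filename, bricks=BRICKS):
    """Map a file name to one brick using a stable hash (illustration only)."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(bricks)
    return bricks[index]


def distribute(filenames, bricks=BRICKS):
    """Count how many files land on each brick."""
    return Counter(pick_brick(name, bricks) for name in filenames)


if __name__ == "__main__":
    files = [f"file-{i:03d}.dat" for i in range(100)]
    for brick, count in sorted(distribute(files).items()):
        print(brick, count)
```

Because the placement depends only on the file name, any client can compute where a file lives without asking a central metadata server.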

Replicated Volume

A replicated volume transparently replicates the data with configurable multiplicity (number of file replicas). The multiplicity is a characteristic of a volume, chosen when the volume is created.
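As an illustration of how the multiplicity is fixed at creation time, the sketch below groups bricks into replica sets and builds, without running it, a `gluster volume create` command line. It assumes that bricks are grouped into replica sets in the order they are listed. The volume and brick names are hypothetical, and the helper functions are not part of any GlusterFS library.

```python
def replica_sets(bricks, replica):
    """Group bricks, in listed order, into replica sets of size `replica`."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def create_command(volume, bricks, replica):
    """Build (but do not run) the gluster CLI call that creates the volume."""
    replica_sets(bricks, replica)  # validates the brick count
    return ["gluster", "volume", "create", volume,
            "replica", str(replica), *bricks]


if __name__ == "__main__":
    # Hypothetical bricks on four servers.
    bricks = [f"server{i}:/bricks/b1" for i in range(1, 5)]
    for group in replica_sets(bricks, 2):
        print("replica set:", group)
    print(" ".join(create_command("vol0", bricks, 2)))
```

With four bricks and a multiplicity of 2, the result is two replica sets with files distributed between them, which is one way the distributed and replicated types are mixed.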

Striped Volume

Brick

A brick is a device, in most cases a file system, that is used for GlusterFS storage. Each brick of every volume on a host requires its own port.
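The one-port-per-brick rule matters mostly when opening firewall ports. The sketch below estimates the port range a host needs. It assumes that brick ports are handed out contiguously starting at 49152, which is the usual base in recent GlusterFS releases; the actual assignment is made by the GlusterFS daemon and should be verified on the host.

```python
BASE_PORT = 49152  # assumption: first brick port in recent GlusterFS releases


def brick_port_range(brick_count, base_port=BASE_PORT):
    """Return (first, last) port for `brick_count` bricks on one host."""
    if brick_count < 1:
        raise ValueError("a host needs at least one brick")
    return base_port, base_port + brick_count - 1


if __name__ == "__main__":
    # Example: a host carrying one brick for each of three volumes.
    first, last = brick_port_range(3)
    print(f"open TCP ports {first}-{last}")
```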

Translator

Trusted Pool

A trusted pool refers collectively to the hosts in a given cluster.

Node

A node or server refers to any server that is part of the trusted pool.

Failover

Failover is a default feature. If one of the servers goes down, access to data is not lost. No manual steps are required for failover. When the server that failed is fixed, you don’t have to do anything to get the data back except wait. In the meantime, the most current copy of your data keeps getting served from the node that was still running.
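The behavior described above can be modeled with a toy simulation. The classes and functions below are hypothetical stand-ins, not GlusterFS code: writes go to every replica that is up, reads are served by a surviving replica, and a returning server is brought up to date before it serves data again.

```python
class Replica:
    """Hypothetical stand-in for one server holding a copy of the data."""

    def __init__(self, host):
        self.host = host
        self.files = {}
        self.up = True


def write(replicas, path, content):
    """Write to every replica that is currently up."""
    for replica in replicas:
        if replica.up:
            replica.files[path] = content


def read(replicas, path):
    """Serve the file from the first replica that is up and holds it."""
    for replica in replicas:
        if replica.up and path in replica.files:
            return replica.files[path]
    raise IOError(f"no available replica holds {path}")


def bring_back(returning, source):
    """Copy current data to a repaired server, then mark it as up."""
    returning.files.update(source.files)
    returning.up = True


if __name__ == "__main__":
    a, b = Replica("server1"), Replica("server2")
    write([a, b], "/report.txt", "v1")
    a.up = False                       # server1 fails
    write([a, b], "/report.txt", "v2")
    print(read([a, b], "/report.txt"))  # served by server2
    bring_back(a, b)                   # server1 returns and catches up
    print(a.files["/report.txt"])
```

In this sketch the catch-up step is an explicit function call; in GlusterFS, as noted above, it requires no manual action.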

Organizatorium

Provides two interfaces to storage: a POSIX file system and a REST gateway for object storage support.

Provides high availability of data and metadata.

Heketi provides a RESTful management interface that can be used to manage the life cycle of GlusterFS volumes: https://github.com/heketi/heketi
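To give a feel for what managing a volume over REST looks like, the sketch below builds, but does not send, a volume creation request. The endpoint path (`/volumes`), the JSON fields (`size`, `durability`), and the server URL are assumptions based on Heketi's published API and should be checked against the Heketi documentation for the version in use. Authentication is omitted.

```python
import json
import urllib.request

# Hypothetical Heketi endpoint; adjust to the actual deployment.
HEKETI_URL = "http://heketi.example.com:8080"


def build_create_volume_request(size_gb, replica, base_url=HEKETI_URL):
    """Build (but do not send) a request asking Heketi for a replicated volume."""
    body = {
        "size": size_gb,  # assumed to be expressed in GB
        "durability": {
            "type": "replicate",
            "replicate": {"replica": replica},
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/volumes",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_create_volume_request(size_gb=10, replica=3)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` once the endpoint and credentials are known.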