GlusterFS Concepts
Internal
Overview
GlusterFS is a storage technology that uses volumes hosted on different servers to construct a distributed and/or replicated network file system. GlusterFS aggregates existing file systems into one or more large pools, so that data written to or read from GlusterFS is distributed across multiple hosts. The file system is fully POSIX compliant. It also supports other storage paradigms, such as block storage and object storage. GlusterFS stores the data on stable kernel file systems like ext4 or XFS (recommended), and offers the storage for consumption as volumes. GlusterFS allows for rapid provisioning of additional storage and incorporates automatic failover. All of this is implemented without an additional metadata server; instead, GlusterFS uses a unique hash tag for each file, stored within the file system itself.
GlusterFS is free and open source software and can utilize common off-the-shelf hardware.
Volume
A volume is a logical share exposed to clients. A volume may consist of several subvolumes, generally hosted by different servers. These servers host kernel-space file systems.
Volume Types
The volume types below can be mixed.
Distributed Volume
A distributed volume spreads the data across the available bricks. If 100 files are written to a volume built from two bricks, on average 50 will end up on one brick and 50 on the other. The system is similar to RAID 0 for physical disks.
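The placement idea can be illustrated with a minimal Python sketch. It is not GlusterFS's actual hashing algorithm: it assumes a simple MD5-modulo mapping, and the brick names and the pick_brick helper are hypothetical. It only shows how a stable hash of the file name can select a brick without consulting a metadata server.

```python
import hashlib
from collections import Counter
from typing import Sequence


def pick_brick(file_name: str, bricks: Sequence[str]) -> str:
    """Map a file name to one brick using a stable hash (illustrative only)."""
    digest = hashlib.md5(file_name.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a brick index.
    index = int.from_bytes(digest[:8], "big") % len(bricks)
    return bricks[index]


# Hypothetical brick names for a two-brick distributed volume.
bricks = ["server1:/bricks/b1", "server2:/bricks/b1"]

# Count how many of 100 files land on each brick.
placement = Counter(pick_brick(f"file-{i}.txt", bricks) for i in range(100))
for brick, count in sorted(placement.items()):
    print(brick, count)
```

Because the mapping depends only on the file name and the brick list, any client can compute where a file lives. The split is roughly even on average, not exactly 50/50 for any particular set of names.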
Replicated Volume
A replicated volume transparently replicates the data with configurable multiplicity (number of file replicas). The multiplicity is a characteristic of a volume, chosen when the volume is created.
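As a rough illustration of how multiplicity relates to bricks, the Python sketch below groups a list of bricks into replica sets of a fixed size. It assumes that bricks are grouped in the order they are listed and that the brick count must be a multiple of the replica count; the replica_sets helper and the brick names are hypothetical.

```python
from typing import List, Sequence


def replica_sets(bricks: Sequence[str], replica_count: int) -> List[List[str]]:
    """Group bricks, in listed order, into sets that each hold the same data."""
    if replica_count < 1 or len(bricks) % replica_count != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [
        list(bricks[i:i + replica_count])
        for i in range(0, len(bricks), replica_count)
    ]


# Hypothetical bricks on four servers, with two replicas of every file.
bricks = ["s1:/b", "s2:/b", "s3:/b", "s4:/b"]
sets = replica_sets(bricks, 2)
print(sets)  # [['s1:/b', 's2:/b'], ['s3:/b', 's4:/b']]
```

With four bricks and a multiplicity of two, the result is two replica sets. Distributing files across those sets is one way the distributed and replicated volume types can be mixed.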
Striped Volume
Subvolume
A subvolume is built from a brick.
Brick
A brick is a device, in most cases a file system, that is used for GlusterFS storage.
Translator
Trusted Pool
A trusted pool refers collectively to the hosts in a given cluster.
Node
A node or server refers to any server that is part of the trusted pool.
Failover
Failover is a default feature. If one of the servers goes down, access to data is not lost. No manual steps are required for failover. When the server that failed is fixed, you don’t have to do anything to get the data back except wait. In the meantime, the most current copy of your data keeps getting served from the node that was still running.
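The behavior described above can be modeled with a small Python simulation. This is a toy model, not GlusterFS code: the ReplicaSet class, its methods, and the server names are hypothetical, and the recover step is a simplified stand-in for the automatic healing that happens when a failed server returns.

```python
class ReplicaSet:
    """Toy model of two or more servers holding copies of the same files."""

    def __init__(self, server_names):
        self.data = {name: {} for name in server_names}
        self.online = {name: True for name in server_names}

    def write(self, path, content):
        # Writes go to every server that is currently reachable.
        targets = [name for name in self.data if self.online[name]]
        if not targets:
            raise RuntimeError("no replica available")
        for name in targets:
            self.data[name][path] = content

    def read(self, path):
        # Reads are served by any reachable server that has the file.
        for name in self.data:
            if self.online[name] and path in self.data[name]:
                return self.data[name][path]
        raise FileNotFoundError(path)

    def fail(self, name):
        self.online[name] = False

    def recover(self, name):
        # Copy the current state from a server that stayed up, then rejoin.
        source = next((n for n in self.data if self.online[n]), None)
        if source is not None:
            self.data[name] = dict(self.data[source])
        self.online[name] = True


pair = ReplicaSet(["server1", "server2"])
pair.write("/a.txt", "v1")
pair.fail("server1")            # server1 goes down
pair.write("/a.txt", "v2")      # clients keep writing without manual steps
current = pair.read("/a.txt")   # served by server2, the node still running
pair.recover("server1")         # server1 returns and catches up
```

After recover, both servers hold the most recent content again, which mirrors the "wait and the data comes back" behavior described above.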
Organizatorium
Provides two interfaces to storage: a POSIX file system and a REST gateway for object storage support.
Provides high availability of data and metadata.
Heketi provides a RESTful management interface that can be used to manage the life cycle of GlusterFS volumes: https://github.com/heketi/heketi