Linux 7 Storage Concepts: Difference between revisions

From NovaOrdis Knowledge Base
=Internal=


* [[Storage Concepts]]
* [[Linux#Subjects|Linux Subjects]]
* [[Linux Logical Volume Management]]
=Commands=
* [[blkid]]
* [[fdisk]]
* [[sfdisk]]
* [[parted]]
* [[lsblk]]
* [[blockdev]]


=Block=

=Block Devices=


Unlike a character device, a [[Storage_Concepts#Block_Device|block device]] provides random access to fixed-size blocks of data. On Linux, the block devices can be either "mapped", offering access to a logical volume in a volume group (/dev/mapper/VolGroup00-LogVol01), or "static", which is a traditional storage volume (/dev/sdba).


=Block Driver=


Also see: {{Internal|Linux NFS Concepts#NFS_Driver|NFS Driver}}
=Partition Table=
==Master Boot Record (MBR) Partition Table==
==GUID Partition Table (GPT)==
{{External|https://en.wikipedia.org/wiki/GUID_Partition_Table}}


=File System=


==File System Concepts==
 
{{External|https://web.archive.org/web/20150509080554/http://www.ibm.com/developerworks/linux/library/l-linux-filesystem/}}
 
{{External|https://linuxgazette.net/105/pitcher.html}}
 
===inode===
 
An ''inode'' is a file system data structure that describes a filesystem object such as a file or directory. It has a unique identifier. Each inode stores file attributes (creation, last modification and last access time, permissions, etc.) and the disk location of the filesystem object's data. The inode does not store the name of the file, nor the file data. The name of the file and its association with a specific inode are stored in a ''directory'', which is also a (special) file. As a file, the directory is just an array of filenames and their associated inode numbers. The inode number of a file can be displayed with the [[Ls#-i|-i]] argument of the ls command.
 
An inode is deleted when the last hard link to it is deleted and all processes that keep the file open release it. The inode carries a link count of how many filenames point to it, and the kernel also tracks how many times the file has been opened but not yet closed. Each count is decremented by 1 when one of those filenames is deleted or an open file is closed. When both reach zero, the inode and its associated data are deleted.
 
===Hard Link===
 
The ''hard link'' is the association between a specific inode and a filename. An inode can have multiple file names, and those names are attached to the inode as hard links:
 
[[Ln#Create_a_Hard_Link|ln]] <''target''> <''hard-link-name''>
 
Because all hard links point to the same inode, changing metadata on a hard link is reflected on all other hard links.
 
===Symbolic Link===
 
A ''soft link'' or ''symbolic link'' is a special file that carries a path to another file. The OS recognizes it as a path, and redirects opens, reads and writes so instead of accessing the data within the special file, they access the data in the file named by the data in the special file.
 
==File System Types==
 
===ext4===


An ext4 filesystem is created with [[mkfs.ext4]].

'''Resize'''


===XFS===

{{Internal|XFS|XFS}}

An XFS filesystem is created with [[mkfs.xfs]].

'''Journal Recovery''' is done in kernel space at mount time. An fsck.xfs command exists, but it does not perform any useful action. If the journal needs repairing, unmount and mount the filesystem.

'''Metadata Error Behavior'''. When an unrecoverable metadata error is encountered, the filesystem will be shut down.

'''Resize'''. The filesystem can be extended while online with xfs_growfs. It cannot be shrunk.

'''Speculative allocation'''. XFS uses speculative preallocation to allocate blocks past EOF as files are written. This avoids file fragmentation due to concurrent streaming writes on NFS servers. The preallocation temporarily increases the size of the file, but if the preallocated space is not used for five minutes, it is discarded. Because of this, fragmentation is rarely a significant issue on XFS filesystems.

Mounting an XFS filesystem in [[/etc/fstab]]:

/dev/vdb1                      /support-nfs-storage  xfs defaults 0 0

'''Quota per Directory'''
* https://scriptthe.net/2014/08/06/setting-up-a-hard-quota-with-a-directory-on-xfs/
* https://docs.oracle.com/cd/E37670_01/E37355/html/ol_quoset_xfs.html
* https://docs.oracle.com/cd/E37670_01/E37355/html/ol_prjquo_xfs.html
* https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/xfsquota

==Distributed File Systems==
{{Internal|Distributed File Systems|Distributed File Systems}}

=Filesystem Encryption=

'''Stacked filesystem encryption''' solutions are implemented as a layer that stacks on top of an existing filesystem. All files written into an encryption-enabled folder are encrypted on the fly before the underlying filesystem writes them to disk, and decrypted whenever the filesystem reads them from disk. The content of the files is encrypted, but otherwise the files still exist in that filesystem as they would without encryption, as normal files/symlinks/hardlinks/etc. The alternative is to use [[#Block_Device_Encryption|block device encryption]], which operates below the filesystem layer and makes sure that everything written to a certain block device (whole disk, partition, etc.) is encrypted. More details are available in the "[[#Block_Device_Encryption|Block Device Encryption]]" section below.

Filesystem encryption can be used for a remote backup system. This is a remote backup system security analysis:

{{Internal|Incremental Remote Backup System Security Analysis|Incremental Remote Backup System Security Analysis}}

=<span id='devicemapper'></span>Device Mapper=

Device Mapper is a foundational kernel-based framework for volume management. It has thin provisioning and snapshotting capabilities, and it is one of the Docker [[Docker device-mapper Storage Backend#Overview|storage backend options]]. {{Internal|Device Mapper Concepts|Device Mapper Concepts}}

==Logical Volume Management==

{{Internal|Linux Logical Volume Management Concepts|Logical Volume Management Concepts}}

==RAID==

{{Internal|RAID Concepts|RAID Concepts}}

==Block Device Encryption==

Block device encryption operates below the filesystem layer and makes sure that everything written to a certain block device (whole disk, partition, etc.) is encrypted. This means that while the block device is offline, the entire content looks like a large blob of random binary data, without any resemblance to a filesystem. The data can be accessed only by mounting the block device in a special way, at an arbitrary location. This solution is also known as "full disk encryption". One of the most common ways to implement block device encryption on Linux is to use <tt>dm-crypt</tt>, which is the standard [[Device Mapper Concepts#Encryption|device mapper]] encryption functionality. For more details, see: {{Internal|dm-crypt|dm-crypt}}

Also see "[[#Filesystem_Encryption|Filesystem Encryption]]", above.

==Device Mapper Operations==

{{Internal|Device Mapper Operations|Device Mapper Operations}}


=Loopback Device=

External resources:
* https://ops.tips/blog/lvm-on-loopback-devices/
=Subjects=
* [[Linux Logical Volume Management Concepts]]

Latest revision as of 21:27, 10 December 2019

Internal

Commands

Block

A block is a fixed-size chunk of data. The size is determined by the kernel, and depends on the system's architecture and the filesystem being used. The block size the kernel uses to access a specific device can be obtained with:

blockdev --getbsz /dev/sda2

Sector

A sector is a small block whose size is usually determined by the underlying hardware.

Block Devices

Unlike a character device, a block device provides random access to fixed-size blocks of data. On Linux, the block devices can be either "mapped", offering access to a logical volume in a volume group (/dev/mapper/VolGroup00-LogVol01), or "static", which is a traditional storage volume (/dev/sdba).

Block Driver

A block driver provides access to block devices: devices that transfer randomly accessible data in fixed-size blocks. These are primarily hard drives. Aside from providing user space processes with read and write access to block storage, block drivers act as a conduit between the core memory of the system and secondary storage, so they can be seen as part of the virtual memory subsystem.

Much of the design of the block layer is centered on performance, as the entire system cannot run well if the block I/O subsystem is not well tuned.

Also see:

NFS Driver

Partition Table

Master Boot Record (MBR) Partition Table

GUID Partition Table (GPT)

https://en.wikipedia.org/wiki/GUID_Partition_Table

File System

File System Concepts

https://web.archive.org/web/20150509080554/http://www.ibm.com/developerworks/linux/library/l-linux-filesystem/
https://linuxgazette.net/105/pitcher.html

inode

An inode is a file system data structure that describes a filesystem object such as a file or directory. It has a unique identifier. Each inode stores file attributes (creation, last modification and last access time, permissions, etc.) and the disk location of the filesystem object's data. The inode does not store the name of the file, nor the file data. The name of the file and its association with a specific inode are stored in a directory, which is also a (special) file. As a file, the directory is just an array of filenames and their associated inode numbers. The inode number of a file can be displayed with the -i argument of the ls command.

An inode is deleted when the last hard link to it is deleted and all processes that keep the file open release it. The inode carries a link count of how many filenames point to it, and the kernel also tracks how many times the file has been opened but not yet closed. Each count is decremented by 1 when one of those filenames is deleted or an open file is closed. When both reach zero, the inode and its associated data are deleted.
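
The inode number, the link count and the effect of an open descriptor can be observed from a shell. The following is a minimal sketch, assuming GNU coreutils (stat with the -c option); the scratch directory and the file name a.txt are made up for illustration:

```shell
#!/bin/sh
# Sketch: inspect the inode of a scratch file (assumes GNU coreutils stat).
set -eu
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/a.txt"

inode=$(stat -c '%i' "$workdir/a.txt")                 # inode number
links=$(stat -c '%h' "$workdir/a.txt")                 # number of filenames pointing to the inode
ls_inode=$(ls -i "$workdir/a.txt" | awk '{print $1}')  # the same number, as reported by ls -i
echo "inode=$inode links=$links ls_inode=$ls_inode"

# Keep the file open on descriptor 3, then remove its only name.
exec 3< "$workdir/a.txt"
rm "$workdir/a.txt"

# The name is gone, but the data is still reachable through the open descriptor.
still_readable=$(cat <&3)
exec 3<&-   # closing the last descriptor allows the inode and its data to be freed
echo "read after unlink: $still_readable"

rm -rf "$workdir"
```
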

Hard Link

The hard link is the association between a specific inode and a filename. An inode can have multiple file names, and those names are attached to the inode as hard links:

ln <target> <hard-link-name>

Because all hard links point to the same inode, changing metadata on a hard link is reflected on all other hard links.
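
A short sketch of the behavior described above, assuming GNU coreutils stat; the names original and alias are hypothetical. Both names resolve to the same inode, a permission change made through one name is visible through the other, and the data survives the removal of the first name:

```shell
#!/bin/sh
# Sketch: two hard links share one inode (assumes GNU coreutils stat).
set -eu
workdir=$(mktemp -d)
printf 'data\n' > "$workdir/original"

ln "$workdir/original" "$workdir/alias"           # ln <target> <hard-link-name>

inode_original=$(stat -c '%i' "$workdir/original")
inode_alias=$(stat -c '%i' "$workdir/alias")
link_count=$(stat -c '%h' "$workdir/original")    # two filenames now point to the inode

chmod 600 "$workdir/alias"                        # metadata change through one name ...
mode_original=$(stat -c '%a' "$workdir/original") # ... is visible through the other

rm "$workdir/original"                            # removes one name, not the inode
content_after_rm=$(cat "$workdir/alias")

echo "inodes: $inode_original $inode_alias, links: $link_count, mode: $mode_original"
rm -rf "$workdir"
```
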

Symbolic Link

A soft link or symbolic link is a special file that carries a path to another file. The OS recognizes it as a path, and redirects opens, reads and writes so instead of accessing the data within the special file, they access the data in the file named by the data in the special file.
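
By contrast with a hard link, a symbolic link has its own inode and only stores a path. The sketch below assumes GNU coreutils (stat, readlink); the names target and link are hypothetical. It also shows that the link is left dangling once the file it names is removed:

```shell
#!/bin/sh
# Sketch: a symbolic link stores a path and has its own inode (assumes GNU coreutils).
set -eu
workdir=$(mktemp -d)
printf 'payload\n' > "$workdir/target"

ln -s "$workdir/target" "$workdir/link"

stored_path=$(readlink "$workdir/link")          # the path carried by the special file
via_link=$(cat "$workdir/link")                  # reads are redirected to the target
inode_target=$(stat -c '%i' "$workdir/target")
inode_link=$(stat -c '%i' "$workdir/link")       # stat without -L describes the link itself

rm "$workdir/target"
# -e follows the link, so it fails once the named file no longer exists.
if [ -e "$workdir/link" ]; then dangling=no; else dangling=yes; fi

echo "link -> $stored_path, dangling after rm: $dangling"
expected_path="$workdir/target"
rm -rf "$workdir"
```
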

File System Types

ext4

An ext4 filesystem is created with mkfs.ext4.

Journal Recovery is done with e2fsck in userspace at boot time.

Metadata Error Behavior. When metadata errors are encountered, the behavior is configurable. The default is to continue.

Resize

XFS

XFS

Distributed File Systems

Distributed File Systems

Filesystem Encryption

Stacked filesystem encryption solutions are implemented as a layer that stacks on top of an existing filesystem. All files written into an encryption-enabled folder are encrypted on the fly before the underlying filesystem writes them to disk, and decrypted whenever the filesystem reads them from disk. The content of the files is encrypted, but otherwise the files still exist in that filesystem as they would without encryption, as normal files/symlinks/hardlinks/etc. The alternative is to use block device encryption, which operates below the filesystem layer and makes sure that everything written to a certain block device (whole disk, partition, etc.) is encrypted. More details are available in the "Block Device Encryption" section below.

Filesystem encryption can be used for a remote backup system. This is a remote backup system security analysis:

Incremental Remote Backup System Security Analysis

Device Mapper

Device Mapper is a foundational kernel-based framework for volume management:

Device Mapper Concepts

Logical Volume Management

Logical Volume Management Concepts

RAID

RAID Concepts

Block Device Encryption

Block device encryption operates below the filesystem layer and makes sure that everything written to a certain block device (whole disk, partition, etc.) is encrypted. This means that while the block device is offline, the entire content looks like a large blob of random binary data, without any resemblance to a filesystem. The data can be accessed only by mounting the block device in a special way, at an arbitrary location. This solution is also known as "full disk encryption". One of the most common ways to implement block device encryption on Linux is to use dm-crypt, which is the standard device mapper encryption functionality. For more details, see:

dm-crypt

Also see "Filesystem Encryption", above.

Device Mapper Operations

Device Mapper Operations

Loopback Device

A loopback device is an O/S mechanism that allows exposing a file as a block device. The loopback devices are usually named /dev/loop0, /dev/loop1, etc.

Loopback devices are managed with losetup, which can associate a loopback device with a file:

losetup /dev/loop0 ./lvm0.img

Once set up, the loopback devices should be reported as block devices by lsblk.

Loopback devices are used, among other things, to set up storage with Docker devicemapper driver.
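
The whole cycle can be sketched as below. This is an illustration under assumptions, not a definitive procedure: the 64M size and the scratch directory are arbitrary, the file name lvm0.img is reused from the example above, and attaching requires root and a free loop device, so the script only reports when that step is not possible. It relies on losetup --find --show, which picks the first unused loop device and prints its name, and on losetup -d to detach it:

```shell
#!/bin/sh
# Sketch: back a loop device with a sparse file (attach step needs root).
set -eu
workdir=$(mktemp -d)
image="$workdir/lvm0.img"

truncate -s 64M "$image"                 # sparse backing file, size chosen arbitrarily
image_bytes=$(stat -c '%s' "$image")
echo "backing file: $image ($image_bytes bytes)"

attached=no
if [ "$(id -u)" -eq 0 ] && loopdev=$(losetup --find --show "$image" 2>/dev/null); then
    attached=yes
    echo "attached as $loopdev"
    lsblk "$loopdev" || true             # should list the loop device as a block device
    losetup -d "$loopdev"                # detach when done
else
    echo "not attached (needs root and a free loop device)"
fi

rm -rf "$workdir"
```
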

External resources:

https://ops.tips/blog/lvm-on-loopback-devices/