Linux General Concepts

From NovaOrdis Knowledge Base

Internal

Kernel

Kernel Concepts

Process

A process is an operating system level construct that holds all the resources an application maintains and uses at runtime. These resources include, but are not limited to, a memory address space, file handles, devices and threads. Each process contains at least one thread, and the initial thread of the process is called the main thread. When the main thread terminates, the process, and subsequently the application, terminates.

Process States

A Linux process can be in one of the following states.

LInuxProcessStates.png

Runnable (R)

The process is running or it is runnable, which means it sits in the run queue.

Uninterruptible Sleep (D)

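The process sits in the wait queue, waiting for an event, typically the completion of an I/O operation. Unlike a process in interruptible sleep, it does not handle signals until that event occurs.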

Interruptible Sleep (S)

The process enters an interruptible sleep state if its write call encounters a full kernel buffer, as described for TTY, or in similar situations. The process sits in the wait queue, waiting for some event or signal. When the process is in this state, ps -l displays a "wait channel" (WCHAN) column, which shows the kernel function in which the process is sleeping:

F S   UID    PID   PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
4 S     0      1      0  0  80   0 - 48361 ep_pol ?        00:00:02 systemd
1 S     0      2      0  0  80   0 -     0 kthrea ?        00:00:00 kthreadd

Stopped (T)

The process was stopped, either by a job control signal or because it is being debugged.

Zombie (Z)

The process has terminated but has not yet been reaped by its parent.
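The state codes above can also be read programmatically. The following is a minimal sketch, assuming a Linux system with /proc mounted; process_state and zombie_demo are hypothetical helper names introduced only for illustration. It reads the one-letter state from /proc/<pid>/stat and briefly creates a zombie by letting a child exit without reaping it.

```python
import os


def process_state(pid):
    """Return the one-letter state code (R, S, D, T, Z, ...) from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' before reading the state.
    return data.rpartition(")")[2].split()[0]


def zombie_demo():
    """Fork a child that exits at once and return its state before it is reaped."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: terminate immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state_before_reap = process_state(pid)
    os.waitpid(pid, 0)  # reap the child so the zombie disappears
    return state_before_reap
```

Calling zombie_demo() from a script shows the state the child is in between its exit and the parent's waitpid() call.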

Execution Context

An execution context represents the means for the kernel to allocate CPU cycles to a process. A process is said to be alive if it has an execution context. An alive process can perform actions, as opposed to a device driver. A device driver has some data fields and some methods, but the only way it can actually do something is when one of its methods gets called from the execution context of a process or from a kernel interrupt handler.

Foreground Process

A foreground process is a process whose stdin/stdout/stderr are connected to a TTY device.

ForegroundProcess.png

When a shell starts, its stdin/stdout/stderr are connected to a TTY device, so the shell automatically becomes the foreground process for that device. The TTY device associated with the shell is reported by:

tty

Once a command is executed in the shell, the process associated with that command gets by default its stdin/stdout/stderr connected to the TTY device the shell was associated with, and it becomes the new foreground process.

There can be just one foreground process per TTY device at a time. However, from a practical perspective, a windowed graphical environment supports several terminal windows, each backed by its own TTY device, so several foreground processes can be active simultaneously.

It is the responsibility of the TTY device to direct the user input to the foreground process only.
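The relationship between a process and its TTY can be inspected from code as well. The sketch below is illustrative only: tty_status is a hypothetical helper name, and comparing the terminal's foreground process group with the caller's own group is one possible way to approximate "is this the foreground process", not necessarily the only one.

```python
import os


def tty_status(fd=0):
    """Describe the terminal behind fd, or return None if fd is not a TTY."""
    if not os.isatty(fd):
        return None
    try:
        # The TTY tracks a foreground process group; compare it with ours.
        foreground = os.tcgetpgrp(fd) == os.getpgrp()
    except OSError:
        # fd is a TTY, but not the controlling terminal of this process.
        foreground = None
    return {"device": os.ttyname(fd), "foreground": foreground}
```

Run from an interactive shell, tty_status() reports the same device name as the tty command. When stdin is a pipe or a file, it returns None.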

Background Process

A background process is a process whose stdin/stdout/stderr are detached from a TTY device. All processes running on a system, except at most one foreground process per TTY device, are background processes.

Process Group

A process group is the same thing as a job. For example, two processes that communicate through a pipe are part of the same process group (job), because every process in the pipeline should be manipulated (stopped, resumed, killed) simultaneously. They may have their stdin/stdout/stderr associated with the same TTY. By default, a fork places the newly created child process in the same process group as its parent.
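The default inheritance of the process group across fork can be checked with a short experiment. This is a minimal sketch, with child_process_group as a hypothetical helper name: the child reports its own process group ID to the parent through a pipe.

```python
import os


def child_process_group():
    """Fork a child and return (parent_pgid, child_pgid)."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report the process group it was placed in, then exit.
        os.close(read_end)
        os.write(write_end, str(os.getpgrp()).encode())
        os._exit(0)
    # Parent: read the child's report and reap the child.
    os.close(write_end)
    child_pgid = int(os.read(read_end, 64).decode())
    os.close(read_end)
    os.waitpid(pid, 0)
    return os.getpgrp(), child_pgid
```

A shell launching a pipeline would additionally call setpgid() (os.setpgid in Python) to move the new processes into a group of their own, which this sketch deliberately does not do.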

Process Group Leader

Job

A job is the same as a process group. Shell built-in commands such as bg, fg and jobs can be used to manipulate existing jobs within a session. The shell, as part of its session leader duties, creates a new process group each time it launches a pipeline.

Pipeline

Every pipeline is a job, because every process in the pipeline should be manipulated (stopped, resumed, killed) simultaneously.

Session

The session ID of a process can be obtained with:

ps <pid> -o sess

Each session is managed by the session leader.
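The same information is available through the getsid() system call. The following sketch assumes the conventional rule that a session leader is the process whose PID equals its session ID; session_info is a hypothetical helper name.

```python
import os


def session_info(pid=0):
    """Return session and process group IDs for pid (0 means the calling process)."""
    target = pid or os.getpid()
    sid = os.getsid(pid)
    return {
        "pid": target,
        "sid": sid,
        "pgid": os.getpgid(pid),
        # A session leader is the process whose PID equals the session ID.
        "is_session_leader": sid == target,
    }
```

For a command started from an interactive shell, the reported sid is typically the PID of that shell.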

Session Leader

The session leader is usually a shell, which cooperates tightly with the kernel using a protocol of signals and system calls. The session leader keeps track of its jobs using the SIGCHLD signal.

Threads

The operating system schedules the threads to run against physical (or virtual) processors.
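On Linux, each thread of a process is a separately scheduled kernel task, visible under /proc/<pid>/task. The sketch below is an illustration under that assumption (the helper names are hypothetical, and threading.get_native_id requires Python 3.8 or later): it starts one worker thread and lists the kernel thread IDs of the current process while the worker is alive.

```python
import os
import threading


def kernel_thread_ids():
    """Return the kernel thread IDs of the current process."""
    return sorted(int(name) for name in os.listdir("/proc/self/task"))


def run_worker_and_list_threads():
    """Start a worker thread and return (worker_tid, all_tids) while it runs."""
    started = threading.Event()
    release = threading.Event()
    info = {}

    def worker():
        info["tid"] = threading.get_native_id()  # kernel-level thread ID
        started.set()
        release.wait()  # stay alive until the listing has been taken

    thread = threading.Thread(target=worker)
    thread.start()
    started.wait()
    tids = kernel_thread_ids()
    release.set()
    thread.join()
    return info["tid"], tids
```

The thread ID of the main thread is equal to the process ID, so the PID always appears in the list.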

Signals

Linux Signals

TTY

TTY

Sockets

Unix domain sockets are bidirectional communication mechanisms that allow processes running within the same host operating system to exchange data. IP (network) sockets are bidirectional communication mechanisms that allow processes running on different hosts to exchange data over the network. Because of simplifying assumptions, Unix domain sockets are faster and lighter, so they should be preferred over network sockets when we are sure the processes are co-located. Unix domain sockets and network sockets share the same API. Unix domain sockets bound to a filesystem path are subject to filesystem permissions. More details: https://en.wikipedia.org/wiki/Unix_domain_socket, https://en.wikipedia.org/wiki/Network_socket.
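The bidirectional nature of a Unix domain socket can be shown with a connected socket pair inside a single process. This is a minimal sketch: unix_socket_round_trip is a hypothetical helper name, and real use would normally involve two separate processes and a socket bound to a filesystem path.

```python
import socket


def unix_socket_round_trip(message: bytes) -> bytes:
    """Send message one way over a Unix socket pair and an upper-cased reply back."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with left, right:
        left.sendall(message)             # left -> right
        received = right.recv(1024)
        right.sendall(received.upper())   # right -> left, same socket pair
        return left.recv(1024)
```

The calls used here (sendall, recv) are the same ones used with a network socket created with socket.AF_INET, which illustrates the shared API.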

The /sys Filesystem

cgroups

Linux cgroups

Namespaces

Linux Namespaces

File Descriptor

A file descriptor is a non-negative integer used internally by a process to identify and access a file or other input/output resource, such as a pipe or a network socket. File descriptors are part of the POSIX application programming interface. Each process should expect to have three standard POSIX file descriptors, corresponding to the three standard streams:

  • standard input (stdin), whose file descriptor is 0.
  • standard output (stdout), whose file descriptor is 1.
  • standard error (stderr), whose file descriptor is 2.

The file descriptor is an index into a per-process file descriptor table maintained by the kernel, which in turn indexes into a system-wide table of files opened by all processes, called the file table. The file table records the mode with which the file or other resource has been opened: for reading, writing, appending, etc. It also indexes into a third table, called the inode table, that describes the actual underlying files. To perform input or output, the process passes the file descriptor to the kernel through a system call, and the kernel accesses the file on behalf of the process. The process does not have direct access to the file or inode tables.

On Linux, the set of file descriptors open in a process can be accessed under the path /proc/<pid>/fd/.
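As an illustration, the entries under /proc/<pid>/fd/ are symbolic links named after the descriptor numbers, and each link points at the resource behind the descriptor. The following sketch assumes Linux with /proc mounted; open_fds is a hypothetical helper name.

```python
import os


def open_fds(pid="self"):
    """Map each open file descriptor of pid to the target of its /proc link."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink(),
            # for example the one listdir() itself used to read the directory.
            continue
    return targets
```

For a process started from a terminal, descriptors 0, 1 and 2 usually point at the TTY device, while a pipe shows up as a target of the form pipe:[inode].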

File Handle

Memory-Mapped File

lsof tells if a file opened by a process is memory mapped, by displaying "mem" in the FD column.
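File-backed mappings can also be observed without lsof. The sketch below rests on the assumption that the mappings lsof reports as "mem" are the file-backed entries that Linux lists in /proc/<pid>/maps; mapped_paths and map_temp_file are hypothetical helper names. It maps a temporary file into memory and checks that the file's path appears among the mappings of the current process.

```python
import mmap
import os
import tempfile


def mapped_paths(pid="self"):
    """Return the set of path names that appear in /proc/<pid>/maps."""
    paths = set()
    with open(f"/proc/{pid}/maps") as maps:
        for line in maps:
            # Columns: address, perms, offset, dev, inode, optional pathname.
            fields = line.split(None, 5)
            if len(fields) == 6:
                paths.add(fields[5].strip())
    return paths


def map_temp_file():
    """Memory-map a temporary file and report whether it shows up as mapped."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * mmap.PAGESIZE)
        tmp.flush()
        path = os.path.realpath(tmp.name)
        with mmap.mmap(tmp.fileno(), 0):  # length 0 maps the whole file
            return path in mapped_paths()
```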

Peripheral Component Interconnect (PCI)

https://en.wikipedia.org/wiki/Conventional_PCI

USER_HZ

CPU statistics are expressed in ticks of 1/100th of a second, also called "user jiffies". There are USER_HZ "jiffies" per second, and on x86 systems, USER_HZ is 100. Historically, this mapped exactly to the number of scheduler "ticks" per second, but higher-frequency scheduling and tickless kernels have made this number irrelevant.

Some documents express USER_HZ as 1/100th of a second.

For the actual value on the system, use sysconf(_SC_CLK_TCK) or execute:

getconf CLK_TCK
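The same value can be obtained from a program and used to convert tick counts into seconds. This is a minimal sketch, assuming Linux with /proc mounted; clock_ticks_per_second and cpu_seconds are hypothetical helper names, and the field positions follow the layout of /proc/<pid>/stat, in which utime and stime are fields 14 and 15.

```python
import os


def clock_ticks_per_second():
    """Return the number of clock ticks per second, as reported by sysconf."""
    return os.sysconf("SC_CLK_TCK")


def cpu_seconds(pid="self"):
    """Return (user_seconds, system_seconds) consumed by pid."""
    with open(f"/proc/{pid}/stat") as f:
        # Skip past "pid (comm)"; the remaining fields start at field 3 (state).
        fields = f.read().rpartition(")")[2].split()
    utime_ticks = int(fields[11])  # field 14: user-mode time in ticks
    stime_ticks = int(fields[12])  # field 15: kernel-mode time in ticks
    ticks = clock_ticks_per_second()
    return utime_ticks / ticks, stime_ticks / ticks
```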