Linux General Concepts
Internal
Kernel
Process
A process is an operating-system-level construct that holds all the resources an application maintains and uses at runtime. These resources include, but are not limited to, a memory address space, file handles, devices and threads. Each process contains at least one thread, and the initial thread for the process is called the main thread. When the main thread terminates, the process, and subsequently the application, terminates.
Process States
A Linux process can be in one of the following states.
Runnable (R)
The process is running or it is runnable, which means it sits in the run queue.
Uninterruptible Sleep (D)
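The process sits in the wait queue, waiting for an event, typically the completion of an I/O operation. While in this state, the process does not handle signals; they are delivered only after the event occurs.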
Interruptible Sleep (S)
The process enters an interruptible sleep state if its write call encounters a full kernel buffer, as described for TTY, or in similar situations. The process sits in the wait queue, waiting for some event or signal. When the process is in this state, ps -l displays a "wait channel" (WCHAN) column, which shows the kernel function in which the process is sleeping:
F S   UID   PID  PPID  C PRI  NI ADDR    SZ WCHAN  TTY      TIME     CMD
4 S     0     1     0  0  80   0 -    48361 ep_pol ?        00:00:02 systemd
1 S     0     2     0  0  80   0 -        0 kthrea ?        00:00:00 kthreadd
Stopped (T)
The process was stopped either by a job control signal or because it is being traced by a debugger.
Zombie (Z)
Process was terminated but not yet reaped by its parent.
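The zombie state can be observed directly. The following is a minimal sketch, assuming a Linux system with /proc mounted: it forks a child that exits immediately, reads the child's state letter from /proc/<pid>/stat, and then reaps it. The helper names process_state and observe_zombie are hypothetical, chosen for this illustration only.

```python
import os
import time


def process_state(pid):
    """Return the one-letter state code from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces, so split after the last ')' before reading the state field.
    return stat.rsplit(")", 1)[1].split()[0]


def observe_zombie(timeout=5.0):
    """Fork a child that exits at once; return (state_seen, reaped)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: terminate immediately

    # Parent: the child stays a zombie until we wait for it.
    deadline = time.monotonic() + timeout
    state = process_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = process_state(pid)

    os.waitpid(pid, 0)  # reap the child, which removes its process entry
    reaped = not os.path.exists(f"/proc/{pid}")
    return state, reaped


if __name__ == "__main__":
    print(observe_zombie())
```

The same process_state helper can be pointed at any PID to see the R, S, D, T or Z codes described above.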
Execution Context
An execution context represents the means for the kernel to allocate CPU cycles to a process. A process is said to be alive if it has an execution context. An alive process can perform actions, as opposed to a device driver. A device driver has some data fields and some methods, but the only way it can actually do something is when one of its methods gets called from the execution context of a process or from a kernel interrupt handler.
Foreground Process
A foreground process is a process whose stdin/stdout/stderr are connected to the controlling TTY device.
When a shell starts, its stdin/stdout/stderr are connected to the controlling TTY device, so the shell automatically becomes the foreground process.
The controlling TTY device of a shell is reported by:
tty
Once a command is executed in the shell, the process associated with that command gets, by default, its stdin/stdout/stderr connected to the controlling TTY device, and it becomes the foreground process.
Only one job (process group) can be in the foreground of a given terminal at any one time.
It is the responsibility of the TTY subsystem, specifically of the TTY driver, to direct the user input to the foreground process only.
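A process can check for itself which terminal it is attached to and whether it is currently in the foreground. The sketch below is illustrative only: controlling_tty and in_foreground are hypothetical helper names, and both return None when standard input is not a terminal (for example, when the script runs from cron or with input redirected).

```python
import os


def controlling_tty(fd=0):
    """Return the terminal device path connected to fd, or None if fd is not a TTY."""
    if not os.isatty(fd):
        return None
    try:
        return os.ttyname(fd)
    except OSError:
        return None


def in_foreground(fd=0):
    """Return True if our process group is the terminal's foreground group.

    Returns None when fd is not a terminal or is not our controlling terminal.
    """
    if not os.isatty(fd):
        return None
    try:
        # tcgetpgrp() reports the foreground process group of the terminal.
        return os.tcgetpgrp(fd) == os.getpgrp()
    except OSError:
        return None


if __name__ == "__main__":
    print("tty:", controlling_tty())
    print("foreground:", in_foreground())
```

Run from an interactive shell, the first line is expected to match the output of the tty command; started with a trailing &, the second line is expected to report False.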
Background Process
Process Group
A process group is the same thing as a job. For example, two processes that communicate through a pipe are part of the same process group (job), because every process in the pipeline should be manipulated (stopped, resumed, killed) simultaneously. They may have their stdin/stdout/stderr associated with the same TTY. By default, a fork places the newly created child process in the same process group as its parent.
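The inheritance of the process group across fork can be checked with a short sketch. The function name child_pgid_after_fork is hypothetical; the child reports its own process group ID back to the parent through a pipe.

```python
import os


def child_pgid_after_fork():
    """Fork a child and return the process group ID the child sees for itself."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report our process group ID to the parent and exit.
        os.close(read_end)
        os.write(write_end, str(os.getpgrp()).encode())
        os._exit(0)

    # Parent: read the child's answer, then reap the child.
    os.close(write_end)
    data = os.read(read_end, 64)
    os.close(read_end)
    os.waitpid(pid, 0)
    return int(data)


if __name__ == "__main__":
    print("parent pgid:", os.getpgrp())
    print("child pgid: ", child_pgid_after_fork())
```

Both lines should print the same number, because the child did not call setpgid or setsid to leave its parent's group.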
Process Group Leader
Job
A job is the same as a process group. Internal shell commands like bg, fg, and jobs can be used to manipulate existing jobs within a session. The shell, as part of its session leader duties, creates a new process group each time it launches a pipeline.
Pipeline
Every pipeline is a job, because every process in the pipeline should be manipulated (stopped, resumed, killed) simultaneously.
Session
The session ID for a process can be obtained with:
ps <pid> -o sess
Each session is managed by the session leader.
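The same information is available programmatically through the getsid and getpgid system calls. This is a minimal sketch; session_info is a hypothetical helper name, and the rule it uses (a session leader is the process whose PID equals the session ID) is the standard POSIX definition rather than something stated above.

```python
import os


def session_info():
    """Return the PID, process group ID and session ID of the calling process."""
    pid = os.getpid()
    sid = os.getsid(0)  # 0 means "the calling process"
    return {
        "pid": pid,
        "pgid": os.getpgid(0),
        "sid": sid,
        # A session leader is the process whose PID equals the session ID.
        "is_session_leader": pid == sid,
    }


if __name__ == "__main__":
    print(session_info())
```

For a script launched from an interactive shell, the reported sid is normally the PID of that shell.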
Session Leader
The session leader, usually a shell, cooperates tightly with the kernel using a protocol of signals and system calls. The session leader keeps track of its jobs using the SIGCHLD signal.
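The SIGCHLD-based bookkeeping can be sketched as follows. This is a simplified illustration of the mechanism, not how any particular shell implements it: run_child_and_track is a hypothetical name, the child's exit code 7 is arbitrary, and the handler must be installed from the main thread.

```python
import os
import signal
import time


def run_child_and_track(exit_code=7, timeout=5.0):
    """Fork one child and learn about its exit through SIGCHLD.

    Returns (child_pid, [(pid, exit_status), ...]) for the children reaped.
    """
    reaped = []

    def on_sigchld(signum, frame):
        # Several children may have exited; reap all of them without blocking.
        while True:
            try:
                pid, status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                break  # no children left
            if pid == 0:
                break  # children exist, but none has exited yet
            code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
            reaped.append((pid, code))

    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        child = os.fork()
        if child == 0:
            os._exit(exit_code)  # child: exit immediately
        deadline = time.monotonic() + timeout
        while not reaped and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        # Restore whatever handler was installed before.
        signal.signal(
            signal.SIGCHLD,
            previous if previous is not None else signal.SIG_DFL,
        )
    return child, reaped


if __name__ == "__main__":
    print(run_child_and_track())
```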
Threads
The operating system schedules the threads to run against physical (or virtual) processors.
Signals
TTY
Sockets
Unix domain sockets are bidirectional communication mechanisms that allow processes running within the same host operating system to exchange data. IP (network) sockets are bidirectional communication mechanisms that allow processes running on different hosts to exchange data over the network. Because of simplifying assumptions, Unix domain sockets are faster and lighter, so they should be preferred over network sockets when we are sure the processes are co-located. Unix domain and network sockets share the same API. Unix domain sockets bound to a filesystem path are subject to filesystem permissions. More details: https://en.wikipedia.org/wiki/Unix_domain_socket, https://en.wikipedia.org/wiki/Network_socket.
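The shared API can be seen in a short sketch. The roundtrip helper below (a hypothetical name) takes the address family as a parameter and uses only generic socket calls; unix_roundtrip runs it over a Unix domain socket bound to a path in a temporary directory. Passing socket.AF_INET with the address ("127.0.0.1", 0) is expected to work through the same helper, but that variant is not exercised here.

```python
import os
import socket
import stat
import tempfile


def roundtrip(family, address, payload=b"ping"):
    """Send payload to a server socket and return the upper-cased reply."""
    with socket.socket(family, socket.SOCK_STREAM) as server:
        server.bind(address)
        server.listen(1)
        with socket.socket(family, socket.SOCK_STREAM) as client:
            # The connection is queued in the backlog until accept() is called.
            client.connect(server.getsockname())
            conn, _ = server.accept()
            with conn:
                client.sendall(payload)
                received = conn.recv(len(payload))
                conn.sendall(received.upper())
                return client.recv(len(payload))


def unix_roundtrip():
    """Run the roundtrip over a Unix domain socket; return (reply, is_socket_file)."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "demo.sock")
        reply = roundtrip(socket.AF_UNIX, path)
        # The bound path is a real filesystem entry of type "socket",
        # which is why filesystem permissions apply to it.
        is_socket_file = stat.S_ISSOCK(os.stat(path).st_mode)
    return reply, is_socket_file


if __name__ == "__main__":
    print(unix_roundtrip())
```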
The /sys Filesystem
cgroups
Namespaces
File Descriptor
A file descriptor is a non-negative integer used internally by a process to identify and access a file or other input/output resource, such as a pipe or a network socket. File descriptors are part of the POSIX application programming interface. Each process should expect to have three standard POSIX file descriptors, corresponding to the three standard streams:
- standard input (stdin), whose file descriptor is 0.
- standard output (stdout), whose file descriptor is 1.
- standard error (stderr), whose file descriptor is 2.
The file descriptor is an index into a per-process file descriptor table maintained by the kernel, which in turn indexes into a system-wide table of files opened by all processes, called the file table. The file table records the mode in which the file or other resource was opened: for reading, writing, appending, etc. It also indexes into a third table, called the inode table, that describes the actual underlying files. To perform input or output, the process passes the file descriptor to the kernel through a system call, and the kernel accesses the file on behalf of the process. The process does not have direct access to the file or inode tables.
On Linux, the set of file descriptors open in a process can be accessed under the path /proc/<pid>/fd/.
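The relationship between the integer descriptor and the /proc/<pid>/fd/ directory can be sketched as follows, assuming a Linux system with /proc mounted. The function name fd_demo is hypothetical; the temporary file exists only for the duration of the call.

```python
import os
import tempfile


def fd_demo():
    """Open a temporary file and inspect the descriptor through /proc/self/fd.

    Returns (fd, open_fds, link_target, real_path).
    """
    with tempfile.NamedTemporaryFile() as tmp:
        real_path = os.path.realpath(tmp.name)
        # os.open returns the raw integer descriptor handed out by the kernel.
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            # Each entry in /proc/self/fd is named after an open descriptor...
            open_fds = {int(name) for name in os.listdir("/proc/self/fd")}
            # ...and is a symbolic link to the resource it refers to.
            link_target = os.readlink(f"/proc/self/fd/{fd}")
        finally:
            os.close(fd)
    return fd, open_fds, link_target, real_path


if __name__ == "__main__":
    fd, open_fds, link_target, _ = fd_demo()
    print("new descriptor:", fd)
    print("open descriptors:", sorted(open_fds))
    print("points to:", link_target)
```

In a process started from a terminal, descriptors 0, 1 and 2 normally appear in the listing alongside the newly opened one.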
File Handle
Memory-Mapped File
lsof tells if a file opened by a process is memory mapped, by displaying "mem" in the FD column.
Peripheral Component Interconnect (PCI)
USER_HZ
CPU statistics are expressed in ticks of 1/100th of a second, also called "user jiffies". There are USER_HZ "jiffies" per second, and on x86 systems, USER_HZ is 100. Historically, this mapped exactly to the number of scheduler "ticks" per second, but higher-frequency scheduling and tickless kernels have made this number irrelevant to scheduling.
Some documents express USER_HZ as 1/100th of a second.
For the actual value on the system, use sysconf(_SC_CLK_TCK) or execute:
getconf CLK_TCK
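Converting tick counts to seconds can be sketched as follows, assuming a Linux system with /proc mounted. The helper names are hypothetical; the field positions for utime and stime (fields 14 and 15 of /proc/<pid>/stat) come from the proc(5) manual page rather than from the text above.

```python
import os


def clock_ticks_per_second():
    """Return the USER_HZ value, the equivalent of `getconf CLK_TCK`."""
    return os.sysconf("SC_CLK_TCK")


def ticks_to_seconds(ticks, hz=None):
    """Convert a tick count to seconds using USER_HZ (or an explicit hz)."""
    if hz is None:
        hz = clock_ticks_per_second()
    return ticks / hz


def cpu_seconds(pid="self"):
    """Return (user_seconds, system_seconds) consumed by a process."""
    with open(f"/proc/{pid}/stat") as f:
        # Skip past the parenthesized command name, which may contain spaces.
        fields = f.read().rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state), so field 14 (utime) is fields[11]
    # and field 15 (stime) is fields[12].
    utime_ticks, stime_ticks = int(fields[11]), int(fields[12])
    return ticks_to_seconds(utime_ticks), ticks_to_seconds(stime_ticks)


if __name__ == "__main__":
    print("USER_HZ:", clock_ticks_per_second())
    print("CPU seconds (user, system):", cpu_seconds())
```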