PostgreSQL Concepts: Difference between revisions
No edit summary |
|||
(75 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
=External= | |||
* https://www.postgresql.org/docs/current/index.html | |||
* https://www.postgresql.org/docs/16/glossary.html | |||
=Internal= | =Internal= | ||
* [[PostgreSQL#Subjects|PostgreSQL]] | |||
* [[PostgreSQL# | =Cluster= | ||
A collection of databases and global SQL objects, and their common static and dynamic metadata. A clusters is also referred to as [[#Instance|instance]]. There is no relationship within the PostgreSQL cluster and the SQL term "CLUSTER". | |||
==Cluster Operations== | |||
* [[PostgreSQL_DDL_Operations#initdb|Create a database cluster]] | |||
=Instance= | |||
An PostgreSQL instance is a group of backend and auxiliary processes that communicate using a common shared memory area. There is one postmaster process that manages the instance. One instance manages exactly one [[#Cluster|database cluster]] with all its databases. Many instances can run on the same server machine as long as the TCP ports do not conflict. The instance handles all key features of a RDBMS: read and write access to files and shared memory, enforcement of [[ACID]] properties, connections to clients, privilege verification, crash recovery, replication, etc. | |||
==Session== | |||
A database client must establish a session to the PostgreSQL instance before being able to perform data operations with SQL commands. The session implies a lower-level [[#Connection|connection]] to the instance backend, over local or network socket. Session and connection are used interchangeably. | |||
= | ==<span id='Connection_Types'></span>Connection== | ||
A connection is an established line of communication between a client process and a backend process, usually over a network, supporting a [[#Session|session]]. Session and connection are used interchangeably. There are two types of connections, over a [[#Local_Socket_Connection|local socket]] and over [[#TCP|network]] using the TCP protocol. | |||
===<span id='Local_Socket'></span>Local Socket Connection=== | |||
This is the default type of connection, when the <code>psql</code> client and the PostgreSQL instance run on the same machine. | |||
===<span id='TCP'></span>TCP Network Connection=== | |||
The default port is 5432. | |||
==Identity and Permissions== | |||
<font color=darkkhaki> | |||
===Authentication=== | |||
====User==== | |||
Users are shared across [[#Database|databases]]. | |||
= | The "user" concept is equivalent with the "[[#Role|role]]" concept. They mean the same thing. | ||
{{Internal|PostgreSQL_DDL_Operations#Users|User Operations}} | |||
=====Master User===== | |||
An administrative user that exists when the RDBMS instance is created and that has privileges to create other database and other users. It is used to bootstrap the administration of the RDBMS instance, by defining all users, objects, and permissions in the databases of your DB instance. Master Username must start with a letter. The RDS documentation refers to it as "Master username". | |||
====Role==== | |||
The "role" concept is equivalent with the "[[#User|user]]" concept. They mean the same thing. | |||
=====Role Attributes===== | |||
A specific role may: | |||
* be a superuser | |||
* create another role | |||
* create a database | |||
</font> | |||
==Instance Operations== | |||
* Connecting to an instance means connecting to a specific database. See [[#Connecting_to_a_database|Connecting to a database]] below. | |||
=Tablespace= | =Tablespace= | ||
{{External|https://www.postgresql.org/docs/current/manage-ag-tablespaces.html}} | |||
A tablespace corresponds to a directory on disk. Tablespaces allow database administrators to define locations in the filesystem where the files representing database objects are stored. Once created, the table space can be referred to by name when creating database objects. Tablespaces give the opportunity to control the disk layout of a PostgreSQL installation. | |||
= | =Database= | ||
{{External|https://www.postgresql.org/docs/14/manage-ag-overview.html}} | |||
A database is a named collection of local SQL objects. A local SQL object is any object that can be created with an SQL CREATE command. | |||
== | A PostgresSQL usually comes with 4 pre-existing databases (postgres, admin, template0, template1). "postgres" is fit for general use and it should be used by default. | ||
==Database Name Case Sensitivity== | |||
<font color=darkkhaki> | |||
Verify this: | |||
It seems that the database name is case sensitive, even if a database is created with "CREATE DATABASE TEST_DB", the database name becomes "test_db", and this is what it should be used in the connect URL. "jdbc:postgres://localhost/test_db" will work, but "jdbc:postgres://localhost/TEST_DB" won't. | |||
</font> | |||
==Database Operations== | |||
* [[PostgreSQL_DDL_Operations#List_Databases|List available databases]] | |||
* [[PostgreSQL_DDL_Operations#Create_a_Database|Create a database]] | |||
* <span id='Connecting_to_a_database'></span>[[PostgreSQL_DDL_Operations#Connect_to_a_Database|Connecting to a database]] | |||
* [[PostgreSQL_DDL_Operations#Drop_a_Database|Drop a database]] | |||
{{ | =Schema= | ||
{{External|https://www.postgresql.org/docs/current/ddl-schemas.html}} | |||
A database contains one or more named '''schemas''', which in turn contain tables. Schemas also contain other kinds of named objects, including data types, functions, and operators. The same object name can be used in different schemas without conflict. Schemas are useful when multiple users want to use the database and don't want to interfere with each other. Schemas are also helpful to organize database objects into logical groups, to make them more manageable. Schemas are analogous to directories at the operating system level, except that schemas cannot be nested. | |||
Each database has by default a public schema, called "public". Other schemas can be created. | |||
Objects that exist within a schema are referred to by a qualified name consisting of the schema name and table name separated by a dot <code>someschema.sometable</code>. There is an even more general syntax <code>somedatabase.someschema.sometable</code>, but this is just for pro forma compliance with the SQL standard. If you write a database name, it must be the same as the database you are connected to. | |||
==Schema Operations== | |||
* | * [[PostgreSQL_DDL_Operations#List_Schemas|List schemas]] | ||
* | * [[PostgreSQL_DDL_Operations#Show_Current_Schema|Show current schema]] | ||
* | * [[PostgreSQL_DDL_Operations#Set_Default_Schema|Set default schema]] | ||
* [[PostgreSQL_DDL_Operations#Create_a_Schema|Create a schema]] | |||
* [[PostgreSQL_DDL_Operations#Drop_a_Schema|Drop a schema]] | |||
=Data Types= | =Data Types= | ||
{{External|https://www.postgresql.org/docs/16/datatype.html}} | |||
{{Internal|SQL_Data_Types|SQL Data Types}} | |||
For <code>CREATE TABLE</code> syntax see: {{Internal|PostgreSQL_DDL_Operations#Create_a_Table|Create a PostgreSQL Table}} | |||
==Identity== | ==Identity== | ||
Line 42: | Line 95: | ||
==Timestamp== | ==Timestamp== | ||
==Character Types== | |||
{{External|https://www.postgresql.org/docs/16/datatype-character.html}} | |||
===Variable Length with Limit=== | |||
A type designated by <code>character varying(n)</code> or <code>varchar(n)</code>. The forms are equivalent. This type can store strings up to n characters (not bytes) in length. An attempt to store a longer string into a column of these types will result in an error, unless the excess characters are all spaces, in which case the string will be truncated to the maximum length. If the string is shorter than n, <code>character varying</code> column will store the shorter string. | |||
===Fixed-Length with Blank Padding=== | |||
A type designated by <code>character(n)</code>, <code>char(n)</code> or <code>bpchar(n)</code>. It stores fixed-length, blank-padded strings of length n, which represents characters, not bytes. | |||
===Variable Unlimited Length=== | |||
The type is designated by <code>text</code>. This type stores strings of any length. It is not in the SQL standard. <code>text</code> is PostgreSQL's native string data type. | |||
==Numeric Types== | ==Numeric Types== | ||
Line 72: | Line 136: | ||
<FONT COLOR='darkgray'>TODO</font> | <FONT COLOR='darkgray'>TODO</font> | ||
==Arrays== | |||
{{External|https://www.postgresql.org/docs/current/arrays.html}} | |||
{{Internal|SQL_ARRAY#Overview|SQL ARRAYs}} | |||
PostgreSQL allows columns of a table to be defined as variable-length multidimensional arrays of any built-in or user-defined base type, enum type, range type or domain. An array data type is named by appending square brackets <code>[]</code> to the data type name of the array elements: | |||
<syntaxhighlight lang='sql'> | |||
CREATE TABLE some_table ( | |||
acl text[] | |||
... | |||
) | |||
</syntaxhighlight> | |||
To specify a default value: | |||
<syntaxhighlight lang='sql'> | |||
CREATE TABLE some_table ( | |||
acl text[] default '{}'::text[] | |||
... | |||
) | |||
</syntaxhighlight> | |||
The syntax allows the exact size of arrays to be specified (current implementation ignores it): | |||
<syntaxhighlight lang='sql'> | |||
CREATE TABLE some_table ( | |||
acl text[10] | |||
... | |||
) | |||
</syntaxhighlight> | |||
To insert data in such a table: | |||
<syntaxhighlight lang='sql'> | |||
INSERT INTO some_table (acl, ...) VALUES (ARRAY['acl1', 'acl2', ...], ...) | |||
</syntaxhighlight> | |||
===Searching in Arrays=== | |||
{{External|https://www.postgresql.org/docs/current/arrays.html#ARRAYS-SEARCHING}} | |||
{{Internal|SQL_ARRAY#Searching_in_Arrays|SQL ARRAYs | Searching in Arrays}} | |||
===Array Functions and Operators=== | |||
{{External|https://www.postgresql.org/docs/current/functions-array.html}} | |||
=Parameterized SQL Statements= | |||
{{Internal|SQL#Parameterized_SQL_Statements|Parameterized SQL Statements}} | |||
PostgreSQL allows declaring parameterized SQL statements, which can be thought as named functions with parameters. The parameterized statement can be any DML statement, including queries. | |||
The parameterized statement is declared once with the <code>PREPARE</code> statement: | |||
<syntaxhighlight lang='sql'> | |||
PREPARE some_parameterized_dml_statement (parameter_1_type, parameter_2_type, ...) AS | |||
<some_sql_statement_with_placeholders>; | |||
</syntaxhighlight> | |||
The SQL statement contains <code>$1</code>, <code>$2</code>, ... placeholders where <parameter_1_type> corresponds to <code>$1</code>, <parameter_2_type> corresponds to <code>$2</code>, etc. See examples below. | |||
The named parameterized statement declared as such can then be executed multiple times with <code>EXECUTE</code> that is passed different arguments: | |||
<syntaxhighlight lang='sql'> | |||
EXECUTE some_parameterized_dml_statement(arg_1, arg_2, ...); | |||
</syntaxhighlight> | |||
==Placeholder Format== | |||
<font size=-1.5> | |||
$<number> | |||
</font> | |||
==Parameterized <tt>INSERT</tt>== | |||
<syntaxhighlight lang='sql'> | |||
PREPARE parameterized_insert (integer, text, text) AS | |||
INSERT INTO person (id, name, city) VALUES ($1, $2, $3); | |||
</syntaxhighlight> | |||
<syntaxhighlight lang='sql'> | |||
EXECUTE parameterized_insert(1, 'Alice', 'Albuquerque'); | |||
</syntaxhighlight> | |||
==Parameterized Query== | |||
<syntaxhighlight lang='sql'> | |||
PREPARE parameterized_query(integer, text) AS | |||
SELECT * FROM person WHERE id=$1 OR name=$2; | |||
</syntaxhighlight> | |||
<syntaxhighlight lang='sql'> | |||
EXECUTE parameterized_query(1, 'Bob'); | |||
</syntaxhighlight> | |||
=Constraints= | |||
==Not Null== | |||
{{External|https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-NOT-NULL}} | |||
=Environment Variables= | |||
==<tt>PGDATA</tt>== | |||
Set it after installation, see: {{Internal|PostgreSQL_Installation_on_Mac#Configure_Environment|PostgreSQL Installation on Mac}} |
Latest revision as of 18:19, 31 May 2024
External
Internal
Cluster
A collection of databases and global SQL objects, and their common static and dynamic metadata. A clusters is also referred to as instance. There is no relationship within the PostgreSQL cluster and the SQL term "CLUSTER".
Cluster Operations
Instance
An PostgreSQL instance is a group of backend and auxiliary processes that communicate using a common shared memory area. There is one postmaster process that manages the instance. One instance manages exactly one database cluster with all its databases. Many instances can run on the same server machine as long as the TCP ports do not conflict. The instance handles all key features of a RDBMS: read and write access to files and shared memory, enforcement of ACID properties, connections to clients, privilege verification, crash recovery, replication, etc.
Session
A database client must establish a session to the PostgreSQL instance before being able to perform data operations with SQL commands. The session implies a lower-level connection to the instance backend, over local or network socket. Session and connection are used interchangeably.
Connection
A connection is an established line of communication between a client process and a backend process, usually over a network, supporting a session. Session and connection are used interchangeably. There are two types of connections, over a local socket and over network using the TCP protocol.
Local Socket Connection
This is the default type of connection, when the psql
client and the PostgreSQL instance run on the same machine.
TCP Network Connection
The default port is 5432.
Identity and Permissions
Authentication
User
Users are shared across databases.
The "user" concept is equivalent with the "role" concept. They mean the same thing.
Master User
An administrative user that exists when the RDBMS instance is created and that has privileges to create other database and other users. It is used to bootstrap the administration of the RDBMS instance, by defining all users, objects, and permissions in the databases of your DB instance. Master Username must start with a letter. The RDS documentation refers to it as "Master username".
Role
The "role" concept is equivalent with the "user" concept. They mean the same thing.
Role Attributes
A specific role may:
- be a superuser
- create another role
- create a database
Instance Operations
- Connecting to an instance means connecting to a specific database. See Connecting to a database below.
Tablespace
A tablespace corresponds to a directory on disk. Tablespaces allow database administrators to define locations in the filesystem where the files representing database objects are stored. Once created, the table space can be referred to by name when creating database objects. Tablespaces give the opportunity to control the disk layout of a PostgreSQL installation.
Database
A database is a named collection of local SQL objects. A local SQL object is any object that can be created with an SQL CREATE command.
A PostgresSQL usually comes with 4 pre-existing databases (postgres, admin, template0, template1). "postgres" is fit for general use and it should be used by default.
Database Name Case Sensitivity
Verify this:
It seems that the database name is case sensitive, even if a database is created with "CREATE DATABASE TEST_DB", the database name becomes "test_db", and this is what it should be used in the connect URL. "jdbc:postgres://localhost/test_db" will work, but "jdbc:postgres://localhost/TEST_DB" won't.
Database Operations
Schema
A database contains one or more named schemas, which in turn contain tables. Schemas also contain other kinds of named objects, including data types, functions, and operators. The same object name can be used in different schemas without conflict. Schemas are useful when multiple users want to use the database and don't want to interfere with each other. Schemas are also helpful to organize database objects into logical groups, to make them more manageable. Schemas are analogous to directories at the operating system level, except that schemas cannot be nested.
Each database has by default a public schema, called "public". Other schemas can be created.
Objects that exist within a schema are referred to by a qualified name consisting of the schema name and table name separated by a dot someschema.sometable
. There is an even more general syntax somedatabase.someschema.sometable
, but this is just for pro forma compliance with the SQL standard. If you write a database name, it must be the same as the database you are connected to.
Schema Operations
Data Types
For CREATE TABLE
syntax see:
Identity
- http://www.postgresqltutorial.com/postgresql-identity-column/
- http://www.postgresqltutorial.com/postgresql-serial/
Timestamp
Character Types
Variable Length with Limit
A type designated by character varying(n)
or varchar(n)
. The forms are equivalent. This type can store strings up to n characters (not bytes) in length. An attempt to store a longer string into a column of these types will result in an error, unless the excess characters are all spaces, in which case the string will be truncated to the maximum length. If the string is shorter than n, character varying
column will store the shorter string.
Fixed-Length with Blank Padding
A type designated by character(n)
, char(n)
or bpchar(n)
. It stores fixed-length, blank-padded strings of length n, which represents characters, not bytes.
Variable Unlimited Length
The type is designated by text
. This type stores strings of any length. It is not in the SQL standard. text
is PostgreSQL's native string data type.
Numeric Types
SMALLINT
2 byte integer.
INT
4 byte integer. This is a typical choice for integers.
BIGINT
8 byte integer.
DECIMAL
NUMERIC
REAL
DOUBLE PRECISION
SERIAL
BIGSERIAL
TODO
Arrays
PostgreSQL allows columns of a table to be defined as variable-length multidimensional arrays of any built-in or user-defined base type, enum type, range type or domain. An array data type is named by appending square brackets []
to the data type name of the array elements:
CREATE TABLE some_table (
acl text[]
...
)
To specify a default value:
CREATE TABLE some_table (
acl text[] default '{}'::text[]
...
)
The syntax allows the exact size of arrays to be specified (current implementation ignores it):
CREATE TABLE some_table (
acl text[10]
...
)
To insert data in such a table:
INSERT INTO some_table (acl, ...) VALUES (ARRAY['acl1', 'acl2', ...], ...)
Searching in Arrays
Array Functions and Operators
Parameterized SQL Statements
PostgreSQL allows declaring parameterized SQL statements, which can be thought as named functions with parameters. The parameterized statement can be any DML statement, including queries.
The parameterized statement is declared once with the PREPARE
statement:
PREPARE some_parameterized_dml_statement (parameter_1_type, parameter_2_type, ...) AS
<some_sql_statement_with_placeholders>;
The SQL statement contains $1
, $2
, ... placeholders where <parameter_1_type> corresponds to $1
, <parameter_2_type> corresponds to $2
, etc. See examples below.
The named parameterized statement declared as such can then be executed multiple times with EXECUTE
that is passed different arguments:
EXECUTE some_parameterized_dml_statement(arg_1, arg_2, ...);
Placeholder Format
$<number>
Parameterized INSERT
PREPARE parameterized_insert (integer, text, text) AS
INSERT INTO person (id, name, city) VALUES ($1, $2, $3);
EXECUTE parameterized_insert(1, 'Alice', 'Albuquerque');
Parameterized Query
PREPARE parameterized_query(integer, text) AS
SELECT * FROM person WHERE id=$1 OR name=$2;
EXECUTE parameterized_query(1, 'Bob');
Constraints
Not Null
Environment Variables
PGDATA
Set it after installation, see: