Relational Databases

From NovaOrdis Knowledge Base
Jump to navigation Jump to search

Internal

Overview

A relational database is also called an RDBMS (Relational Database Management System) or SQL database. Relational database represent data as "relations" (tables), where data is organized in rows and columns. The relational model was introduced in 1970 by E.F. Codd in the "A Relational Model of Data for Large Shared Data Banks" paper, and since then, the relational databases have been used across the software development industry for nearly all types of problems.

Data is stored according to an apriori defined schema. Related data from two more more different tables can be extracted with a join. Users can run SQL queries to retrieve data from one or multiple tables.

Relational Databases

Table (Relation)

In a relational model, data is stored in a set of tables, also known as relations. Within a relation, data is organized in rows and columns. A row of data is also knowns as record. Rather than using pointers to navigate between related entities, redundant data (foreign keys) is used to link records in different tables.

The table data is either held in memory (non-persistent) or permanent storage (persistent).

The SQL syntax has evolved around tables, which provide a central element of the syntax.

Record (Row)

Within a relation, data is organized in rows and columns. A row is a set of columns that together completely describe an entity. A row of data is also knowns as record.

Column

A column is an individual piece of data stored in a table.

Entity

The data from a table represents different instances of a data model entity.

Primary Key

Each table in a relational database includes information that uniquely identifies each row in that table. This type of data is known as the primary key of the table. A primary key consisting of two or more columns is known as a compound primary key.

A compound primary key, such as first name, last name, address and date of birth (for example) is referred to as a natural primary key. However, a uniquely generated ID would be referred to as a surrogate key.

Primary key values should never be allowed to change once a value has been assigned.

This is an example of declaring a SQL PRIMARY KEY constraint.

Foreign Key

A foreign key is a copy of a primary key value from another table, serving as a link that relates the this table's record containing the foreign key to the record identified by the primary key, in the other table. The process of linking these two records is referred to as a join. Foreign keys encode relationships in data.

This is an example of declaring a SQL FOREIGN KEY constraint.

Relationship Multiplicity (Cardinality)

The cardinality specifies how many instances of data can participate in a relationship. There are three different types of cardinality:

One-to-One (1:1) Relationships

In a one-to-one relationship, each constituent can have at most one relationship with the other constituent. 1:1 relationships are typically set up by a foreign key relationship in the database. The foreign key column can be set in either table. In practice, 1:1 relationships will be combined into a single table, rather than having a relationship between two separate data objects, so 1:1 relationships are not very common.

Multiplicity One to One.png

One-to-Many (1:N) Relationships

1:N relationships are also typically set up by foreign key relationships in the database. In the example above, a one-to-many relationship is "a manager has many employees who report to her". This relationship is implemented in the database by having each employee row contain the manager's primary key, as a foreign key. Thus, the employees are pointing back to their manager. This may seem backwards if we want to get from the manager to the employees. It works, however, because the database doesn't care, it is a flat structure without a sense of direction. You can construct queries that get from the manager to employees.

Multiplicity One to Many.png

Many-to-Many (M:N) Relationships

M:N relationships are typically set up by an association (or link) table in the database. An association table contains foreign keys to two other tables.

Multiplicity Many to Many.png

Directionality

Specifies the direction in which the relationship can be navigated.

Bidirectional

Unidirectional

Normalization

The process of refining a database schema to ensure that each independent piece of information is stored in only one place, except for foreign keys is known as normalization.

TODO

Object IDs

When persisting objects in a relational database, it is generally a good idea to have one field in the object that uniquely identifies the object.