SQL Querying Multiple Tables: Difference between revisions
Line 90: | Line 90: | ||
Charlie | Bluestone | Chicago | Charlie | Bluestone | Chicago | ||
</font> | </font> | ||
==ANSI Join Syntax== | ==ANSI Inner Join Syntax== | ||
The following syntax is equivalent. It does use the <code>JOIN</code> keywords, and it does not introduce the [[#Join_Condition|join condition]] with <code>ON</code>, but with <code>WHERE</code>: | The following syntax is equivalent. It does use the <code>JOIN</code> keywords, and it does not introduce the [[#Join_Condition|join condition]] with <code>ON</code>, but with <code>WHERE</code>: | ||
<font size=-1.5> | <font size=-1.5> |
Revision as of 01:15, 24 May 2024
External
Internal
Overview
More than one table can be used in a FROM
query clause, and when that happens, it is said that the query performs a join before filtering and returning the results.
Join
The mechanism of linking two or more tables is referred to as a join. For two tables, the most generic way of combining rows is to take every row from the first table and combine it with every row with the other table. This is known as the cartesian product of two relations, or a cross join. If the two tables have N and M rows respectively, the result will have N x M rows. In case of three tables, with N, M and P rows respectively, the result will contain N x M x P rows. See the Cross Join section below for an example of how that works.
In most cases, cartesian products are not very useful, they produce a lot of redundant data. We are usually interested in combining rows that are related to each other, and dropping everything else. This is done with an inner join. The inner join can be thought of as performing the cartesian product, then retaining only the rows that satisfy the join condition. Inner joins are the most commonly used type of join. This mental model is confirmed by results of an inner join without a join condition: the cartesian product. Other types of joins are the outer joins: left outer join and right outer join.
Examples
All examples provided in this article are based on three tables (person
, address
and country
), with the following schema:
person Column | Type id | name | company_id -----------+--------- ------+------------------+------------- id | integer # primary key 1 | Alice | 10 name | text 2 | Bob | 20 company_id | integer # foreign key that references company(id) 3 | Charlie | 30 company Column | Type id | name | city_id -----------+--------- ------+------------------+------------- id | integer # primary key 10 | Moonphone | 100 name | text 20 | Vortextime | 200 city_id | integer # foreign key that references city(id) 30 | Bluestone | 300 city Column | Type id | name -----------+--------- ------+------------------ id | integer # primary key 100 | San Francisco name | text 200 | New York 300 | Chicago
Inner Join
An inner join is performed by providing more than one table in the FROM
clause, and using a syntax that requires each table that participates to the join, except the first one, to be preceded by the JOIN
keyword and then followed by the ON
keyword:
SELECT ... FROM table_one JOIN table_two ON <join_condition> JOIN table_three ON <join_condition> ...
A JOIN
keyword without any qualifier signifies means an inner join. For clarity, INNER
can be (redundantly) specified:
SELECT ... FROM table_one INNER JOIN table_two ON <join_condition> INNER JOIN table_three ON <join_condition> ...
The two forms are equivalent. Also see ANSI Join Syntax below.
The complete syntax to inner join two tables is:
SELECT person.name AS name, company.name AS company
FROM person
JOIN company ON person.company_id = company.id
name | company ---------+------------ Alice | Moonphone Bob | Vortextime Charlie | Bluestone
The complete syntax to inner join three tables is:
SELECT person.name AS name, company.name AS company, city.name AS city
FROM person
JOIN company ON person.company_id = company.id
JOIN city ON company.city_id = city.id;
name | company | city ---------+------------+--------------- Alice | Moonphone | San Francisco Bob | Vortextime | New York Charlie | Bluestone | Chicago
ANSI Inner Join Syntax
The following syntax is equivalent. It does use the JOIN
keywords, and it does not introduce the join condition with ON
, but with WHERE
:
SELECT ... FROM table_one, table_two, ... WHERE <join_condition>
Join Condition
Tables are usually inner joined using foreign keys. Foreign keys encode relationships in data. The inner join condition states that we should associate the rows from the first table with rows from other table where the foreign key column value from the first table row is equal with the primary key column value from other table row, so a join condition based on foreign key usually looks like:
[...] ON table_one.foreign_key_from_table_two == table_two.primary_key
Any other expression that is evaluated to a boolean value can be used. The join condition will be evaluated and filter out rows from the full cartesian product.
Left Outer Join
Right Outer Join
Cross Join
A cross join produces the cartesian product of the tables participating in join.
For two tables:
SELECT * FROM person
CROSS JOIN company;
id | name | company_id | eye_color | id | name | city_id ----+---------+------------+-----------+----+------------+--------- 1 | Alice | 10 | blue | 10 | Moonphone | 100 2 | Bob | 20 | black | 10 | Moonphone | 100 3 | Charlie | 30 | black | 10 | Moonphone | 100 1 | Alice | 10 | blue | 20 | Vortextime | 200 2 | Bob | 20 | black | 20 | Vortextime | 200 3 | Charlie | 30 | black | 20 | Vortextime | 200 1 | Alice | 10 | blue | 30 | Bluestone | 300 2 | Bob | 20 | black | 30 | Bluestone | 300 3 | Charlie | 30 | black | 30 | Bluestone | 300 (9 rows)
This is the same as performing an inner join without a join condition.
SELECT * FROM person, company;
For three tables:
SELECT COUNT(*) FROM person
CROSS JOIN company
CROSS JOIN city;
count ------- 27
Using Table Aliases in Joins
The SQL syntax permits to alias the tables participating in join. These table aliases can be used to identify the tables when you reference columns in the SELECT
clause:
FROM table_one AS table_one_alias JOIN table_two AS table_two_alias ON <join_condition> JOIN table_three AS table_three_alias ON <join_condition> ...
For permanent, temporary and virtual tables aliasing is optional, their original names are usually sufficient to reference columns. However for derived tables, the alias is required if we want to refer to column from the derived table:
SELECT person.name, derived_table_alias.name
FROM person
JOIN (SELECT id, UPPER(name) AS name FROM company) AS derived_table_alias ON person.company_id = derived_table_alias.id;