Set Operators
Set operators in SQL are like joins but different. That’s because set operations like the INTERSECT
, MINUS
, and UNION
operators join the results of two queries based on a match, exclusion, or duplicate elimination respectively. A fourth set operator returns the unique rows plus two copies of any duplicate rows, and it is the UNION ALL
operator. More or less they’re qualified as follows:
- INTERSECT
- An
INTERSECT
operates on two queries that return select lists that share the exact position, number of columns, and data types. AnINTERSECT
operator returns the rows from both queries that share the same values in all of the select list columns. - MINUS
- A
MINUS
operates like theINTERSECT
operator in that both queries must share select lists that match by position, number of columns, and data types. The difference is that the any rows from the first query found in the second query are removed from the result set. - UNION
- A
UNION
acts like a union operator in set mathematics. It returns the unique set of rows found in the collection of both queries. It calculates the distinct set of rows by performing a sort operation that eliminates duplicates. Like the other set operators, theUNION
requires that both queries return the same, position, number of columns, and data types. - UNION ALL
- A
UNION ALL
takes all rows returned by one query and adds the rows returned by another query. It does not filter out duplicate rows. It is most useful when both queries return unique sets because it avoids a sorting operation. Like the other set operators, theUNION
requires that both queries return the same, position, number of columns, and data types.
The following scrollable list exams the four set operator. They’re defined in the ANSI SQL:89 definition and work with either the Oracle Proprietary outer join or the ANSI SQL:92 style outer joins.
SQL Set Operations ↓
Rule(s):
- The top query select list must mirror the bottom query.
- The top query column names must match the bottom query column names.
- The top query must have the same number of columns as the bottom query.
- The top query must match the positional ordering of the bottom query.
INTERSECT
The INTERSECT
operator matches every row in one table against every row in another based on the select list of shared columns. It is a filtered join. In its most basic approach an INTERSECT
operator acts like a nested loop where you match keys from the outer loop against keys from the inner loop.
Another way to describe an INTERSECT
operation is as a matching of two rows of data where they share the same organization, viz-a-viz a common column name, positional order, number of columns, and data type. The formula uses a ∀ to indicate given any sets A and B, then A intersects (∩) B. The intersection works by matching a row of data in one query with a row of data in another query based the entire select list of the query.
The following syntax lets you create an intersection two queries. The first query returns everything in set A that matches set B, and eliminates the symmetric difference, or the relative complements found in both the LEFT JOIN
and RIGHT JOIN
. It uses ANSI SQL:92 syntax.
1 2 3 4 5 6 7 8 9 10 | SELECT ordinal_english_id , ordinal_english_text , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id INTERSECT SELECT ordinal_english_id , ordinal_english_text , ordinal_french_text FROM ordinal_english RIGHT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id; |
SELECT ordinal_english_id , ordinal_english_text , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id INTERSECT SELECT ordinal_english_id , ordinal_english_text , ordinal_french_text FROM ordinal_english RIGHT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id;
This returns only rows with only the two selected columns, where they are the result of a join between the two queries. The only join between a LEFT JOIN
and a RIGHT JOIN
is the INNER JOIN
, which is synonymous with the INTERSECT
between the two queries.
ORDINAL_ENGLISH_ID ORDINAL_ENGLISH_TEXT ORDINAL_FRENCH_TEXT ------------------ -------------------------- -------------------------- 1 One Un 3 Three Trois 4 Four Quatre 5 Five Cinq 6 Six Six |
ORDINAL_ENGLISH_ID ORDINAL_ENGLISH_TEXT ORDINAL_FRENCH_TEXT ------------------ -------------------------- -------------------------- 1 One Un 3 Three Trois 4 Four Quatre 5 Five Cinq 6 Six Six
You can also use the older Oracle Proprietary syntax for an outer join to mimic this behavior.
MINUS
The MINUS
operator finds matches in the second query that it removes from the first query. The only values returned by a MINUS
operator are the non-matches from the first query.
Another way to describe a MINUS
set operator is to draw a correlation to a LEFT JOIN
because it acts like a TOP JOIN
where the topmost query replaces the left table in a join. The formula uses a ∀ to indicate given any sets A and B, then A intersects (∩) B plus all rows in A not found in B. The Venn diagram and set notation depicts that definition.
The following syntax lets you subtract set B from set A with a MINUS operator. The first query returns everything in set A that isn’t found in set B. It acts like an anti-join where you take the LEFT JOIN
minus INNER JOIN
to find the relative complement of the table on the right. Moreover, it returns the rows from the top query not found in the bottom query, which is likewise the relative complement of the bottom set.
You can format the output with the following SQL*Plus command in an Oracle database but it doesn’t work in the others.
1 2 | COLUMN ORDINAL_ENGLISH_ID FORMAT 99 HEADING "ORDINAL|ENGLISH|ID #" COLUMN ORDINAL_FRENCH_ID FORMAT 99 HEADING "ORDINAL|FRENCH|ID #" |
COLUMN ORDINAL_ENGLISH_ID FORMAT 99 HEADING "ORDINAL|ENGLISH|ID #" COLUMN ORDINAL_FRENCH_ID FORMAT 99 HEADING "ORDINAL|FRENCH|ID #"
1 2 3 4 5 6 7 8 9 10 11 12 | SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id MINUS SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english INNER JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id; |
SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id MINUS SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english INNER JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id;
This returns the right relative complement because the INNER JOIN
component (the bottom query) is removed from the LEFT JOIN
query. It returns the following data set:
ORDINAL ORDINAL ENGLISH FRENCH ID # ORDINAL_ENGLISH_TEXT ID # ORDINAL_FRENCH_TEXT ------- -------------------------- ------- -------------------------- 2 Two |
ORDINAL ORDINAL ENGLISH FRENCH ID # ORDINAL_ENGLISH_TEXT ID # ORDINAL_FRENCH_TEXT ------- -------------------------- ------- -------------------------- 2 Two
You can also use the older Oracle Proprietary syntax for an outer join to mimic this behavior. The MINUS
operator returns only rows from the top query not found in the bottom query, like:
SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id MINUS SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english RIGHT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id; |
SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id MINUS SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english RIGHT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id;
Some think this would return the symmetric difference of the sets because the LEFT JOIN
and RIGHT JOIN
only share the intersection. They reason that the symmetric difference is the combination of the left and right relative components.
They’re wrong. The top most query contains the intersection of the two sets and the right relative complement. When you subtract the two queries, the only rows they share are the intersection. Therefore, the topmost query subtracts the intersection from the LEFT JOIN
(or intersection). That leaves only the right relative complement because you don’t add rows from the bottom query that aren’t found in the topmost query with a MINUS
set operator.
It returns the same data set as the preceding example, or the following data set:
ORDINAL ORDINAL ENGLISH FRENCH ID # ORDINAL_ENGLISH_TEXT ID # ORDINAL_FRENCH_TEXT ------- -------------------------- ------- -------------------------- 2 Two |
ORDINAL ORDINAL ENGLISH FRENCH ID # ORDINAL_ENGLISH_TEXT ID # ORDINAL_FRENCH_TEXT ------- -------------------------- ------- -------------------------- 2 Two
UNION
The UNION
operator matches every row in one table against every row in another based on the select list of shared columns. It is a filtered join that returns the intersection plus the symmetric difference of the top and bottom queries. In its most basic approach an UNION
operator acts like a nested loop where you match keys from the outer loop against keys from the inner loop and add non-matches as they occur.
Another way to describe an UNION
operation is as a matching and reduction of two rows of data into one row of data where they share the same organization, viz-a-viz a common column name, positional order, number of columns, and data type. The formula uses a ∀ to indicate given any sets A and B, then A intersects (∩) B, plus everything in A not found in B and everything in B not found in A. The intersection works by sorting matched rows of data and eliminating any duplicate copies.
The following syntax lets you create a UNION
of two queries. The query returns every row in set A that doesn’t match any row in set B, every row in set B that doesn’t match any row in set A, and one copy of all rows that match between set A and B. It uses ANSI SQL:92 syntax.
You can format the output with the following SQL*Plus command in an Oracle database but it doesn’t work in the others.
1 2 | COLUMN ORDINAL_ENGLISH_ID FORMAT 99 HEADING "ORDINAL|ENGLISH|ID #" COLUMN ORDINAL_FRENCH_ID FORMAT 99 HEADING "ORDINAL|FRENCH|ID #" |
COLUMN ORDINAL_ENGLISH_ID FORMAT 99 HEADING "ORDINAL|ENGLISH|ID #" COLUMN ORDINAL_FRENCH_ID FORMAT 99 HEADING "ORDINAL|FRENCH|ID #"
The query works in Oracle, MySQL, or SQL Server but the output only formats in Oracle.
1 2 3 4 5 6 7 8 9 10 11 12 | SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id UNION SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english RIGHT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id; |
SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id UNION SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english RIGHT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id;
This returns one copy of all rows from set A and set B. It is effected with the overhead of a sort operation and should only be used when you can’t guarantee the unique between the two query return sets.
ORDINAL ORDINAL ENGLISH FRENCH ID # ORDINAL_ENGLISH_TEXT ID # ORDINAL_FRENCH_TEXT ------- -------------------------- ------- -------------------------- 1 One 1 Un 2 Two 3 Three 3 Trois 4 Four 4 Quatre 5 Five 5 Cinq 6 Six 6 Six 7 Sept 8 Huit |
ORDINAL ORDINAL ENGLISH FRENCH ID # ORDINAL_ENGLISH_TEXT ID # ORDINAL_FRENCH_TEXT ------- -------------------------- ------- -------------------------- 1 One 1 Un 2 Two 3 Three 3 Trois 4 Four 4 Quatre 5 Five 5 Cinq 6 Six 6 Six 7 Sept 8 Huit
You can also use the older Oracle Proprietary syntax for an outer join to mimic this behavior.
UNION ALL
The UNION ALL
operator matches every row in one table against every row in another based on the select list of shared columns. It is an unfiltered join that returns two copies of the intersection plus the symmetric difference of the top and bottom queries. The UNION ALL
set operator should only be used when the set of rows returned from the top query is distinct from the set of rows returned from the bottom query.
Another way to describe an UNION
operation is as a listing of all rows returned by each query without any attempt to match and reduce to a unique set. As with other set operators, they must have a common column name, positional order, number of columns, and data type. The formula uses a ∀ to indicate given any sets A and B, then 2 times the quantity of A intersect (∩) B, plus everything in A not found in B and everything in B not found in A. Two copies are returned in this case because both a LEFT JOIN
and a RIGHT JOIN
return a copy of the INNER JOIN
.
The following syntax lets you create a UNION ALL
of two queries. The query returns every row returned by both queries. It uses ANSI SQL:92 syntax.
You can format the output with the following SQL*Plus command in an Oracle database but it doesn’t work in the others.
1 2 | COLUMN ORDINAL_ENGLISH_ID FORMAT 99 HEADING "ORDINAL|ENGLISH|ID #" COLUMN ORDINAL_FRENCH_ID FORMAT 99 HEADING "ORDINAL|FRENCH|ID #" |
COLUMN ORDINAL_ENGLISH_ID FORMAT 99 HEADING "ORDINAL|ENGLISH|ID #" COLUMN ORDINAL_FRENCH_ID FORMAT 99 HEADING "ORDINAL|FRENCH|ID #"
The query works in Oracle, MySQL, or SQL Server but the output only formats in Oracle.
1 2 3 4 5 6 7 8 9 10 11 12 | SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id UNION ALL SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english RIGHT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id; |
SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english LEFT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id UNION ALL SELECT ordinal_english_id , ordinal_english_text , ordinal_french_id , ordinal_french_text FROM ordinal_english RIGHT JOIN ordinal_french ON ordinal_english_id = ordinal_french_id ORDER BY ordinal_english_id;
This returns two copies of all rows from set A and set B where they’re duplicates and one row where they’re not duplicates. You should only use this type of set operator when you can guarantee the uniqueness between the two query return sets.
ORDINAL ORDINAL ENGLISH FRENCH ID # ORDINAL_ENGLISH_TEXT ID # ORDINAL_FRENCH_TEXT ------- -------------------------- ------- -------------------------- 1 One 1 Un 1 One 1 Un 2 Two 3 Three 3 Trois 3 Three 3 Trois 4 Four 4 Quatre 4 Four 4 Quatre 5 Five 5 Cinq 5 Five 5 Cinq 6 Six 6 Six 6 Six 6 Six 7 Sept 8 Huit |
ORDINAL ORDINAL ENGLISH FRENCH ID # ORDINAL_ENGLISH_TEXT ID # ORDINAL_FRENCH_TEXT ------- -------------------------- ------- -------------------------- 1 One 1 Un 1 One 1 Un 2 Two 3 Three 3 Trois 3 Three 3 Trois 4 Four 4 Quatre 4 Four 4 Quatre 5 Five 5 Cinq 5 Five 5 Cinq 6 Six 6 Six 6 Six 6 Six 7 Sept 8 Huit
You can also use the older Oracle Proprietary syntax for an outer join to mimic this behavior.
Written by michaelmclaughlin
January 17th, 2010 at 10:28 pm
Posted in
3 Responses to 'Set Operators'
Subscribe to comments with RSS or TrackBack to 'Set Operators'.
THX 🙂
paul
8 Mar 13 at 7:31 am
When is a set operator necessary? It seems that, with the exception of minus, all of the set operators accomplished the tasks that Joins already did.
James
7 Feb 14 at 6:28 pm
James, When you want the product of two queries that return the same record structure but arrive at the data set through different joins, the
UNION
,UNION ALL
, andINTERSECT
are very useful. TheUNION
works when the sets may have overlapping return sets. TheUNION ALL
works when the sets will never have overlapping return sets. TheINTERSECT
works when you only want the overlapping return set. Hope that answers your question.michaelmclaughlin
18 Mar 14 at 11:01 am