A – Scalar Subqueries
A scalar subquery is a SELECT
statement that returns one and only one row of an atomic data type. Atomic data types are discussed in the Data Normalization page. Atomic data types can be scalar data types or composite (alternatively compound) data types because we treat these structures as whole units. A scalar data type is traditionally seen as a int
, char
, or other base data type that holds only one value at a time. Composite data types may contain a data structure, an object, an array, or a list, which are seen as whole units.
You use a scalar subquery to use a combination of columns to return a single column and row of data. The columns that allow you to filter the query are found in the WHERE
clause. They are often the columns that make a row in that table unique, which means they’re the data describing columns that make up the natural key.
Rules, Uses, and Examples ↓
A scalar subquery returns only one column for a single row and is also known as a SQL expression. You can use a scalar subquery in the VALUES
clause of an INSERT
statement, the WHERE
clause of a SELECT
, the SET
clause of an UPDATE
, and the WHERE
clause of a DELETE
statement.
Rules ↓
A scalar subquery returns a variable like a number, date, or string. Scalar variables are also known as primitives in some languages, like Java. It uses a SQL statement to look something up in the database based on business rules because business rules are more human friendly than recalling surrogate key numbers.
Description
A scalar subquery is one of the closest relative SQL has to a traditional pass-by-value function. A pass-by-value function can also be described as a black box, where you put some raw material in and get some processed product out.
A scalar subquery is independent of its outer query. Consider that a query with subqueries is like an inverted tree. The outermost query is the root node of the inverted tree. Any subqueries of the root node are branches, and subqueries of the first generation of subqueries are branches from branches. The lowest subquery in any sequence of branches is also known as a leaf node. Leaf nodes resolve first, followed by their siblings. After leaf nodes, the branch resolves that is their parent, the parent of the parent and so forth until all values are passed to the outermost subquery. If there’s another branch with leaf nodes, the behavior is repeated until all branches resolve with values to the outermost query. At this point the outermost query runs. This is why scalar subqueries are considered inside out queries.
The following demonstrates two nested scalar subqueries. The outer query depends on the result returned by the subquery to the address
table. While the subquery against the address
table depends on the result returned by the subquery to the telephone
table.
If either of the two subqueries returns more than one row, the query would fail with an ORA-01427
error. The error means that a single row subquery returned more than one row. The scalar subquery architecture is risky because it is data centric, and it relies on both subqueries returning a single key that has only one parent. A multiple row subquery is a more flexible solution when coupled with an IN
or other multiple row comparison operator.
INSERT
statement ↓
An INSERT
statement can only use a scalar subquery in the VALUES
clause, which makes their use natural in this context. Scalar subqueries uses a SQL statement to look something up in the database based on business rules because business rules are more human friendly than recalling surrogate key numbers. Writing queries to identify proper primary key values that you’ll use as foreign key values is the best way to ensure integrity of join criteria.
You typically use a scalar subquery when you lookup a primary key to insert it as a foreign key column. An example of a scalar subquery would be looking up the common_lookup_id
for a DVD, as shown below:
INSERT INTO item VALUES ( item_s1.nextval ,'ASIN: B0001US8F8' ,(SELECT common_lookup_id -- Primary key scalar query lookup. FROM common_lookup WHERE common_lookup_type = 'DVD_WIDE_SCREEN') ,'Around the World in 80 Days' ,'Two-Disc Special Edition' ,'NR' ,'MPAA' ,'18-MAY-2004' , 3 , SYSDATE , 3 , SYSDATE); |
The scalar subquery above looks up a surrogate key based on a human friendly definition of DVD_WIDE_SCREEN
. Actually, that value isn’t too human friendly. The form or web page would actually present something like DVD Wide Screen, which would then map to the uppercase string shown in the example. You would need to define both columns in a table like the common_lookup_table
table. Then, every row can map and substitute the end-user friendly and code reusable values or vice versa.
In some implementations, like Oracle, you can also return a User-Defined Type (UDT). UDTs are typically objects, arrays, or lists, and they are specific to object-relational technology – more or less an Oracle database.
UPDATE
statement ↓
An UPDATE
statement can use a scalar subquery in the SET
or WHERE
clauses. Like an INSERT
statement, an UPDATE
statement uses SQL subqueries to look things up in the database. Lookup values also rely on business rules in UPDATE
statements because they are a more human friendly way to discover valid join relationship keys.
You typically use a scalar subquery when looking up a single column value from another table or collections of tables. The SET
clause generally requires a scalar subquery or at least a single row subquery, but doesn’t impose a single row subquery constraint in all cases. You can use a multiple row subquery when you correlate the rows of a subquery with the rows of an UPDATE
statement.
A pair of scalar subqueries are shown in the UPDATE
statement.
UPDATE rental_item ri SET ri.rental_id = (SELECT r.rental_id -- A scalar subquery assignment. FROM rental r , contact c WHERE r.customer_id = c.contact_id AND c.last_name = 'Winn' AND c.first_name = 'Randi') WHERE ri.rental_id = (SELECT r.rental_id -- A scalar subquery comparison. FROM rental r , contact c WHERE r.customer_id = c.contact_id AND c.last_name = 'Winn' AND c.first_name = 'Brian'); |
The two scalar subqueries in the UPDATE
statement accomplish the same type of thing but in two different contexts. One works in the assignment of a new value through the SET
clause, while the other filters the rows to update in the WHERE
clause. You should note that both subqueries run before the outer query (in this case an UPDATE
statement). Scalar subqueries either assign values or compare values in an UPDATE
statement, and they assign or compare values through an equality, =
, operator.
A scalar subquery is data centric in this type of solution. The statement would throw an ORA-01427
error when two or more rows are returned by a subquery. A multiple row subquery is more flexible and data set independent when you combine it with an IN
or other multiple row comparison operator.
In some implementations, like Oracle, you can also return a User-Defined Type (UDT). UDTs are typically objects, arrays, or lists, and they are specific to object-relational technology – more or less an Oracle database.
DELETE
statement ↓
A DELETE
statement can only use a scalar subquery in the WHERE
clause. Scalar subqueries look something up in your database. They use business rules to do that. Business rules are typically more human friendly ways to find data relationships (or primary to foreign key relationships).
You use a scalar subquery when looking up a single column value from another table or collections of tables. An example of a scalar subquery follows in a DELETE
statement.
DELETE FROM rental_item ri WHERE ri.rental_id = (SELECT r.rental_id -- A scalar subquery comparison. FROM rental r , contact c WHERE r.customer_id = c.contact_id AND c.last_name = 'Winn' AND c.first_name = 'Brian'); |
The scalar subquery in the WHERE
clause of the DELETE
statement runs first. The subquery returns a single scalar value. The DELETE
statement uses the value returned from the scalar subquery to filter which rows are deleted from a table. You should note that the subquery compares values through an equality, =
, comparison operator.
A scalar subquery is data centric in this type of solution. The statement would throw an ORA-01427
error when two or more rows are returned by a subquery. A multiple row subquery is more flexible and data set independent when you combine it with an IN
or other multiple row comparison operator.
In some implementations, like Oracle, you can also return a User-Defined Type (UDT). UDTs are typically objects, arrays, or lists, and they are specific to object-relational technology – more or less an Oracle database.
The inverted tree example is really confusing to a lot of us. Perhaps an image is necessary for it to be understood.
Kara
19 Jul 10 at 12:00 am
Is this a typo or…?
“Scalar subqueries uses a SQL statement to look something up in the database based on business rules because business rules.”
Chris Staber
30 Sep 10 at 10:46 am
I found similar problems in three areas. I believe they’re fixed, if not let me know.
michaelmclaughlin
3 Oct 10 at 12:30 pm
Kara, An illustration of the relationship and examples of output are found in the Hierarchical Queries web page.
michaelmclaughlin
22 Oct 14 at 2:12 am