A – Scalar Subqueries at Database Tutorial

A – Scalar Subqueries

A scalar subquery is a SELECT statement that returns one and only one row of an atomic data type. Atomic data types are discussed in the Data Normalization page. Atomic data types can be scalar data types or composite (alternatively compound) data types because we treat these structures as whole units. A scalar data type is traditionally seen as a int, char, or other base data type that holds only one value at a time. Composite data types may contain a data structure, an object, an array, or a list, which are seen as whole units.

You use a scalar subquery to use a combination of columns to return a single column and row of data. The columns that allow you to filter the query are found in the WHERE clause. They are often the columns that make a row in that table unique, which means they’re the data describing columns that make up the natural key.

Rules, Uses, and Examples ↓

A scalar subquery returns only one column for a single row and is also known as a SQL expression. You can use a scalar subquery in the VALUES clause of an INSERT statement, the WHERE clause of a SELECT, the SET clause of an UPDATE, and the WHERE clause of a DELETE statement.

Rules ↓

A scalar subquery returns a variable like a number, date, or string. Scalar variables are also known as primitives in some languages, like Java. It uses a SQL statement to look something up in the database based on business rules because business rules are more human friendly than recalling surrogate key numbers.

Description

A scalar subquery is one of the closest relative SQL has to a traditional pass-by-value function. A pass-by-value function can also be described as a black box, where you put some raw material in and get some processed product out.

scalarblackbox

A scalar subquery is independent of its outer query. Consider that a query with subqueries is like an inverted tree. The outermost query is the root node of the inverted tree. Any subqueries of the root node are branches, and subqueries of the first generation of subqueries are branches from branches. The lowest subquery in any sequence of branches is also known as a leaf node. Leaf nodes resolve first, followed by their siblings. After leaf nodes, the branch resolves that is their parent, the parent of the parent and so forth until all values are passed to the outermost subquery. If there’s another branch with leaf nodes, the behavior is repeated until all branches resolve with values to the outermost query. At this point the outermost query runs. This is why scalar subqueries are considered inside out queries.

The following demonstrates two nested scalar subqueries. The outer query depends on the result returned by the subquery to the address table. While the subquery against the address table depends on the result returned by the subquery to the telephone table.

Subqueries in SELECT statements

Oracle MySQL SQL Server

Oracle

The Oracle concatenation syntax uses double pipes, as shown in the example code. You can read more about SQL concatenation here.

SELECT   first_name||' '||last_name
FROM     contact
WHERE    contact_id =
          (SELECT   a.contact_id                  -- An intermediate branch subquery.
           FROM     address a
           WHERE    a.address_id =
            (SELECT   t.address_id                -- A leaf node subquery.
             FROM     telephone t
             WHERE    t.telephone_id = 1001));

MySQL

The MySQL concatenation syntax uses the CONCAT() function. It is a recursive function, which allows you to use as many arguments as you require. MySQL also supports double pipes, known as piped concatenation but you must override the default value. There’s an example of overriding the default here. Piped concatenation doesn’t enjoy highest order of precedence as it does in Oracle, which means you can’t use it inside function calls.

SELECT   CONCAT(first_name,' ',last_name)
FROM     contact
WHERE    contact_id =
          (SELECT   a.contact_id                  -- An intermediate branch subquery.
           FROM     address a
           WHERE    a.address_id =
            (SELECT   t.address_id                -- A leaf node subquery.
             FROM     telephone t
             WHERE    t.telephone_id = 1001));

SQL Server

SQL Server concatenation syntax uses the + symbol. You can find other examples of the behavior here.

SELECT   first_name + ' ' + last_name)
FROM     contact
WHERE    contact_id =
          (SELECT   a.contact_id                  -- An intermediate branch subquery.
           FROM     address a
           WHERE    a.address_id =
            (SELECT   t.address_id                -- A leaf node subquery.
             FROM     telephone t
             WHERE    t.telephone_id = 1001));

If either of the two subqueries returns more than one row, the query would fail with an ORA-01427 error. The error means that a single row subquery returned more than one row. The scalar subquery architecture is risky because it is data centric, and it relies on both subqueries returning a single key that has only one parent. A multiple row subquery is a more flexible solution when coupled with an IN or other multiple row comparison operator.

INSERT statement ↓

An INSERT statement can only use a scalar subquery in the VALUES clause, which makes their use natural in this context. Scalar subqueries uses a SQL statement to look something up in the database based on business rules because business rules are more human friendly than recalling surrogate key numbers. Writing queries to identify proper primary key values that you’ll use as foreign key values is the best way to ensure integrity of join criteria.

You typically use a scalar subquery when you lookup a primary key to insert it as a foreign key column. An example of a scalar subquery would be looking up the common_lookup_id for a DVD, as shown below:

INSERT INTO item VALUES
( item_s1.nextval
,'ASIN: B0001US8F8'
,(SELECT   common_lookup_id                       -- Primary key scalar query lookup.
  FROM     common_lookup
  WHERE    common_lookup_type = 'DVD_WIDE_SCREEN')
,'Around the World in 80 Days'
,'Two-Disc Special Edition'
,'NR'
,'MPAA'
,'18-MAY-2004'
, 3
, SYSDATE
, 3
, SYSDATE);

The scalar subquery above looks up a surrogate key based on a human friendly definition of DVD_WIDE_SCREEN. Actually, that value isn’t too human friendly. The form or web page would actually present something like DVD Wide Screen, which would then map to the uppercase string shown in the example. You would need to define both columns in a table like the common_lookup_table table. Then, every row can map and substitute the end-user friendly and code reusable values or vice versa.

In some implementations, like Oracle, you can also return a User-Defined Type (UDT). UDTs are typically objects, arrays, or lists, and they are specific to object-relational technology – more or less an Oracle database.

UPDATE statement ↓

An UPDATE statement can use a scalar subquery in the SET or WHERE clauses. Like an INSERT statement, an UPDATE statement uses SQL subqueries to look things up in the database. Lookup values also rely on business rules in UPDATE statements because they are a more human friendly way to discover valid join relationship keys.

You typically use a scalar subquery when looking up a single column value from another table or collections of tables. The SET clause generally requires a scalar subquery or at least a single row subquery, but doesn’t impose a single row subquery constraint in all cases. You can use a multiple row subquery when you correlate the rows of a subquery with the rows of an UPDATE statement.

A pair of scalar subqueries are shown in the UPDATE statement.

UPDATE rental_item ri
SET    ri.rental_id = (SELECT   r.rental_id       -- A scalar subquery assignment.
                       FROM     rental r
                       ,        contact c
                       WHERE    r.customer_id = c.contact_id
                       AND      c.last_name = 'Winn'
                       AND      c.first_name = 'Randi')
WHERE  ri.rental_id = (SELECT   r.rental_id       -- A scalar subquery comparison.
                       FROM     rental r
                       ,        contact c
                       WHERE    r.customer_id = c.contact_id
                       AND      c.last_name = 'Winn'
                       AND      c.first_name = 'Brian');

The two scalar subqueries in the UPDATE statement accomplish the same type of thing but in two different contexts. One works in the assignment of a new value through the SET clause, while the other filters the rows to update in the WHERE clause. You should note that both subqueries run before the outer query (in this case an UPDATE statement). Scalar subqueries either assign values or compare values in an UPDATE statement, and they assign or compare values through an equality, =, operator.

A scalar subquery is data centric in this type of solution. The statement would throw an ORA-01427 error when two or more rows are returned by a subquery. A multiple row subquery is more flexible and data set independent when you combine it with an IN or other multiple row comparison operator.

DELETE statement ↓

A DELETE statement can only use a scalar subquery in the WHERE clause. Scalar subqueries look something up in your database. They use business rules to do that. Business rules are typically more human friendly ways to find data relationships (or primary to foreign key relationships).

You use a scalar subquery when looking up a single column value from another table or collections of tables. An example of a scalar subquery follows in a DELETE statement.

DELETE
FROM   rental_item ri
WHERE  ri.rental_id = (SELECT   r.rental_id       -- A scalar subquery comparison.
                       FROM     rental r
                       ,        contact c
                       WHERE    r.customer_id = c.contact_id
                       AND      c.last_name = 'Winn'
                       AND      c.first_name = 'Brian');

The scalar subquery in the WHERE clause of the DELETE statement runs first. The subquery returns a single scalar value. The DELETE statement uses the value returned from the scalar subquery to filter which rows are deleted from a table. You should note that the subquery compares values through an equality, =, comparison operator.

Written by michaelmclaughlin

October 8th, 2009 at 8:39 pm

Posted in

4 Responses to 'A – Scalar Subqueries'

Subscribe to comments with RSS or TrackBack to 'A – Scalar Subqueries'.

The inverted tree example is really confusing to a lot of us. Perhaps an image is necessary for it to be understood.

Kara

19 Jul 10 at 12:00 am
Is this a typo or…?

“Scalar subqueries uses a SQL statement to look something up in the database based on business rules because business rules.”

Chris Staber

30 Sep 10 at 10:46 am
I found similar problems in three areas. I believe they’re fixed, if not let me know.

michaelmclaughlin

3 Oct 10 at 12:30 pm
Kara, An illustration of the relationship and examples of output are found in the Hierarchical Queries web page.

michaelmclaughlin

22 Oct 14 at 2:12 am

Database Tutorial