ANSI SQL:92
The behavior the NATURAL JOIN
is so different from the other join patterns, it required it’s own sample code.
The example for the NATURAL JOIN
uses the VEHICLE
and WORK_ORDER
tables. They share a common VEHICLE_ID
column. The VEHICLE_ID
is the primary key in the VEHICLE
table and the foreign key in the WORK_ORDER
table. This script creates the two tables without database-level constraints to enforce referential integrity.
CREATE TABLE vehicle
( vehicle_id NUMBER
, vin VARCHAR2(20));
CREATE TABLE work_order
( work_order_id NUMBER
, vehicle_id NUMBER);
INSERT INTO vehicle VALUES (1,'VIN_NUMBER1');
INSERT INTO vehicle VALUES (2,'VIN_NUMBER2');
INSERT INTO work_order VALUES (1, 1);
INSERT INTO work_order VALUES (2, 2);
INSERT INTO work_order VALUES (3, 2); |
CREATE TABLE vehicle
( vehicle_id NUMBER
, vin VARCHAR2(20));
CREATE TABLE work_order
( work_order_id NUMBER
, vehicle_id NUMBER);
INSERT INTO vehicle VALUES (1,'VIN_NUMBER1');
INSERT INTO vehicle VALUES (2,'VIN_NUMBER2');
INSERT INTO work_order VALUES (1, 1);
INSERT INTO work_order VALUES (2, 2);
INSERT INTO work_order VALUES (3, 2);
The following creates an INNER JOIN
by using the ON
subclause, which would allow the columns in the join to have different column names.
SELECT * FROM vehicle v JOIN work_order wo ON v.vehicle_id = wo.vehicle_id; |
SELECT * FROM vehicle v JOIN work_order wo ON v.vehicle_id = wo.vehicle_id;
The following creates an INNER JOIN
by using the USING
subclause. The USING
subclause requires that the join work against columns that share the same name. They’re provided as call parameters to the USING
subclause, like:
SELECT * FROM vehicle v JOIN work_order wo USING(vehicle_id); |
SELECT * FROM vehicle v JOIN work_order wo USING(vehicle_id);
A NATURAL JOIN
doesn’t have an explicit join clause. The following syntax lets you create a NATURAL JOIN
between the sample two tables:
SELECT * FROM vehicle v NATURAL JOIN work_order wo; |
SELECT * FROM vehicle v NATURAL JOIN work_order wo;
All of these queries return a three row result set. The result set is:
VEHICLE_ID VIN WORK_ORDER_ID
---------- -------------------- -------------
1 VIN_NUMBER1 1
2 VIN_NUMBER2 2
2 VIN_NUMBER2 3 |
VEHICLE_ID VIN WORK_ORDER_ID
---------- -------------------- -------------
1 VIN_NUMBER1 1
2 VIN_NUMBER2 2
2 VIN_NUMBER2 3
A nasty surprise occurs over time with natural joins. Tables mutate, or change definition, over time. The mutation of a table that participates in a NATURAL JOIN
can lead to unexpected results. These results could be termed unnatural, but they’re part of the overhead when you opt to deploy NATURAL JOIN
semantics in your database.
You must maintain NATURAL JOIN
integrity by frequent “black box” testing because maintenance programming may add an unanticipated column to one of the tables. If that column shares a name with a column in the other table of the join, you may no longer get the anticipated results from your query.
For example, when you add a VIN
column name to the WORK_ORDER
table definition, a NATURAL JOIN
may no longer work when the data disagrees between tables. While the surrogate keys match, the VIN
column wouldn’t with a typographical entry error. This type of behavior can cause a business to loose visibility to data. It’s also hard to find unless you know that natural joins were used in the implementation of the product.
You would change the table with an ALTER TABLE
statement, like the following.
ALTER TABLE work_order ADD (vin VARCHAR2(20)); |
ALTER TABLE work_order ADD (vin VARCHAR2(20));
The most frequent and common way for this type of problem occur is through an UPDATE
statement. This type of change is common because many database developers don’t use package constructs when dealing with related business objects. It creates an update anomaly.
UPDATE work_order
SET vin = 'VIN_MISTAKE'
WHERE work_order_id = 2; |
UPDATE work_order
SET vin = 'VIN_MISTAKE'
WHERE work_order_id = 2;
This will UPDATE
two records but make a NATURAL JOIN
impossible for all three previously returned records. Two rows in the WORK_ORDER
table contain a VIN
value not found in the VEHICLE
table, and the third row contains a NULL
value that can’t match anything through a join. NULL
values must use the IS [NOT] NULL
operator, which is only supported in an OUTER JOIN
operation.
While the change precludes a NATURAL JOIN
from success, an INNER JOIN
can succeed. The INNER JOIN
can succeed because you specify the columns upon which to base your join operation. You might fix this by careful attention to data entry or by implementing a table-level (multiple column) foreign key constraint.
If you added who-audit columns to both tables so that you could manage and ensure fine grained transaction auditing, none of the joins would work. This is because records in one row may have differences in who created it, when they created it, who updated it, and when they last updated it.
In short, I’d recommend you never use NATURAL JOIN
statement because they add risks for data corruption.
Nice explanation!!!
ErikCC
5 Aug 14 at 5:06 pm
Just spelling mistakes…
“This type of behavior can cause a business to loose [lose] visibility to data. It’s also hard to find unless you know that natural joins where [were] used in the implementation of the product.”
Cole
31 Jan 15 at 8:44 pm
Cole, Great catch! It’s fixed.
michaelmclaughlin
7 Mar 15 at 3:09 am
nice
vrushali
27 Jul 15 at 3:49 am