ANSI SQL:92
The behavior the NATURAL JOIN is so different from the other join patterns, it required it’s own sample code.
The example for the NATURAL JOIN uses the VEHICLE and WORK_ORDER tables. They share a common VEHICLE_ID column. The VEHICLE_ID is the primary key in the VEHICLE table and the foreign key in the WORK_ORDER table. This script creates the two tables without database-level constraints to enforce referential integrity.
CREATE TABLE vehicle
( vehicle_id NUMBER
, vin VARCHAR2(20));
CREATE TABLE work_order
( work_order_id NUMBER
, vehicle_id NUMBER);
INSERT INTO vehicle VALUES (1,'VIN_NUMBER1');
INSERT INTO vehicle VALUES (2,'VIN_NUMBER2');
INSERT INTO work_order VALUES (1, 1);
INSERT INTO work_order VALUES (2, 2);
INSERT INTO work_order VALUES (3, 2);
The following creates an INNER JOIN by using the ON subclause, which would allow the columns in the join to have different column names.
SELECT * FROM vehicle v JOIN work_order wo ON v.vehicle_id = wo.vehicle_id;
The following creates an INNER JOIN by using the USING subclause. The USING subclause requires that the join work against columns that share the same name. They’re provided as call parameters to the USING subclause, like:
SELECT * FROM vehicle v JOIN work_order wo USING(vehicle_id);
A NATURAL JOIN doesn’t have an explicit join clause. The following syntax lets you create a NATURAL JOIN between the sample two tables:
SELECT * FROM vehicle v NATURAL JOIN work_order wo;
All of these queries return a three row result set. The result set is:
VEHICLE_ID VIN WORK_ORDER_ID
---------- -------------------- -------------
1 VIN_NUMBER1 1
2 VIN_NUMBER2 2
2 VIN_NUMBER2 3
A nasty surprise occurs over time with natural joins. Tables mutate, or change definition, over time. The mutation of a table that participates in a NATURAL JOIN can lead to unexpected results. These results could be termed unnatural, but they’re part of the overhead when you opt to deploy NATURAL JOIN semantics in your database.
You must maintain NATURAL JOIN integrity by frequent “black box” testing because maintenance programming may add an unanticipated column to one of the tables. If that column shares a name with a column in the other table of the join, you may no longer get the anticipated results from your query.
For example, when you add a VIN column name to the WORK_ORDER table definition, a NATURAL JOIN may no longer work when the data disagrees between tables. While the surrogate keys match, the VIN column wouldn’t with a typographical entry error. This type of behavior can cause a business to loose visibility to data. It’s also hard to find unless you know that natural joins where used in the implementation of the product.
You would change the table with an ALTER TABLE statement, like the following.
ALTER TABLE work_order ADD (vin VARCHAR2(20));
The most frequent and common way for this type of problem occur is through an UPDATE statement. This type of change is common because many database developers don’t use package constructs when dealing with related business objects. It creates an update anomaly.
UPDATE work_order
SET vin = 'VIN_MISTAKE'
WHERE work_order_id = 2;
This will UPDATE two records but make a NATURAL JOIN impossible for all three previously returned records. Two rows in the WORK_ORDER table contain a VIN value not found in the VEHICLE table, and the third row contains a NULL value that can’t match anything through a join. NULL values must use the IS [NOT] NULL operator, which is only supported in an OUTER JOIN operation.
While the change precludes a NATURAL JOIN from success, an INNER JOIN can succeed. The INNER JOIN can succeed because you specify the columns upon which to base your join operation. You might fix this by careful attention to data entry or by implementing a table-level (multiple column) foreign key constraint.
If you added who-audit columns to both tables so that you could manage and ensure fine grained transaction auditing, none of the joins would work. This is because records in one row may have differences in who created it, when they created it, who updated it, and when they last updated it.
In short, I’d recommend you never use NATURAL JOIN statement because they add risks for data corruption.