Database Tutorial

Course Tutorial Site

Site Admin

D – Automatic Numbering

with 6 comments

Surrogate keys are interesting structures in databases. They’re essential if you want to make sure you optimize your design. They’re also very useful when you want to capture the automatic numbering value for a prior INSERT statement and reuse the automatic numbering value as the foreign key value in a subsequent statement. It should be interesting to see how they’re implemented differently across Oracle, MySQL, and SQL Server while providing the same utility.

Below is a synopsis of how you implement these in Oracle, MySQL, and SQL Server.

Inserting with automatic numbering

Oracle

The first thing to qualify is that Oracle is generally always in a transactional mode. That means you don’t need to do anything special to set this example up.

Oracle doesn’t support automated numbering in tables. However, you can use sequences to mimic automated numbering. A sequence is a structure in the database that holds a current value, increments by a fixed value – typically 1. Sequences are available in SQL and PL/SQL scopes through two pseudo columns. The pseudo columns are .nextval and .currval (note the two r’s because it’s not a stray dog).

The sequence_name.nextval call in any session places the next number from the sequence into your Personal Global Area (PGA), which is a memory context. After you’ve called the sequence into memory, you can access it again by using sequence_name.currval. The sequence only changes when you call it again with the .nextval pseudo column.

You can use .nextval to insert into a single table without much effort. When you work with more than a single table, you insert data starting with the independent table, and then work your way down to the most dependent table. This is because you need a primary key before you can insert a foreign key.

Whether through automatic numbering or Oracle’s pseudo columns, automatic numbering involves a surrogate primary key. A surrogate key doesn’t describe any of the data, and should never be thought of as a replacement for normalization because descriptive column values should do that. You may provide a null value in place of the column_name.nextval pseudo column for a primary key column provided you’ve implemented a database trigger (an example is found below). Alternatively, you may use an overriding list of columns that excludes the primary key column. At a minimum, you must provide all mandatory columns when you opt to use an overriding signature and exclude the primary key column from the INSERT statement. However, this approach also requires that you’ve previously implemented a database trigger.

As discussed above, Oracle’s session automatically captures the current sequence value. You simply refer to it by the other .currval pseudo column when you use an INSERT statement for a foreign key column.

This example demonstrates how you use both pseudo columns in a transaction across multiple tables.

-- Conditionally drop data sturctures - tables and sequences.
BEGIN
  FOR i IN (SELECT TABLE_NAME
            FROM   user_tables
            WHERE  TABLE_NAME IN ('ONE','TWO')) LOOP
    EXECUTE IMMEDIATE 'DROP TABLE '||i.table_name||' CASCADE CONSTRAINT';
  END LOOP;
  FOR i IN (SELECT sequence_name
            FROM   user_sequences
            WHERE  sequence_name IN ('ONE_S1','TWO_S1')) LOOP
    EXECUTE IMMEDIATE 'DROP SEQUENCE '||i.sequence_name;
  END LOOP;
END;
/
 
-- Create base table and sequence.
CREATE TABLE one
( one_id   INT         NOT NULL CONSTRAINT pk_one PRIMARY KEY
, one_text VARCHAR(10) NOT NULL );
 
CREATE SEQUENCE one_s1;
 
-- Create dependent table and sequence.
CREATE TABLE two
( two_id   INT         NOT NULL CONSTRAINT pk_two PRIMARY KEY
, one_id   INT         NOT NULL
, two_text VARCHAR(10) NOT NULL );
 
CREATE SEQUENCE two_s1;
 
-- Insert rows into the tables with sequence values.
INSERT INTO one VALUES (one_s1.nextval,'One!');
INSERT INTO one VALUES (one_s1.nextval,'Two!');
INSERT INTO two VALUES (two_s1.nextval, one_s1.currval,'Other Two!');
 
-- Display the values inserted with sequences.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;

If you mimic automatic numbering with database triggers, you may not have access to the .currval value for the second INSERT statement. This occurs when you provide a NULL value expecting the trigger to manage .NEXTVAL call for you.

Transactions require that you keep the primary key value for the first table in a locally scoped variable for reuse. Then, you can pass it to the next INSERT statement. You do that with the .CURRVAL value.

You can make a potentially erroneous assumption that you’re the only user updating the table. Operating under that assumption, you can query the highest sequence number from the table before an insert, add one to it, and then attempt the INSERT statement. In a multi-user system, it’s possible that somebody beats you to the finish line with their INSERT statement. Your insert would then have a duplicate surrogate key value for the one_id column, and fail on an ORA-00001 error for a uniqueness violation on a primary key column.

A database trigger can help you avoid a race condition. The trigger would ensure sequence values are unique but it may also introduce problems. A common Oracle trigger with a pseudo automatic numbering paradigm is represented by the following trigger (found in APEX generated code).

CREATE OR REPLACE TRIGGER one_t1 
BEFORE INSERT ON one
FOR EACH ROW
BEGIN
  :NEW.one_id := one_s1.nextval;
END;
/

Caution is required on this type of automated sequence trigger. There are two problems with this type of trigger.

One scenario is where you include a call to sequence_name.NEXTVAL in your INSERT statement. It then increments the sequence, and attempts to insert the value whereupon the trigger fires and repeats the behavior. Effectively, this type of logic creates a sequence that increments by one when you submit a null value in the values clause and by two when you submit a sequence_name.NEXTVAL value.

Another scenario occurs when you attempt a bulk INSERT operation on the table. The sequence call and substitution occurs on each row of the sequence.

You face another problem when you rewrite the trigger to only fire when a surrogate primary key isn’t provided, like this:

CREATE OR REPLACE TRIGGER one_t1 
BEFORE INSERT ON one
FOR EACH ROW
BEGIN
  :NEW.one_id := one_s1.nextval;
END;
/

This trigger design causes a problem only with bulk INSERT statements. Effectively, the sequence remains unaltered when you provide surrogate key values as part of inserting an array of values. The next non-bulk INSERT statement would then grab the .NEXTVAL value, attempt to use it, and raise a unique constraint violation because the bulk operation probably already used the value from the sequence.

The fix to bulk operations requires that you lock the table, disable a trigger like this, and get the .NEXTVAL value. Then, you assign the .NEXTVAL value to two local variables. One of these remains unchanged while the other increments as you populate the array for the bulk insert operation. After assigning the result from the .NEXTVAL, you drop the sequence and find the highest key value for the bulk insertion operation, add one to the highest key value, and store it in another locally stored variable. You perform the bulk insert operation and then recreate the sequence with a value one greater than the highest value in the table, which should already be in a locally scored variable. Don’t forget that you’d locked the table, so unlock it now.

You should note that database triggers run in a subshell with access only to the immediate shell that fired them. Therefore, you can’t set a bind variable in a SQL*Plus session and subsequently reference it inside the trigger body because it doesn’t have access to the variable.

The INSERT statement would change from a default signature, which is the positional order set when you created the table. The override signature replaces the default signature by providing a list of columns that matches the list of values in the VALUES clause. At a minimum, you must provide all mandatory columns. The automated numbering primary key (or surrogate key) is a mandatory column but you can provide a NULL value that will let the trigger grab the next sequence value and assign it to the INSERT statement.

-- Insert rows into the tables with sequence values and a database trigger.
INSERT INTO one (one_id, one_text) VALUES (one_s1.nextval,'One!');
INSERT INTO one (one_id, one_text) VALUES (one_s1.nextval,'Two!');
INSERT INTO two (two_id, one_id, two_text) VALUES (two_s1.nextval, one_s1.currval,'Other Two!');

MySQL

MySQL supports automatic numbering but not a default transactional mode like Oracle. You need to disable auto commit and start a transaction when you want to use automatic numbering across multiple related tables. MySQL implements automatic numbering by bundling sequences inside the table definitions, which you can see by inspecting the information_schema database catalog (e.g., covered in this blog post).

Assigning an automated number to a single table is fairly straightforward but transactions that span two or more tables require planning and knowledge of the MySQL transaction model. You insert data starting with the independent table, and then work your way down to the most dependent table. This is because you need a primary key before you can insert a foreign key.

Automatic numbering typically involves a surrogate primary key. A surrogate key doesn’t describe any of the data, and should never be thought of as a replacement for normalization because descriptive column values should do that. You may provide a null value in place of the auto_increment column or you may use an overriding list of columns that excludes the primary key column. At a minimum, you must provide all mandatory columns when you opt to use an overriding signature and exclude the primary key column from the INSERT statement.

You can capture the last automatic numbering value with the last_insert_id() function, and then you can assign the result of that function to a variable. You use the variable in the VALUES clause of the subsequent INSERT statement for a foreign key column.

This example demonstrates how you use a NULL in the position of the auto incrementing column with the default signature. The default signature is the definition stored in the database catalog when you create a table.

-- Conditionally drop the tables.
DROP TABLE IF EXISTS one;
DROP TABLE IF EXISTS two;
 
-- Create the tables with a surrogate key that automatically increments.
CREATE TABLE one ( one_id   INT PRIMARY KEY AUTO_INCREMENT
                 , one_text VARCHAR(20));
 
CREATE TABLE two ( two_id INT PRIMARY KEY AUTO_INCREMENT
                 , one_id INT
                 , two_text VARCHAR(20));
 
-- Start transaction cycle.
START TRANSACTION;
 
-- Insert first row by using a null for the surrogate key.
INSERT INTO one VALUES (NULL,'One');
 
-- Assign last auto increment to local scope variable, the = works too.
SET @one_fk := last_insert_id();
 
-- Insert second row by using a null for the surrogate key,
-- and use a local scope variable as the foreign key.
INSERT INTO two VALUES (NULL,@one_fk,'Two');
 
COMMIT;
 
-- Display the values inserted with auto incremented values.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;

This example demonstrates how you use an overriding signature:

-- Conditionally drop the tables.
DROP TABLE IF EXISTS one;
DROP TABLE IF EXISTS two;
 
-- Create the tables with a surrogate key that automatically increments.
CREATE TABLE one ( one_id   INT PRIMARY KEY AUTO_INCREMENT
                 , one_text VARCHAR(20));
 
CREATE TABLE two ( two_id INT PRIMARY KEY AUTO_INCREMENT
                 , one_id INT
                 , two_text VARCHAR(20));
 
-- Start transaction cycle.
START TRANSACTION;
 
-- Insert first row, transfer auto increment to memory.
INSERT INTO one (one_text) VALUES ('One');
 
-- Assign last auto increment to local scope variable, the = works too.
SET @one_fk := last_insert_id();
 
-- Insert second row with auto increment and local scope variable.
INSERT INTO two (one_id, two_text) VALUES (@one_fk,'Two');
 
COMMIT;
 
-- Display the values inserted with auto incremented values.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;

SQL Server

SQL Server supports automatic numbering but they call it the identity value. There are two ways to use it but the one I’m showing is for SQL Server 2005 or newer. You should replace the older @@identity for the SCOPE_IDENTITY() function call because Microsoft has already removed first level support from SQL Server 2000. While Microsoft has not said @@identity is deprecated, it sure appears that’s possible in a future release.

SQL Server supports automatic numbering but not a default transactional mode like Oracle. You need to start a transaction when you want to use automatic numbering across multiple related tables. SQL Server implements automatic numbering by bundling sequences inside the table definitions.

Assigning an automated number to a single table is fairly straightforward but transactions that span two or more tables require planning and knowledge of the SQL Server and T-SQL. You insert data starting with the independent table, and then work your way down to the most dependent table. This is because you need a primary key before you can insert a foreign key.

Automatic numbering typically involves a surrogate primary key. A surrogate key doesn’t describe any of the data, and should never be thought of as a replacement for normalization because descriptive column values should do that. You may provide a null value in place of the IDENTITY(1,1) column or you may use an overriding list of columns that excludes the primary key column. At a minimum, you must provide all mandatory columns when you opt to use an overriding signature and exclude the primary key column from the INSERT statement. If you wonder what the parenthetical (1,1) means at the end of the IDENTITY keyword, it means the cardinality of the column. It enforces the rule that a surrogate key must be a not null value, which is a column level constraint.

You can capture the last automatic numbering value with the SCOPE_IDENTITY() function, and then you can assign the result of that function to a variable. You use the variable in the VALUES clause of the subsequent INSERT statement for a foreign key column.

USE student;
 
BEGIN TRAN;
 
-- Conditionally drop tables when they exist.
IF OBJECT_ID('dbo.one','U') IS NOT NULL DROP TABLE dbo.one;
IF OBJECT_ID('dbo.two','U') IS NOT NULL DROP TABLE dbo.two;
 
-- Create auto incrementing tables.
CREATE TABLE one
( one_id   INT         NOT NULL IDENTITY(1,1) CONSTRAINT pk_one PRIMARY KEY
, one_text VARCHAR(10) NOT NULL );
 
CREATE TABLE two
( two_id   INT         NOT NULL IDENTITY(1,1) CONSTRAINT pk_two PRIMARY KEY
, one_id   INT         NOT NULL
, two_text VARCHAR(10) NOT NULL );
 
-- Insert the values, and magically no override signature required.
INSERT INTO one VALUES ('One!');
INSERT INTO one VALUES ('Two!');
INSERT INTO two VALUES (SCOPE_IDENTITY(),'Other Two!');
 
-- Query the results.
SELECT   o.one_id
,        o.one_text
,        t.two_id
,        t.two_text
FROM     one o JOIN two t ON o.one_id = t.one_id;
 
COMMIT TRAN;

You should note that T-SQL doesn’t require an override signature when you use an automatic numbering column. This is different, isn’t it?

While the prior example works with two tables, it doesn’t scale to a series of tables. You should consider the following assignment pattern when you’ll have multiple last identity values in a single transaction scope.

DECLARE @one_pk AS INT;
SET @one_pk = SCOPE_IDENTITY();

As mentioned, this style is important when you’ve got a series of primary and foreign keys to map in the scope of a single transaction. Also, I’d suggest that you put all the declarations at the beginning of the transaction’s scope.

Written by michaelmclaughlin

October 10th, 2009 at 12:34 pm

Posted in

6 Responses to 'D – Automatic Numbering'

Subscribe to comments with RSS or TrackBack to 'D – Automatic Numbering'.

  1. You can find the original post for this article here.

  2. The above references the insertion of a record into table “b”.

    INSERT INTO b (one_id, two_text) VALUES (@one_fk,’Two’);

    I’m thinking that that statement should insert into table “two”

    David Reading

    27 Jan 10 at 3:54 pm

  3. Yes, I fixed it above.

    michaelmclaughlin

    28 Jan 10 at 12:46 am

  4. INSERT INTO b VALUES (NULL,@one_fk,’Two’);

    …It’s still not fixed in the first/top example 🙂

    Chris Staber

    2 Oct 10 at 9:20 pm

  5. I believe all the b are now two, right? 😉

    michaelmclaughlin

    3 Oct 10 at 2:31 pm

  6. You should put row results .

    Jeremy

    26 Jan 15 at 2:22 pm

Leave a Reply