CREATE at Database Tutorial

CREATE

Week #2: Articles

Learning Outcomes

Learn how to create tables.
Learn how to create sequences.
Learn how to create indexes.

DDL Commands

CREATE Statement

You can apply the CREATE statement to many objects in the database. For example, you can create databases, users, tables, indexes, sequences, synonyms, types, view, and directories. Initially, you will create tables, indexes and sequences.

CREATE tables

This section shows you how to create tables.

Instruction Details →

Database tables are two-dimensional record structures that hold data. Grants of permissions to read and write data are most often made to tables. Sometimes grants restrict access to columns in tables.

Although databases contain tables, tables contain data organized by data types. A data type is the smallest container in this model. It defines what type of values can go into its container. Data values such as numbers, strings, or dates belong, respectively, in columns defined as numeric, variable-length string, and DATE data types. Data types that hold a single value are scalar data types (or, to borrow some lingo from Java, primitive data types).

Tables are seldom defined by a single column. They are typically defined by a set of columns. The set of columns that defines a table is a type of data structure. It is more complex than a single data type because it contains a set of ordered data types. The position of the elements and their data types define the structure of a table. The definition of this type of structure is formally a record structure, and the elements are fields of the data structure.

This record structure description can be considered the first dimension of a two-dimensional table. The rows in the table are the second dimension. Rows are organized as an unordered list because relational operations should perform against all rows regardless of their positional order.

Tables are defined by the DDL CREATE TABLE command. The command provides names for columns, data types, default values, and constraints. The column, data type, and default values must always be defined on the same line, but constraints can be defined two places. Defining a constraint on the same line as a table column is defining an inline constraint. This is the typical pattern for column constraints, such as a not null or single-column PRIMARY KEY column. You can opt to define column constraints after all columns are defined. When you do so, the constraints are out-of-line constraints. Sometimes constraints involve more than one column. Constraints that apply to two or more columns are table constraints.

A not null constraint is always a column constraint. The ANSI SQL standard requires that all columns in tables be unconstrained by default. An unconstrained or nullable column is an optional column when you insert or update a row. A not null column is a mandatory column when you insert or update a row. The Oracle database adheres to this ANSI standard use. Note that Microsoft SQL Server doesn’t adhere to the standard, because it makes all columns mandatory by default.

RDBMS implementations comprise five basic groups of data types: numbers, characters, date-time intervals, large objects, and Boolean data types. The Boolean data type was added in ANSI SQL:1999, and it includes three-valued logic: true, not true, and null. It adopts three-valued logic because the ANSI SQL-89 standard and later accept that any column can be a null allowed— or, simply put, a column can contain no value or be empty.

Although RDBMSs determine which data types they’ll support, they also determine how they’ll implement them. Some data types are scalar or primitive data types, and others are built on those primitive data types. Only Oracle (and PostgreSQL) supports building composite data types that can be implemented as nested tables.

The Oracle Database does not support a Boolean data type. The lack of a Boolean data type does sometimes cause problems, because standard comparison operators don’t work with null values. Null values require the IS or IS NOT comparison operator, which is a reference operator rather than a value comparison operator. Table design and management should take into consideration the processing requirements to handle three-valued logic of pseudo-Boolean data types effectively.

The general and basic prototype for a relational table with the CREATE TABLE statement without storage clause options follows:

CREATE [GLOBAL] [TEMPORARY] TABLE [schema_name.] table_name
( column_name NUMBER [GENERATED [{ALWAYS] | BY DEFAULT}] AS IDENTITY
[{START WITH NUMBER | INCREMENT BY NUMBER | NOCACHE}]] [{DEFAULT expression | AS (virtual_expression)}] [[CONSTRAINT] constraint_name constraint_type]
,[column_name data_type [INVISIBLE]
[{DEFAULT expression | AS (virtual_expression)}] [[CONSTRAINT] constraint_name constraint_type]
,[...]
,[CONSTRAINT constraint_name constraint_type(column_list)
[REFERENCES table_name(column_list)]] ,[...]);

The INVISIBLE keyword, available in Oracle Database 12c forward, must follow the data type in all cases, which means the DEFAULT expression, virtual expressions, and constraints must follow the INVISIBLE keyword.

The CREATE table statement lets you create tables, which are lists of record structures. You create a sample table as follows:

CREATE TABLE sample
( sample_id     NUMBER
, sample_text   VARCHAR2(30)
, sample_date   DATE
, sample_amount NUMBER(6,2));

After the table name, you list the column names and their data types in a comma delimited list. All variable length strings require that you designate their size following the VARCHAR, VARCHAR2, or NVARCHAR2 data type name.

You may also limit the physical size of the NUMBER data type by providing an integer value between 1 and 38. The physical size of a NUMBER is the number of digits in the number. You can also designate the precision and scale of any number. The scale is the number of values to the right side of the decimal.

While the database does not store the comma, it does store how to treat the number. Let’s say you designate a sample_amount column as a number of 6 precision and a scale of two decimal places like the sample_amount column example above. The database knows the last two characters belong to the right of the decimal point, and the database performs any math operations against the number by first following that rule.

CREATE sequences

This section shows you how to create sequences.

Instruction Details →

If you’ve used Oracle databases for years, you’re knowledgeable about sequences and the fact that they act independently of tables that they may support. Until the Oracle Database 12c release, all you had to work with were separate sequences and .nextval and .currval pseudocolumns, unlike other databases that support identity columns. Oracle Database 12c now includes identity columns.

This section covers both the old and new approaches. Discussion of the old approach with sequences precedes that of the new identity columns because it appears that IT shops may take some time to adopt the new approach—and if history is any guide, that may be several years.

Sequences Oracle Database 11g and its predecessors don’t support automatic numbering in tables through identity columns. It provides a separate SEQUENCE data structure for use in surrogate keys. Surrogate keys are artificial numbering sequence values that uniquely define rows. They’re typically used in joins, because subsequent redefinition of a natural key doesn’t invalidate their ability to support joins across tables. The “Indexes” section later in this appendix qualifies how to use surrogate key columns with the natural key to define row uniqueness and optimize joins.

The prototype for sequences is

CREATE SEQUENCE sequence_name [INCREMENT BY increment_value]
      [MINVALUE minimum | NOMINVALUE]
      [MAXVALUE maximum | NOMAXVALUE]
      [CACHE | NOCACHE]
      [ORDER | NOORDER]

A typical sequence holds a starting number, an incrementing unit, and a buffer cycling value. The naming convention is to use a table name and append _S to it. It should be noted that you should only use a sequence for one table.

Each time you call the sequence with a sequence_name.nextval statement, the value of the sequence increases by one (or whatever value was chosen as the INCREMENT BY value). This occurs until the system consumes the last sequence value in the buffer cycle. When the last value has been read from the buffer cache, a new cycle of values is provided to the instance. The default for the cycle or sequence buffer is a set of 20 number values.

You create a SEQUENCE structure with the default values, like this:

CREATE SEQUENCE sequence_name;

Sometimes application development requires preseeding (inserting before releasing an application to your customer base) rows in tables. Such inserts are done manually without the sequence value or with a sequence starting at the default START WITH value of 1. After preseeding the data, you drop the sequence to modify the START WITH value because Oracle doesn’t provide an alternative to modifying it.

Preseeding generally inserts 10 to 100 rows of data, but after preseeding data, the START WITH value is often set at 1001. This leaves developers an additional 900 rows for additional post-implementation seeding of data. You create a sequence starting at that value like this:

CREATE SEQUENCE sequence_name START WITH 1001;

You also have the option of creating a sequence value that doesn’t leave gaps by suppressing a buffered cache of values (not recommended). The syntax for that adds the NOCACHE keyword, as follows:

CREATE SEQUENCE sequence_name START WITH 1001 NOCACHE;

Suppressing the buffered cache of sequence values has a substantial negative impact on performance, so you are advised not to do it.

You use sequences by appending (with dot notation) two pseudocolumns to the sequence name: .nextval and .currval. The .nextval pseudocolumn initializes the sequence in a session and gets the next value, which is initially the START WITH value. After accessing the .nextval pseudocolumn, you get the current value by using the .currval pseudocolumn. You receive an ORA-08002 error when attempting to access the .currval pseudocolumn before having called the .nextval pseudocolumn in a session. The error message says that you have tried to access a sequence not defined in the session, because .nextval initializes or declares the sequence in the session.

There are several ways to access sequences with the .nextval pseudocolumn. The basic starting point is querying the pseudo-table dual:

SELECT sequence_s.NEXTVAL
FROM dual;

Then, you can see the value again by querying

SELECT sequence_s.CURRVAL FROM dual;

The number will be the same, provided you did not connect to another schema and/or reconnect to a SQL*Plus session. You can also use the .nextval and .currval pseudocolumns in the VALUES clause of an INSERT statement or inside a subquery feeding an INSERT statement.

CREATE indexes

This section shows you how to create indexes.

Instruction Details →

Indexes are separate data structures that provide alternate pathways to finding data. They can and do generally speed up the processing of queries and other DML commands, like the INSERT, UPDATE, MERGE, and DELETE statements. Indexes are also called fast access paths.

The Oracle database maintains the integrity of indexes after you create them. The upside of indexes is that they can improve SQL statement performance. The downside is that they impose overhead on every INSERT, UPDATE, MERGE, and DELETE statement because the database maintains them by inserting, updating, or deleting items for each related change in the tables that the indexes support.

Indexes have two key properties—usability and visibility. Indexes are both usable and visible by default. That means they are visible to the Oracle database cost-based optimizer and usable when statements run against the tables they support.

You have the ability to make any index invisible, in which case queries and DML statements won’t use the index because they won’t see it. However, the cost-based optimizer still sees the index and maintains it with any DML statement change. That means making an index invisible isn’t quite like making the index unusable or like dropping it temporarily. An invisible index becomes overhead and thus is typically a short-term solution to run a resource-intensive statement that behaves better without the index while avoiding the cost of rebuilding it after the statement runs.

It is also possible to make an index unusable, in which case it stops collecting information and becomes obsolete and the database drops its index segment. You rebuild the index when you change it back to a usable index.

Indexes work on the principal of a key. A key is typically a set of columns or expressions on which you can build an index, but it’s possible that a key can be a single column. An index based on a set of columns is a composite, or concatenated, index.

Indexes can be unique or nonunique. You create a unique index anytime you constrain a column by assigning a primary key or unique constraint, but they’re indirect indexes. You create a direct unique index on a single column with the following syntax against two nonunique columns:

CREATE INDEX common_lookup_nuidx
ON common_lookup (common_lookup_table);

You could convert this to a nonunique index on two columns by using this syntax:

CREATE INDEX common_lookup_nuidx
ON common_lookup (common_lookup_table, common_lookup_column);

Making the index unique is straightforward; you only need to add a unique keyword to the CREATE INDEX statement, like

CREATE UNIQUE INDEX common_lookup_nuidx
ON common_lookup ( common_lookup_table
                 , common_lookup_column
                 , common_lookup_type);

Most indexes use a B-tree (balanced tree). A B-tree is composed of three types of blocks—a root branch block for searching next-level blocks, branch blocks for searching other branch blocks, and leaf blocks that store pointers to row values. B-trees are balanced because all leaf- blocks are at the same level, which means the length of search is the same to any element in the tree. All branch blocks store the minimum key prefix required to make branching decisions through the B-tree.

There are six schemas for creating B-tree indexes and a couple of schemas for creating bitmap indexes. The B-tree schemas are described first, followed by a description of a single bitmap schema.

Written by michaelmclaughlin

August 7th, 2018 at 12:52 am

Posted in

Database Tutorial

CREATE

My Technical Blog

Tutorial Pages

Blogroll