Database Tutorial

Database History

Databases evolved from collections of files. Each file contained something unique, and as a rule that unique thing was a single subject or fact. Unfortunately, the cost of managing file systems led to the development of a hierarchical model for storing information. A database built on the hierarchical model is called a hierarchical database, and it works very much like an XML structure.

A hierarchical database typically starts with a single root node. The root node typically has one or more dependent children, which are child nodes. Each child may have one or more siblings, or sibling nodes. A child node may also have children of its own. They are grandchild nodes from the perspective of the root node and child nodes from the perspective of the intermediary parent node. There is no limit to the depth of a node tree, and nodes without dependents (child nodes) are leaf nodes. A hierarchical model is more or less an inverted tree: the root node is at the top instead of in the ground below the tree, and the leaf nodes are at the bottom instead of reaching for the sky.
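The vocabulary above (root, child, sibling, leaf, depth) can be sketched as a small Python class. This is only an illustration of the tree shape, not how any particular hierarchical DBMS stores data, and the node names (company, sales, engineering, orders) are hypothetical.

```python
class Node:
    """One node in an inverted tree; the root has no parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        # Create a dependent node directly below this one.
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def is_leaf(self):
        # Leaf nodes are the nodes without child nodes.
        return not self.children

    def siblings(self):
        # Siblings share the same parent; the root has none.
        if self.parent is None:
            return []
        return [n for n in self.parent.children if n is not self]

    def depth(self):
        # Count the steps from this node up to the root.
        steps, node = 0, self
        while node.parent is not None:
            steps += 1
            node = node.parent
        return steps


# Hypothetical sample tree.
root = Node("company")
sales = root.add_child("sales")
engineering = root.add_child("engineering")
orders = sales.add_child("orders")  # a grandchild of the root
```

Here orders is a leaf at depth 2, and sales and engineering are siblings under the root.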

Hierarchical databases work best when you compare data between sibling (side-by-side) nodes. The cost of comparing data between nodes rises as the nodes become farther apart. The collection of programs that manages a hierarchical database is called a Database Management System (DBMS).

The structure of an inverted node tree is the problem with hierarchical databases. A good engineer tries to organize the data into single subjects, which are the nodes. Then, a good engineer tries to group related subjects under the same parent node. Unfortunately, some queries need to gather indirectly related subjects, which may be stored under different parent nodes. This type of design makes some queries cost far more than others.
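One rough way to picture that cost is to count the hops between two nodes through their nearest shared parent. The sketch below assumes a hypothetical tree written as a child-to-parent map, and it treats the hop count as a stand-in for query cost, which is a simplification rather than a measurement from a real system.

```python
# Hypothetical tree, stored as child -> parent. "company" is the root.
PARENT = {
    "sales": "company",
    "engineering": "company",
    "orders": "sales",
    "invoices": "sales",
    "projects": "engineering",
}


def path_to_root(node):
    """Return the node followed by each of its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def hops_between(a, b):
    """Count the steps from a up to the nearest shared ancestor and down to b."""
    steps_from_b = {name: i for i, name in enumerate(path_to_root(b))}
    for steps_from_a, name in enumerate(path_to_root(a)):
        if name in steps_from_b:
            return steps_from_a + steps_from_b[name]
    raise ValueError("nodes do not share a root")


sibling_cost = hops_between("orders", "invoices")  # same parent
distant_cost = hops_between("orders", "projects")  # different parents
```

With this sample data, the sibling comparison takes 2 hops while the comparison across different parent nodes takes 4.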

Good engineers also improve their solutions as they learn from them. For example, they recognized that hierarchical databases were too inflexible to support dynamic queries. The engineering solution was elegant and simple. They created networked databases, which effectively added a map to each node in the tree. The map was called a pointer because it acted as an indirect reference to data found elsewhere in the tree. The problem with this approach was that somebody had to maintain the pointers as other developers’ queries were developed or modified.
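The maintenance problem can be sketched with a hand-kept pointer table. Everything here is hypothetical (the record paths, the pointers dictionary, and the follow and move_record helpers); it only illustrates how a pointer goes stale when nobody updates it, and it is not modeled on any real networked DBMS.

```python
# Hypothetical records, keyed by their location in the tree.
records = {
    "company/sales/orders": {"order_id": 1001},
    "company/engineering/projects": {"project_id": 7},
}

# A hand-maintained pointer from one node to data elsewhere in the tree.
pointers = {"company/sales/orders": "company/engineering/projects"}


def follow(path):
    """Return the record a node points to, or None if the pointer is missing or stale."""
    target = pointers.get(path)
    if target is None:
        return None
    return records.get(target)


def move_record(old_path, new_path):
    """Relocate a record. Pointers are deliberately NOT updated here."""
    records[new_path] = records.pop(old_path)


before_move = follow("company/sales/orders")
move_record("company/engineering/projects", "company/research/projects")
after_move = follow("company/sales/orders")  # pointer still names the old location
```

Before the move the pointer resolves in one step; after it, the pointer leads nowhere until someone repairs it by hand.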

The cost of maintaining all the pointers in networked databases was too high. The engineers created relational databases to automate the maintenance of pointers. They put the pointers into another set of tables, which are called indexes. The collection of definitions about the tables or nodes and their search-optimizing pointers is stored in a special subordinate database. The subordinate database is called a database catalog or data dictionary. The data that describes all the other data is metadata. Metadata is simply data about data.
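As a small illustration of a catalog holding metadata, the sketch below uses SQLite through Python's standard sqlite3 module. SQLite is an assumption made for convenience; its catalog table is named sqlite_master, and other relational databases expose their catalogs under different names. The customer table and its index are hypothetical.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# Ordinary data definitions: one table and one index on it.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, last_name TEXT)"
)
conn.execute("CREATE INDEX customer_last_name_ix ON customer (last_name)")

# The catalog is itself queryable data that describes the other data.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY type, name"
).fetchall()

for entry in catalog:
    print(entry)
```

The database recorded both definitions in its catalog without any extra bookkeeping by the developer.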

A relational database management system (RDBMS) is a collection of programs that manages databases. An RDBMS discovers and maintains the ordinary data and the data that describes the data. The data that describes the data exists in a database catalog or data dictionary.

Structured Query Language (SQL) limits your access to these self-managing catalogs or dictionaries. SQL is actually a shortened form of IBM’s original SEQUEL (Structured English Query Language), a name that conflicted with an English trademark, although some think SQL simply doesn’t stand for anything.
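A minimal sketch of that limit, again assuming SQLite through Python's sqlite3 module: the catalog changes when you issue SQL statements such as CREATE and DROP, while a direct write to the catalog table is refused under SQLite's default settings. The item table and the catalog_names helper are hypothetical, and other database systems enforce this in their own ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def catalog_names(connection):
    """List the object names currently recorded in SQLite's catalog."""
    rows = connection.execute("SELECT name FROM sqlite_master ORDER BY name")
    return [name for (name,) in rows]


# SQL data definition statements update the catalog for you.
conn.execute("CREATE TABLE item (item_id INTEGER PRIMARY KEY, title TEXT)")
after_create = catalog_names(conn)

# Writing to the catalog directly is rejected by default.
try:
    conn.execute("DELETE FROM sqlite_master WHERE name = 'item'")
    direct_write_allowed = True
except sqlite3.OperationalError:
    direct_write_allowed = False

conn.execute("DROP TABLE item")
after_drop = catalog_names(conn)
```

The table appears in the catalog after CREATE TABLE and disappears after DROP TABLE, and the attempt to delete the catalog row by hand fails.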

The people who use a database are varied. You have programmers, analysts, engineers, database administrators (DBAs), and users. You can divide the users into many groups, but the seminal work of Alvin Toffler, PowerShift: Knowledge, Wealth, and Violence at the Edge of the 21st Century, labels these users as knowledge workers. If you want an updated, lengthier tome, please check The World Is Flat by Thomas Friedman.

Written by michaelmclaughlin

August 2nd, 2018 at 12:10 am
