Database Tutorial

Course Tutorial Site

Site Admin

Hierarchical Queries

with 8 comments

You can use hierarchical queries to step through tree-like data stored in self-referencing tables, like this one for organizations and organization structures. The ORG_STRUCTURE table contains two foreign keys (ORG_PARENT_ID and ORG_CHILD_ID). They both reference the ORGANIZATION_ID column. The top-most, or root node, contains a zero as an ORG_PARENT_ID because organizations start numbering at 1. You could also define the root node ORG_PARENT_ID as a null value. Bottom nodes, or leaf nodes, are those organizations that appear as ORG_CHILD_ID but not as ORG_PARENT_ID.

Flexible Organization Hierarchy

Flexible Organization Hierarchy

You’ll find the seeding scripts at the end of this web page. If you want to perform a hierarchical query in T-SQL, go to this blog page.

This page covers four hierarchical query techniques. It shows you how to navigate a complete tree from the top down, and order by siblings. It shows you how to navigate from a leaf, or any intermediary, node up a tree to the root node, and how to limit the depth of traversal. It also shows you how to find all leaf nodes, and use the result to navigate several trees concurrently. Lastly, it shows you how to use NOCYCLE to identify rows that cause an ORA-01436 error, which means the tree linking relationship is broken.

A quick caveat, the START WITH clause is technically optional. If you exclude it all nodes are root nodes and the default depth is only two. So, really it’s not optional for meaningful work.

Down the Tree

This query allows you to navigate down the tree from the root node. You navigate down the tree when the PRIOR keyword precedes the child or dependent node. The SQL*Plus formatting commands generate the output shown.

COL org_id FORMAT A20
COL org_name FORMAT A20
 
SELECT   LPAD(' ', 2*(LEVEL - 1)) || os.org_child_id AS org_id
,        o.organization_name org_name
FROM     organization o
,        org_structure os
WHERE    o.organization_id = os.org_child_id
START
WITH     os.org_parent_id = 0
CONNECT
BY PRIOR os.org_child_id = os.org_parent_id;

It produces the following output:

ORG_ID               ORG_NAME
-------------------- --------------------
1                    One
  2                  Two
    5                Five
      11             Eleven
      12             Twelve
    6                Six
      13             Thirteen
      14             Fourteen
        20           Twenty
  3                  Three
    7                Seven
      15             Fifteen
    8                Eight
      16             Sixteen
      17             Seventeen
  4                  Four
    9                Nine
      18             Eighteen
      19             Nineteen
    10               Ten

If there were an offending row in the table that didn’t have a connecting parent and child set of foreign keys, you’d raise an ORA-01436 error. It means you can’t CONNECT BY PRIOR because a row’s values are non-conforming. You can enter a row to break the hierarchy provided the ORG_PARENT_ID and ORG_CHILD_ID have the same value. Then, this modified query will identify the offending row for you:

SELECT        LPAD(' ', 2*(LEVEL - 1)) || os.org_child_id AS org_id
,             o.organization_name org_name
FROM          organization o
,             org_structure os
WHERE         o.organization_id = os.org_child_id
START WITH    os.org_parent_id = 0
CONNECT BY
NOCYCLE PRIOR os.org_child_id = os.org_parent_id;

The next query changes the output because it orders by siblings. This means that the numeric ordering of the parent and child nodes is overridden.

SELECT   LPAD(' ', 2*(LEVEL - 1)) || os.org_child_id AS org_id
,        o.organization_name org_name
FROM     organization o
,        org_structure os
WHERE    o.organization_id = os.org_child_id
START
WITH     os.org_parent_id = 0
CONNECT
BY PRIOR os.org_child_id = os.org_parent_id
ORDER
SIBLINGS
BY       o.organization_name;

The sort now represents a tree in alphabetical ordering:

ORG_ID               ORG_NAME
-------------------- --------------------
1                    One
  4                  Four
    9                Nine
      18             Eighteen
      19             Nineteen
    10               Ten
  3                  Three
    8                Eight
      17             Seventeen
      16             Sixteen
    7                Seven
      15             Fifteen
  2                  Two
    5                Five
      11             Eleven
      12             Twelve
    6                Six
      14             Fourteen
        20           Twenty
      13             Thirteen

Up the Tree

The query switches the position of the PRIOR keyword. It now precedes the parent node. This means that it will go up the tree. The START WITH in this case starts with an intermediary node.

SELECT  LPAD(' ', 2*(LEVEL - 1)) || os.org_child_id AS org_id
,       o.organization_name org_name
FROM    organization o
,       org_structure os
WHERE   o.organization_id = os.org_child_id
START
WITH    os.org_child_id = 6
CONNECT
BY      os.org_child_id = PRIOR os.org_parent_id;

The output is:

ORG_ID               ORG_NAME
-------------------- --------------------
6                    Six
  2                  Two
    1                One

Restricting the Depth of Search

This is another up the tree hierarchical query. It starts from the bottom most leaf node, which is at depth five from the root node. The query restricts upward navigation to three levels or the two immediate parents (or if you prefer parent and grandparent).

SELECT  LPAD(' ', 2*(LEVEL - 1)) || os.org_child_id AS org_id
,       o.organization_name org_name
FROM    organization o
,       org_structure os
WHERE   o.organization_id = os.org_child_id
AND     LEVEL <= 3
START
WITH    os.org_child_id = 20
CONNECT
BY      os.org_child_id = PRIOR os.org_parent_id;

The output is:

ORG_ID               ORG_NAME
-------------------- --------------------
20                   Twenty
  14                 Fourteen
    6                Six

Leaf Node Up

Traversing from a leaf node can start by inspecting which are leaf nodes first, and hard coding a value in the START WITH clause. A better approach is to use a subquery to identify leaf nodes and then a filter in the outer WHERE clause to limit the leaf nodes. The leaf nodes are plural in both of these solutions.

This solution works prior to Oracle 10g, and continues to work through Oracle 11g:

SELECT  LPAD(' ', 2*(LEVEL - 1)) || os.org_child_id AS org_id
,       o.organization_name org_name
FROM    organization o
,       org_structure os
WHERE   o.organization_id = os.org_child_id
START
WITH    os.org_child_id IN (SELECT os1.org_child_id
FROM    org_structure os1 LEFT JOIN org_structure os2
ON      os2.org_parent_id = os1.org_child_id
MINUS
SELECT  os1.org_child_id
FROM    org_structure os1 JOIN org_structure os2
ON      os2.org_parent_id = os1.org_child_id)
CONNECT
BY      os.org_child_id = PRIOR os.org_parent_id;

This solution uses the CONNECT_BY_LEAF pseudocolumn (introduced in Oracle 10g) to replace the outer join minus the join in the subquery. I tested it in Oracle 10gR2 and Oracle 11gR1 (The terminal release is near, isn’t it?). It also uses the ORDER SIBLINGS BY clause to order the tree by the alphabetical ORGANIZATION_NAME column values.

SELECT  LPAD(' ', 2*(LEVEL - 1)) || os.org_child_id AS org_id
,       o.organization_name org_name
FROM    organization o
,       org_structure os
WHERE   o.organization_id = os.org_child_id
START
WITH    os.org_child_id IN (SELECT   org_child_id
FROM     org_structure
WHERE    CONNECT_BY_ISLEAF = 1
START
WITH     org_child_id = 1
CONNECT
BY PRIOR org_child_id = org_parent_id)
CONNECT
BY      os.org_child_id = PRIOR os.org_parent_id
ORDER
SIBLINGS
BY      o.organization_name;

The snapshot of output for the latter is:

ORG_ID               ORG_NAME
-------------------- --------------------
18                   Eighteen
  9                  Nine
    4                Four
      1              One
11                   Eleven
  5                  Five
    2                Two
      1              One
15                   Fifteen
  7                  Seven
    3                Three
      1              One
19                   Nineteen
  9                  Nine
    4                Four
      1              One
17                   Seventeen
  8                  Eight
    3                Three
      1              One
16                   Sixteen
  8                  Eight
    3                Three
      1              One
10                   Ten
  4                  Four
    1                One
13                   Thirteen
  6                  Six
    2                Two
      1              One
12                   Twelve
  5                  Five
    2                Two
      1              One
20                   Twenty
  14                 Fourteen
    6                Six
      2              Two
        1            One

I did leave out the CONNECT_BY_ROOT and SYS_CONNECT_BY_PATH. If you’ve got the urge, post me a comment with an example. Otherwise, I’ll try to update this page later.

The setup code for the test sample:

BEGIN
  FOR i IN (SELECT NULL
            FROM user_tables
            WHERE table_name = 'ORGANIZATION') LOOP
    EXECUTE IMMEDIATE 'DROP TABLE organization CASCADE CONSTRAINTS';
  END LOOP;
  FOR i IN (SELECT NULL
            FROM user_tables
            WHERE table_name = 'ORG_STRUCTURE') LOOP
    EXECUTE IMMEDIATE 'DROP TABLE org_structure CASCADE CONSTRAINTS';
  END LOOP;
  FOR i IN (SELECT NULL
            FROM user_sequences
            WHERE sequence_name = 'ORGANIZATION_S1') LOOP
    EXECUTE IMMEDIATE 'DROP SEQUENCE organization_s1';
  END LOOP;
  FOR i IN (SELECT NULL
            FROM user_sequences
            WHERE sequence_name = 'ORG_STRUCTURE_S1') LOOP
    EXECUTE IMMEDIATE 'DROP SEQUENCE org_structure_s1';
  END LOOP;
END;
/
 
CREATE TABLE ORGANIZATION
( organization_id NUMBER
, organization_name VARCHAR2(10));
 
CREATE SEQUENCE ORGANIZATION_S1;
 
DECLARE
  TYPE list IS TABLE OF VARCHAR2(10);
  ordinal LIST := list ('One','Two','Three','Four'
                       ,'Five','Six','Seven','Eight'
                       ,'Nine','Ten','Eleven','Twelve'
                       ,'Thirteen','Fourteen','Fifteen','Sixteen'
                       ,'Seventeen','Eighteen','Nineteen','Twenty');
  BEGIN
    FOR i IN 1..ordinal.COUNT LOOP
      INSERT INTO ORGANIZATION VALUES (organization_s1.NEXTVAL,ordinal(i));
  END LOOP;
  COMMIT;
END;
/
 
CREATE TABLE org_structure
( org_structure_id NUMBER
, org_parent_id NUMBER
, org_child_id NUMBER );
 
CREATE SEQUENCE org_structure_s1;
 
INSERT INTO org_structure VALUES (1,0,1);
INSERT INTO org_structure VALUES (1,1,2);
INSERT INTO org_structure VALUES (1,1,3);
INSERT INTO org_structure VALUES (1,1,4);
INSERT INTO org_structure VALUES (1,2,5);
INSERT INTO org_structure VALUES (1,2,6);
INSERT INTO org_structure VALUES (1,3,7);
INSERT INTO org_structure VALUES (1,3,8);
INSERT INTO org_structure VALUES (1,4,9);
INSERT INTO org_structure VALUES (1,4,10);
INSERT INTO org_structure VALUES (1,5,11);
INSERT INTO org_structure VALUES (1,5,12);
INSERT INTO org_structure VALUES (1,6,13);
INSERT INTO org_structure VALUES (1,6,14);
INSERT INTO org_structure VALUES (1,7,15);
INSERT INTO org_structure VALUES (1,8,16);
INSERT INTO org_structure VALUES (1,8,17);
INSERT INTO org_structure VALUES (1,9,18);
INSERT INTO org_structure VALUES (1,9,19);
INSERT INTO org_structure VALUES (1,14,20);
 
COMMIT;

Written by michaelmclaughlin

October 8th, 2009 at 9:33 pm

Posted in

8 Responses to 'Hierarchical Queries'

Subscribe to comments with RSS or TrackBack to 'Hierarchical Queries'.

  1. What can a query like this be used for… I understand what your doing… But I don’t know if I see the application in what they could accomplish.

    Nick Oberndorfer

    16 Dec 09 at 3:25 pm

  2. Hierarchical queries let you use rows in a database to create self-referencing trees. Their joins are independent of fixed relationships to external tables, which may require multiple changes to queries when inserting a table in-between to levels in a tree. They provide more flexibility to your database design.

    michaelmclaughlin

    29 Dec 09 at 4:32 pm

  3. […] display the execution plan by […]

  4. Hi Michael,

    I work with Oracle9i Enterprise Edition Release 9.2.0.8.0 .
    My task is to change an Hierarchical query results according to user input. Here is an example of
    the original query applied on Scott’s EMP table, which has a relation to itself by empno and mgr,
    as we all know.

    My query:
    ———–

    SELECT level,lpad('*',(level-1)*2,'*')||ename ename FROM emp
    START WITH mgr IS NULL
    CONNECT BY prior empno = mgr
    AND rownum=1 -- I'd like it to fetch only this specific record, which should appear in any case
    UNION ALL
    SELECT level,lpad('*',(level-1)*2,'*')||ename ename 
    FROM emp e1
    WHERE (job = '&amp;job' OR '&amp;job' IS NULL)
      AND (e1.deptno = '&amp;dept_no' OR '&amp;dept_no' IS NULL)
      AND (e1.ename = '&amp;name' OR '&amp;name' IS NULL)
      AND mgr IS NOT NULL
    START WITH mgr IS NULL 
    CONNECT BY prior empno = mgr;

    Results:
    ———-

    LEVEL    ENAME
    =====   ==============
    1    KING
    2    **JONES
    3    ****SCOTT
    4    ******ADAMS
    3    ****FORD
    4    ******SMITH
    2    **BLAKE
    3    ****ALLEN
    3    ****WARD
    3    ****MARTIN
    3    ****TURNER
    3    ****JAMES
    2    **CLARK
    3    ****MILLER

    This is all fine, in case the users have not passed parameters.
    The problem begins when they do pass parameters.
    What I expect from the query is to show the both selected records and their successors’ if they have
    any.
    For example, if I run this query, and pass “BLAKE” as a value for the parametr named “NAME”,
    currently my query returns KING as root and as the successor BLAKE, whereas I expect it to fetch
    both Blake and his subordinate employees as in the following result set :

    LEVEL ENAME
    ====== ===========

    1       KING
    2       **BLAKE
    3    ****ALLEN
    3    ****WARD
    3    ****MARTIN
    3    ****TURNER

    Thanks in Advance!
    Shimon B.

    Shimon Batashvili

    17 Mar 14 at 10:25 am

  5. Try removing the first query and UNION ALL set operator? The mgr value will never be null for BLAKE according to your data.

    SELECT level,lpad('*',(level-1)*2,'*')||ename ename FROM emp
    START WITH mgr IS NULL
    CONNECT BY prior empno = mgr
    AND rownum=1

    Alternatively, you can use an object table function, like the one shown here. Hope that helps.

    michaelmclaughlin

    18 Mar 14 at 10:56 am

  6. Hi Michael,
    Thanks alot for your response, I sure hope your suggestion will suite my requirements, and I’m going to tell you if it does. My only doubt is whether this solution will apply with my Oracle version (9i – as mentioned above).
    I can assure you I will try and let you know.

    Thanks again,
    Shimon B.

    Shimon Batashvili

    20 Mar 14 at 3:57 am

  7. Hi Michael,
    As promised, I’d like to share my expreience with you.
    First of all, you’ve suggested to remove the first elemnt of the query, including the “UNION ALL” clause. I’ve done it, but it made no difference; the only reason of using the “union all” was, as noted in my original comment, to show the root node no ,matter what the where conditions are. Ommiting the elemnt in question does not realy help us. The outcome of running the query in this case would be as follows (for passing only one query criteria – ‘&job’ = ‘CLERK’):
    LEVEL ENAME
    ——— ————————
    4 ******ADAMS
    4 ******SMITH
    3 ****JAMES
    3 ****MILLER
    while i’d like it always to begin with level 1, and also show the connect path of the values fetched.
    I another words there is, in some cases, a contradiction between the “connect by ” clause and conditions in the where clause.
    This is why unfotunatly i cannot come to any reasonable solution.
    There is also the limitation of the Oracle version I work with – 9i – wich does not allow using subqueries in the connect by clause (I get the error : ORA-01473: cannot have subqueries in CONNECT BY clause).

    Anyway I’m not going to stop here, but to keep trying to find a solution.

    Best Regards,

    Shimon B.

    Shimon Batashvili

    23 Mar 14 at 4:12 am

  8. Shimon,
    I’ll make it the subject of my Dell Maven article for June. I’ll let you know when it’s available.
    Michael

    michaelmclaughlin

    27 May 14 at 12:30 pm

Leave a Reply