Translating Joins
Week 10: Articles
Translating Joins
Learning Outcomes
- Learn how to discover elements of a natural key in denormalized import tables.
- Learn how to discover elements of the production table’s natural key that are not found in the import table that let you map unique rows.
Lesson Materials
This article relies on your understanding of how to map a denormalized import table to production normalized table. If you’re unsure about how to map an import tables to production tables, please read the Mapping Joins article.
The example works through how you import data into a normalized production table when you need to translate a meaningful string into a numeric surrogate key value. You translate the common_lookup_type
column value into the common_lookup_id
column value.
You may need to review the purpose of common lookup tables before working through this blog page. The common_lookup
table is a table that contains sets of tables. The following three columns define the common_lookup
table’s natural key:
common_lookup_table
common_lookup_column
common_lookup_type
Each subtable is defined by a super key, which is composed of two columns of the natural key:
common_lookup_table
common_lookup_column
Together, the two foregoing columns identify a set of related rows that define a subtable inside the common_lookup
table. The two columns let you create logically independent tables in a single table structure.
Inside each subtable the common_lookup_type
column becomes the natural key that defines unique rows. The common_lookup_code
column provides an optional display value and the common_lookup_meaning
provides a case sensitive text description. These two descriptive columns are non-key columns.
Unfortunately, the common lookup design pattern has some limitations. It doesn’t guarantee that the values in the common_lookup_type
column are unique in the scope of the common_lookup
table. That limitation means that you may encounter a problem when you try to map an external import table’s column value to the common_lookup_type
column. The problem occurs because the value in the import table may match two or more rows in the common_lookup
table.
The common lookup table pattern requires that you translate rather than map common_lookup_type
column values. You can map column values when they’re unique within the scope of a table but not when they’re non-unique. You need to translate a non-unique join between two tables. Translating a join is like triangulating a location.
Let’s say somebody gave you directions to a buried treasure chest. The directions were to find a tree and boulder on a knoll ten miles from a distant city, and then dig when you are eight paces from the tree and five paces from the bolder. You need to find the city, then the knoll with a tree and bolder, then the length of a pace, and finally the place where the distance of five paces from the bolder and the distance of eight paces from the tree intersect.
More or less, the last step is triangulating a position. One line connects the tree and boulder, a second line connects the boulder and the intersection of the paces, and a third line connects the tree and the intersection of the paces. The three lines become the lines of a triangle, which is why its called triangulation.
You triangulate a table by translating the relationship from the common_lookup table to the appropriate subtable inside the common_lookup
table. Below is a color illustration of the various subtables inside the common_lookup
table.
NATURAL KEY | |||
---|---|---|---|
SUPER KEY (Subtable Identifier) | SUBTABLE NATURAL KEY |
||
COMMON_LOOKUP_ID | COMMON_LOOKUP_TABLE | COMMON_LOOKUP_COLUMN | COMMON_LOOKUP_TYPE |
1 | SYSTEM_USER | SYSTEM_USER_TYPE | SYSTEM_ADMIN |
2 | SYSTEM_USER | SYSTEM_USER_TYPE | DBA |
3 | SYSTEM_USER | SYSTEM_USER_TYPE | SYSTEM_GROUP |
4 | SYSTEM_USER | SYSTEM_USER_TYPE | COST_CENTER |
5 | SYSTEM_USER | SYSTEM_USER_TYPE | INDIVIDUAL |
1001 | CONTACT | CONTACT_TYPE | EMPLOYEE |
1002 | CONTACT | CONTACT_TYPE | CUSTOMER |
1003 | MEMBER | MEMBER_TYPE | INDIVIDUAL |
1004 | MEMBER | MEMBER_TYPE | GROUP |
1005 | MEMBER | MEMBER_TYPE | DISCOVER_CARD |
1006 | MEMBER | MEMBER_TYPE | MASTER_CARD |
1007 | MEMBER | MEMBER_TYPE | VISA_CARD |
1008 | ADDRESS | ADDRESS_TYPE | HOME |
1009 | ADDRESS | ADDRESS_TYPE | WORK |
1010 | ITEM | ITEM_TYPE | DVD_FULL_SCREEN |
1011 | ITEM | ITEM_TYPE | DVD_WIDE_SCREEN |
1012 | ITEM | ITEM_TYPE | NINTENDO_GAMECUBE |
1013 | ITEM | ITEM_TYPE | PLAYSTATION2 |
1014 | ITEM | ITEM_TYPE | XBOX |
1015 | ITEM | ITEM_TYPE | BLU-RAY |
1016 | TELEPHONE | TELEPHONE_TYPE | HOME |
1017 | TELEPHONE | TELEPHONE_TYPE | WORK |
1018 | PRICE | ACTIVE_FLAG | YES |
1019 | PRICE | ACTIVE_FLAG | NO |
1020 | PRICE | PRICE_TYPE | 1-DAY RENTAL |
1021 | PRICE | PRICE_TYPE | 3-DAY RENTAL |
1022 | PRICE | PRICE_TYPE | 5-DAY RENTAL |
1023 | RENTAL_ITEM | RENTAL_ITEM_TYPE | 1-DAY RENTAL |
1024 | RENTAL_ITEM | RENTAL_ITEM_TYPE | 3-DAY RENTAL |
1025 | RENTAL_ITEM | RENTAL_ITEM_TYPE | 5-DAY RENTAL |
1026 | TRANSACTION | TRANSACTION_TYPE | DEBIT |
1027 | TRANSACTION | TRANSACTION_TYPE | CREDIT |
1028 | TRANSACTION | PAYMENT_METHOD_TYPE | DISCOVER_CARD |
1029 | TRANSACTION | PAYMENT_METHOD_TYPE | VISA_CARD |
1030 | TRANSACTION | PAYMENT_METHOD_TYPE | MASTER_CARD |
1031 | TRANSACTION | PAYMENT_METHOD_TYPE | CASH |
Rows 1023 through 1025 define a subtable for the rental_item_type
column of the rental_item
subtable. Let’s say an import table contained a rental_item_type
value of:
1-DAY RENTAL
3-DAY RENTAL
5-DAY RENTAL
Any of those values are not unique within the common_lookup
table because they also occur in rows 1020 through 1022 for the price_item_type
of the price
subtable.
That means you have to translate, triangulate, or filter the query to find only the row in the rental_item
subtable. Assuming a transaction_upload
import table, you would combine literal string values for the super key values in the common_lookup_table
and common_lookup_column
columns and a join between the transaction_upload
and common_lookup
tables to select the right row. The following join example assumes you’ve assigned cl1
as the table alias for the common_lookup
table and tu
as the table alias for the transaction_upload
external import table:
ON cl1.common_lookup_table = 'RENTAL_ITEM' AND cl1.common_lookup_column = 'RENTAL_ITEM_TYPE' AND cl1.common_lookup_type = tu.rental_item_type |
The example ensures that you can find the right row in a join and then it returns the common_lookup_id
value that should be the correct foreign key column value in the rental_item
table. It would be nice if you could say this only happens in a common lookup design pattern but that’s untrue. The same problem occurs whenever any table is implemented in first normal form (1NF), which is the technical level of normalization for a common_lookup
table.
If you wonder why it is in 1NF, then you should revisit the Week #3 discussion on acheiving second normal form (2NF). The short version is that the common_lookup_type
column has a partial dependency on the common_lookup_table
and common_lookup_column
super key. That super key is the only part of the common_lookup
table’s natural key. The other common_lookup_type
column of the natural key has a partial dependency on the other two columns of the natural key.