Lab #8 : Transformations
This lab works on developing the students understanding of advanced aggregation concepts and data transformations. It involves selective aggregation concepts, and leverages the data import skills developed in Lab #7.
Objectives ↓
These are the lab objectives. They provide the description of the database tables. You have the table names, column names, and data types for each table. You also have the designation for external files and/or *.csv file locations.
Group Objectives
Group assignments are cooperative situations where you should learn from one another. Each team member should engage and type solution components individually. This tends to maximize the student learning opportunity. You should avoid the “expert and followers” paradigm, and become cooperative and interactive problem solvers. The “expert and followers” paradigm is where one person in a group does the work and learns, while others look over the shoulder at the work without truly understanding it. Team members become interactive problem solvers when they collaborate and work together on solving problems.
If something comes up, like an illness or excused absence, you may share files. The caution here is that the student receiving the work must master the concepts in the material before submitting the work to the instructor. The instructor may interview the student to determine their level of mastery and post group credit that may differ from other team members who performed the work.
- Learn how to translate textual descriptions into internal and external tables.
- Learn how to import data.
- Learn how to use advanced aggregation, including functions in the
GROUP BYclause. - Learn how to use the
CASEstatement in conjunction with aggregatoin functions to produce selective aggregation results. - Learn how to work with hierarchical queries in an Oracle database.
Individual Objectives
Individual assignments are situations where you apply the generalized concepts acquired in the group exercise, and transfer your skills to another database engine. You may ask questions but you should only receive general concepts, principles, or directions to published materials (such as the product documentation found in the resources). Likewise, those are the types of answers you should provide when asked questions by your peers. You should not share files in this part of the lab.
- Learn how to translate textual descriptions into internal and external tables.
- Learn how to import data.
- Learn how to use advanced aggregation, including functions in the
GROUP BYclause. - Learn how to use the
CASEstatement in conjunction with aggregatoin functions to produce selective aggregation results.
Resources ↓
These are the lab resources.
Deliverables ↓
The table definitions are in the lab deliverables.
- [3 points] Create the following
CALENDARtable as per the specification.
Implementation Note: The CALENDAR table has a range relationship to the TRANSACTION table created in the prior lab.
Business Rule: The CALENDAR table provides end points to bracket transaction dates into calendar months.
System Logic: In future refinement this table may become multilingual and support regional languages based on HTML request header information.
| Table Name: CALENDAR | |||||
|---|---|---|---|---|---|
| Column Name | Constraint | Data Type |
Physical Size |
||
| Type | Reference Table | Reference Column | |||
| CALENDAR_ID | PRIMARY KEY | Integer | Maximum | ||
| CALENDAR_NAME | NOT NULL | String | 10 | ||
| CALENDAR_SHORT_NAME | NOT NULL | String | 3 | ||
| START_DATE | NOT NULL | Date | Date | ||
| END_DATE | NOT NULL | Date | Date | ||
| CREATED_BY | FOREIGN KEY | SYSTEM_USER | SYSTEM_USER_ID | Integer | Maximum |
| CREATION_DATE | NOT NULL | DATE | Date | ||
| LAST_UPDATED_BY | FOREIGN KEY | SYSTEM_USER | SYSTEM_USER_ID | Integer | Maximum |
| LAST_UPDATE_DATE | NOT NULL | DATE | Date | ||
- [3 points] Seed the
CALENDARtable with the following data.
| Data Entry to Table: CALENDAR | |||
|---|---|---|---|
| Calendar Month Name |
Calendar Short Month Name |
Start Date |
End Date |
| January | JAN | 01-JAN-2009 | 31-JAN-2009 |
| February | FEB | 01-FEB-2009 | 28-FEB-2009 |
| March | MAR | 01-MAR-2009 | 31-MAR-2009 |
| April | APR | 01-APR-2009 | 30-APR-2009 |
| May | MAY | 01-MAY-2009 | 31-MAY-2009 |
| June | JUN | 01-JUN-2009 | 30-JUN-2009 |
| July | JUL | 01-JUL-2009 | 31-JUL-2009 |
| August | AUG | 01-AUG-2009 | 31-AUG-2009 |
| September | SEP | 01-SEP-2009 | 30-SEP-2009 |
| October | OCT | 01-OCT-2009 | 31-OCT-2009 |
| November | NOV | 01-NOV-2009 | 30-NOV-2009 |
| December | DEC | 01-DEC-2009 | 31-DEC-2009 |
- [4 points] Import and merge the new
*.csvfiles.
Implementation Note: The first *.csv file contained incorrect values. This one reverses a number of rentals and adds new ones. Download this zip file, and use the transaction_upload2.csv for Oracle and transaction_upload2_mysql.csv for MySQL. A merge is not necessary. You may accomplish the addition by using an INSERT statement. You need to ignore the TRANSACTION_ID from the TRANSACTION_REVERSAL table but use the other fields.
A cautionary note: If you open the MySQL upload file in Excel, it will change the native date formats and make them incompatible with an upload to MySQL. You’ll need to manually change them to make it work. You can do that by choosing the other date formatting option.
System Logic: Create the TRANSACTION_REVERSAL table as qualified below by leveraging the existing structure of the TRANSACTION table. Oracle provides the DBMS_METADATA.GET_DDL to create a starting point for an external table definition, and you can simply select the structure in MySQL (see the hint).
| Table Name: TRANSACTION_REVERSAL | |||||
|---|---|---|---|---|---|
| Column Name | Constraint | Data Type |
Physical Size |
||
| Type | Reference Table | Reference Column | |||
| TRANSACTION_ID | PRIMARY KEY | Integer | Maximum | ||
| TRANSACTION_ACCOUNT | String | 15 | |||
| TRANSACTION_TYPE | Integer | Maximum | |||
| TRANSACTION_DATE | Date | Date | |||
| TRANSACTION_AMOUNT | Float | Maximum | |||
| RENTAL_ID | Integer | Maximum | |||
| PAYMENT_METHOD_TYPE | Integer | Maximum | |||
| PAYMENT_ACCOUNT_NUMBER | String | 19 | |||
| CREATED_BY | Integer | Maximum | |||
| CREATION_DATE | DATE | Date | |||
| LAST_UPDATED_BY | Integer | Maximum | |||
| LAST_UPDATE_DATE | DATE | Date | |||
Hints for creating the TRANSACTION_REVERSAL table. ↓
This provides a couple tricks that save some typing.
Oracle Trick and Technique
The following shows you how to get key definition syntax from the Oracle data dictionary. You can find the column definitions by getting the syntax for the TRANSACTION table, and the external table components from the definition of the TRANSACTION_UPLOAD table.
1 2 3 4 5 6 7 8 9 10 11 12 | -- Set environment variables. SET LONG 100000 SET PAGESIZE 0 -- Set a local variable of a character large object (CLOB). VARIABLE ddl_text CLOB -- Get the internal DDL command for the TRANSACTION table from the data dictionary. SELECT dbms_metadata.get_ddl('TABLE','TRANSACTION') FROM dual; -- Get the internal DDL command for the external TRANSACTION_UPLOAD table from the data dictionary. SELECT dbms_metadata.get_ddl('TABLE','TRANSACTION_UPLOAD') FROM dual; |
MySQL Trick and Technique
This allows you to create a new table based on the structure of an existing table. Just don’t forget to truncate the data after creating the table.
1 2 3 4 5 | -- Create a table with a unique name from another table. CREATE TABLE transaction_reversal AS SELECT * FROM TRANSACTION; -- Truncate the table to maintain the structure but dump the data. TRUNCATE TABLE transaction_reversal; |
You should find that you have 1,170 rows in Oracle and 1,093 rows in MySQL in the TRANSACTION_REVERSAL table. After you insert them into the TRANSACTION table, run the following queries and compare results.
Oracle
Run this query:
1 2 3 4 5 6 7 8 9 10 11 12 | COLUMN "Debit Transactions" FORMAT A20 COLUMN "Credit Transactions" FORMAT A20 COLUMN "All Transactions" FORMAT A20 -- Check current contents of the model. SELECT 'SELECT record counts' AS "Statement" FROM dual; SELECT LPAD(TO_CHAR(c1.transaction_count,'99,999'),19,' ') AS "Debit Transactions" , LPAD(TO_CHAR(c2.transaction_count,'99,999'),19,' ') AS "Credit Transactions" , LPAD(TO_CHAR(c3.transaction_count,'99,999'),19,' ') AS "All Transactions" FROM (SELECT COUNT(*) AS transaction_count FROM TRANSACTION WHERE transaction_account = '111-111-111-111') c1 CROSS JOIN (SELECT COUNT(*) AS transaction_count FROM TRANSACTION WHERE transaction_account = '222-222-222-222') c2 CROSS JOIN (SELECT COUNT(*) AS transaction_count FROM TRANSACTION) c3; |
You should get the following results.
Debit Transactions Credit Transactions All Transactions
-------------------- -------------------- --------------------
4,681 1,170 5,851MySQL
1 2 3 4 5 6 7 8 | -- Check current contents of the model. SELECT 'SELECT record counts' AS "Statement"; SELECT LPAD(FORMAT(c1.transaction_count,0),19,' ') AS "Debit Transactions" , LPAD(FORMAT(c2.transaction_count,0),19,' ') AS "Credit Transactions" , LPAD(FORMAT(c3.transaction_count,0),19,' ') AS "All Transactions" FROM (SELECT COUNT(*) AS transaction_count FROM TRANSACTION WHERE transaction_account = '111-111-111-111') c1 CROSS JOIN (SELECT COUNT(*) AS transaction_count FROM TRANSACTION WHERE transaction_account = '222-222-222-222') c2 CROSS JOIN (SELECT COUNT(*) AS transaction_count FROM TRANSACTION) c3; |
You should get the following results.
+---------------------+---------------------+---------------------+ | Debit Transactions | Credit Transactions | All Transactions | +---------------------+---------------------+---------------------+ | 4,372 | 1,093 | 5,465 | +---------------------+---------------------+---------------------+
- [10 points] Create the following transformation report by a
CROSS JOINbetween theTRANSACTIONandCALENDARtables. While it is possible to perform this task when the accounting periods are calendar months with only theTRANSACTIONtable, that coincidence isn’t always the case. Therefore, this presents an opportunity to reinforce a non-equijoin range query.
Implementation Note: This is an ad hoc (run anytime report) that shows rental income, expense, and the total (rental income – rental expense). A zero is returned when there’s no income or expense in a given calendar period (month, quarter, or year-to-date).
Business Rule: Rental income are those rows in the TRANSACTION table that have a PAYMENT_ACCOUNT_NUMBER of 111-111-111-111, and rental return income are those rows in the TRANSACTION table that have a PAYMENT_ACCOUNT_NUMBER of 222-222-222-222 .
System Logic: The CASE statement inside of a SUM function call on the TRANSACTION_DATE column allows you to return only qualifying TRANSACTION_AMOUNT column values in the aggregated sum.
The desired layout is noted below in the color formatted table.
| Annual Financial Report | |||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Transaction | Jan | Feb | Mar | FQ1 | Apr | May | Jun | FQ2 | Jul | Aug | Sep | FQ3 | Sep | Nov | Dec | FQ4 | YTD |
| Income | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| Expense | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| Total | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
Hint ↓
This shows you how to apply the aggregation syntax for a single month and year. You should also check the lookup functions for quarters and year-to-date transformation columns.
You should be able to extend this behavior to fulfill most of this steps’ requirements. However, you’ll need to replicate a number of components, label them respectively as the month, quarter, or year-to-date; and then wrap this as an inline view.
Oracle Sample
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | SELECT CASE WHEN t.transaction_account = '111-111-111-111' THEN 'Debit' WHEN t.transaction_account = '222-222-222-222' THEN 'Credit' END AS "TRANSACTION_ACCOUNT" , CASE WHEN t.transaction_account = '111-111-111-111' THEN 1 WHEN t.transaction_account = '222-222-222-222' THEN 2 END AS "SORTKEY" , LPAD(TO_CHAR (SUM(CASE WHEN EXTRACT(MONTH FROM transaction_date) = 1 AND EXTRACT(YEAR FROM transaction_date) = 2009 THEN CASE WHEN cl.common_lookup_type = 'DEBIT' THEN t.transaction_amount ELSE t.transaction_amount * -1 END END),'99,999.00'),10,' ') AS "JANUARY" FROM TRANSACTION t INNER JOIN common_lookup cl ON t.transaction_type = cl.common_lookup_id WHERE cl.common_lookup_table = 'TRANSACTION' AND cl.common_lookup_column = 'TRANSACTION_TYPE' GROUP BY CASE WHEN t.transaction_account = '111-111-111-111' THEN 'Debit' WHEN t.transaction_account = '222-222-222-222' THEN 'Credit' END , CASE WHEN t.transaction_account = '111-111-111-111' THEN 1 WHEN t.transaction_account = '222-222-222-222' THEN 2 END; |
MySQL Sample
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | SELECT CASE WHEN t.transaction_account = '111-111-111-111' THEN 'Debit' WHEN t.transaction_account = '222-222-222-222' THEN 'Credit' END AS "Account" , CASE WHEN t.transaction_account = '111-111-111-111' THEN 1 WHEN t.transaction_account = '222-222-222-222' THEN 2 END AS "Sortkey" , LPAD(FORMAT (SUM(CASE WHEN EXTRACT(MONTH FROM transaction_date) = 1 AND EXTRACT(YEAR FROM transaction_date) = 2009 THEN CASE WHEN cl.common_lookup_type = 'DEBIT' THEN t.transaction_amount ELSE t.transaction_amount * -1 END END),2),10,' ') AS "JANUARY" FROM TRANSACTION t INNER JOIN common_lookup cl ON t.transaction_type = cl.common_lookup_id WHERE cl.common_lookup_table = 'TRANSACTION' AND cl.common_lookup_column = 'TRANSACTION_TYPE' GROUP BY CASE WHEN t.transaction_account = '111-111-111-111' THEN 'Debit' WHEN t.transaction_account = '222-222-222-222' THEN 'Credit' END; |
The numbers that let you verify the accuracy of your query are:
Oracle Results
You should return the following results:
Transaction Jan Feb Mar F1Q Apr May Jun F2Q Jul Aug Sep F3Q Oct Nov Dec F4Q YTD --------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ------------ Debit 2,671.20 4,270.74 5,371.02 12,312.96 4,932.18 2,216.46 1,208.40 8,357.04 2,404.08 2,241.90 2,197.38 6,843.36 3,275.40 3,125.94 2,340.48 8,741.82 36,255.18 Credit -690.06 -1,055.76 -1,405.56 -3,151.38 -1,192.50 -553.32 -298.92 -2,044.74 -604.20 -553.32 -581.94 -1,739.46 -874.50 -833.16 -601.02 -2,308.68 -9,244.26 Total 1,981.14 3,214.98 3,965.46 9,161.58 3,739.68 1,663.14 909.48 6,312.30 1,799.88 1,688.58 1,615.44 5,103.90 2,400.90 2,292.78 1,739.46 6,433.14 27,010.92
MySQL Results
You should return the following results:
+-------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+--------------+ | Transaction | Jan | Feb | Mar | F1Q | Apr | May | Jun | F2Q | Jul | Aug | Sep | F3Q | Oct | Nov | Dec | F4Q | YTD | +-------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+--------------+ | Debit | 2,957.40 | 4,022.70 | 5,654.04 | 12,634.14 | 4,595.10 | 2,219.64 | 1,300.62 | 8,115.36 | 2,413.62 | 2,149.68 | 2,162.40 | 6,725.70 | 3,291.30 | 3,246.78 | 2,299.14 | 8,837.22 | 36,312.42 | | Credit | -750.48 | -992.16 | -1,437.36 | -3,180.00 | -1,217.94 | -546.96 | -302.10 | -2,067.00 | -597.84 | -537.42 | -604.20 | -1,739.46 | -829.98 | -829.98 | -594.66 | -2,254.62 | -9,241.08 | | Total | 2,206.92 | 3,030.54 | 4,216.68 | 9,454.14 | 3,377.16 | 1,672.68 | 998.52 | 6,048.36 | 1,815.78 | 1,612.26 | 1,558.20 | 4,986.24 | 2,461.32 | 2,416.80 | 1,704.48 | 6,582.60 | 27,071.34 | +-------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+--------------+
Group Deliverables
Groups should demonstrate to the instructor or lab assistant during class time that they have successfully completed the lab.
- Every team member should be capable of explaining the idea of a
CROSS JOIN. - Every team member should be capable of explaining the idea of what happens when you combine the
SUMandCASEfunctions in a query like this lab’s ad hoc report. - Every team member should be capable of explaining how null values are treated in selective aggregation.
Individual Deliverables
Individuals should demonstrate to the instructor or lab assistant during class time time that they have successfully completed the lab.
- Every team member should be capable of explaining the idea of a
CROSS JOIN. - Every team member should be capable of explaining the idea of what happens when you combine the
SUMandCASEfunctions in a query like this lab’s ad hoc report. - Every team member should be capable of explaining how null values are treated in selective aggregation.
- Every team member should be capable of explaining any differences between the
SELECTstatements for Oracle versus MySQL databases.
Brother McLaughlin,
The ZIP file from step 3 does not work.
Thanks!
Trevor D
17 Mar 10 at 9:48 pm
I’m re-working it to simplify that step and the lab. It’ll be available with new write-up tonight.
michaelmclaughlin
18 Mar 10 at 11:29 am
Step 4 Business Rule needs to be rewritten. PAYMENT_ACCOUNT_NUMBER is 111-111-111-111 for both income and ‘expenses income’ as it is presently written.
Wil G.
22 Mar 10 at 11:59 pm
When I try to update the transaction_account number to 222-222-222-222 in oracle, it says that ‘this operation is not supported on external organized tables’. It works in mysql, but not in oracle.
Daren J.
23 Mar 10 at 3:10 pm
You simply want to select everything from the external table, and substitute ’222-222-222-222′ in place of the
TRANSACTION_ACCOUNTcolumn.michaelmclaughlin
23 Mar 10 at 6:49 pm
Um. The CSV file for this case has been messed up. It is just like the CSV file for lab 7. The one we need is all numbers…
Nate
25 Mar 10 at 10:06 pm
It’s all fixed now.
michaelmclaughlin
25 Mar 10 at 10:06 pm