Lab #7 : Transactions
This lab works on developing the students understanding of the syntax to import data from external sources into Oracle and MySQL databases. It also examines the nature of transactions from the perspective of reading and normalizing data from denormalized external sources.
Objectives ↓
These are the lab objectives. They provide the description of the database tables that you will work with to insert data. You have the table names, column names, and data types for each table, and you should have a script to create these tables in Oracle and MySQL databases. The MySQL database script should create all tables by using the ENGINE=InnoDB clause at the end of the CREATE TABLE statement.
Group Objectives
Group assignments are cooperative situations where you should learn from one another. Each team member should engage and type solution components individually. This tends to maximize the student learning opportunity. You should avoid the “expert and followers” paradigm, and become cooperative and interactive problem solvers. The “expert and followers” paradigm is where one person in a group does the work and learns, while others look over the shoulder at the work without truly understanding it. Team members become interactive problem solvers when they collaborate and work together on solving problems.
If something comes up, like an illness or excused absence, you may share files. The caution here is that the student receiving the work must master the concepts in the material before submitting the work to the instructor. The instructor may interview the student to determine their level of mastery and post group credit that may differ from other team members who performed the work.
- Learn how to use define Comma Separated Value (CSV) files and import them to the Oracle databases.
- Learn how to work with transactions that normalize external import flat files through Oracle external tables.
- Learn how to use basic aggregation and formatting in an Oracle database.
Individual Objectives
Individual assignments are situations where you apply the generalized concepts acquired in the group exercise, and transfer your skills to another database engine. You may ask questions but you should only receive general concepts, principles, or directions to published materials (such as the product documentation found in the resources). Likewise, those are the types of answers you should provide when asked questions by your peers. You should not share files in this part of the lab.
- Learn how to use define Comma Separated Value (CSV) files and import them to the Oracle databases.
- Learn how to work with transactions that normalize external import flat files through Oracle external tables.
- Learn how to use basic aggregation and formatting in an Oracle database.
Resources ↓
These are the lab resources.
Deliverables ↓
The table definitions are in the lab deliverables.
- [4 points] Create the following
TRANSACTIONtable as per the specification, but do so understanding the business logic of the model.
Implementation Note: The TRANSACTION table has a range relationship to a CALENDAR table that will be defined later.
Business Rule: You may enter an account number into the TRANSACTION_ACCOUNT column of 111-111-111-111 for a debit, and 222-222-222-222 for a credit. A debit recognizes payment for a rental, and is a value that adds to the company’s assets. A credit recognizes a refund for a rental, and is a value that decreases the company’s assets. Debits should have a TRANSACTION_TYPE that maps to the corresponding debit row in the COMMON_LOOKUP table, and credits should have a TRANSACTION_TYPE that maps to the corresponding credit row in the COMMON_LOOKUP table.
System Logic: In a future refinement of our pilot design, the TRANSACTION_ACCOUNT column will be replaced by a foreign key that points to the CHART_OF_ACCOUNT table.
After you create the table, you need to add a UNIQUE INDEX on the following columns to the TRANSACTION table. This is necessary to minimize the run-time performance of the merge operations later in this lab.
- RENTAL_ID
- TRANSACTION_TYPE
- TRANSACTION_DATE
- PAYMENT_METHOD_TYPE
- PAYMENT_ACCOUNT_NUMBER
| Table Name: TRANSACTION | |||||
|---|---|---|---|---|---|
| Column Name | Constraint | Data Type |
Physical Size |
||
| Type | Reference Table | Reference Column | |||
| TRANSACTION_ID | PRIMARY KEY | Integer | Maximum | ||
| TRANSACTION_ACCOUNT | NOT NULL | String | 15 | ||
| TRANSACTION_TYPE | FOREIGN KEY | COMMON_LOOKUP | COMMON_LOOKUP_ID | Integer | Maximum |
| TRANSACTION_DATE | NOT NULL | Date | Date | ||
| TRANSACTION_AMOUNT | NOT NULL | Float | Maximum | ||
| RENTAL_ID | FOREIGN KEY | RENTAL | RENTAL_ID | Integer | Maximum |
| PAYMENT_METHOD_TYPE | FOREIGN KEY | COMMON_LOOKUP | COMMON_LOOKUP_ID | Integer | Maximum |
| PAYMENT_ACCOUNT_NUMBER | NOT NULL | String | 19 | ||
| CREATED_BY | FOREIGN KEY | SYSTEM_USER | SYSTEM_USER_ID | Integer | Maximum |
| CREATION_DATE | NOT NULL | DATE | Date | ||
| LAST_UPDATED_BY | FOREIGN KEY | SYSTEM_USER | SYSTEM_USER_ID | Integer | Maximum |
| LAST_UPDATE_DATE | NOT NULL | DATE | Date | ||
Don’t forget that on the Oracle Database 10g Express Edition, that you’ll need to provide an explicit TRANSACTION_S1 sequence. The sequence should start with the default value of a 1.
- [2 points] Insert the following nine rows into the
COMMON_LOOKUPtable with valid who-audit column data. After you insert theRENTAL_ITEM_TYPEcolumn values, update the existing values in theRENTAL_ITEM_TYPEcolumn of theRENTAL_ITEMtable to point to the newCOMMON_LOOKUP_IDcolumn values.
| Table Name: COMMON_LOOKUP | ||||
|---|---|---|---|---|
| Lookup Table | Lookup Column | Lookup Type | Lookup Meaning | Lookup Code |
| TRANSACTION | TRANSACTION_TYPE | CREDIT | Credit | CR |
| TRANSACTION | TRANSACTION_TYPE | DEBIT | Debit | DR |
| TRANSACTION | PAYMENT_METHOD_TYPE | DISCOVER_CARD | Discover Card | |
| TRANSACTION | PAYMENT_METHOD_TYPE | VISA_CARD | Visa Card | |
| TRANSACTION | PAYMENT_METHOD_TYPE | MASTER_CARD | Master Card | |
| TRANSACTION | PAYMENT_METHOD_TYPE | CASH | Cash | |
| RENTAL_ITEM | RENTAL_ITEM_TYPE | 1-DAY RENTAL | 1-Day Rental | |
| RENTAL_ITEM | RENTAL_ITEM_TYPE | 3-DAY RENTAL | 3-Day Rental | |
| RENTAL_ITEM | RENTAL_ITEM_TYPE | 5-DAY RENTAL | 5-Day Rental | |
- [14 points] Create the following
AIRPORTandACCOUNT_LISTtables as per the specification, but do so understanding the business logic of the model.
Implementation Note: The AIRPORT table has a filtered substring relationship to ACCOUNT numbers because the business model is evolving toward airport kiosks rather than stores. The ACCOUNT_LIST contains a set of pre-seeded values that may be consumed as new customers are entered into the Video Store application.
Business Rule: You must seed the AIRPORT with any additional information that may be added to the model to support new customers. The city where the customer lives needs to be mapped to the nearest regional airport. All ACCOUNT_NUMBER column values must contain a valid account number found in the ACCOUNT_LIST table. When an account number is assigned to a customer, you must indicate the SYSTEM_USER_ID linked to assigning the account number in the CONSUMED_BY column of the ACCOUNT_LIST table, and provide the date of that activity in the CONSUMED_BY column.
System Logic: Seeding procedures are provided in the lab to update all existing data in the model, provided you add any required new cities in the AIRPORT table. The import of external records is now dependent on accurate ACCOUNT_NUMBER values in the MEMBER table. You must update all pre-existing records.
- You need to create the
AIRPORTtable andACCOUNT_LISTtables.
| Table Name: AIRPORT | |||||
|---|---|---|---|---|---|
| Column Name | Constraint | Data Type |
Physical Size |
||
| Type | Reference Table | Reference Column | |||
| AIRPORT_ID | PRIMARY KEY | Integer | Maximum | ||
| AIRPORT_CODE | NOT NULL | String | 3 | ||
| AIRPORT_CITY | NOT NULL | String | 30 | ||
| CITY | NOT NULL | String | 30 | ||
| STATE_PROVINCE | NOT NULL | String | 30 | ||
| CREATED_BY | FOREIGN KEY | SYSTEM_USER | SYSTEM_USER_ID | Integer | Maximum |
| CREATION_DATE | NOT NULL | DATE | Date | ||
| LAST_UPDATED_BY | FOREIGN KEY | SYSTEM_USER | SYSTEM_USER_ID | Integer | Maximum |
| LAST_UPDATE_DATE | NOT NULL | DATE | Date | ||
| Table Name: ACCOUNT_LIST | |||||
|---|---|---|---|---|---|
| Column Name | Constraint | Data Type |
Physical Size |
||
| Type | Reference Table | Reference Column | |||
| ACCOUNT_LIST_ID | PRIMARY KEY | Integer | Maximum | ||
| ACCOUNT_NUMBER | NOT NULL | String | 10 | ||
| CONSUMED_DATE | Date | Date | |||
| CONSUMED_BY | FOREIGN KEY | SYSTEM_USER | SYSTEM_USER_ID | Integer | Maximum |
| CREATED_BY | FOREIGN KEY | SYSTEM_USER | SYSTEM_USER_ID | Integer | Maximum |
| CREATION_DATE | NOT NULL | DATE | Date | ||
| LAST_UPDATED_BY | FOREIGN KEY | SYSTEM_USER | SYSTEM_USER_ID | Integer | Maximum |
| LAST_UPDATE_DATE | NOT NULL | DATE | Date | ||
- You need to seed the
AIRPORTtable with at least these cities, and any others that you’ve used for inserted values in theCONTACTtable.
| Table Name: AIRPORT | ||||
|---|---|---|---|---|
| Airport Code | Airport City | City | State Province | |
| LAX | Los Angeles | Los Angeles | California | |
| SLC | Salt Lake City | Provo | Utah | |
| SLC | Salt Lake City | Spanish Fork | Utah | |
| SFO | San Francisco | San Francsico | California | |
| SJC | San Jose | San Jose | California | |
| SJC | San Jose | San Carlos | California | |
- You need to seed the
ACCOUNT_LISTtable. This would take more time than leveraging a stored procedure but writing stored procedures from class is a beyond scope. Therefore, you’ve been provided with stored procedures for the databases.
Setup Script ↓
The drop down items, unfold with source code to seed the examples.
- In a prior lab and in both the create and seed scripts, the
STATE_PROVINCEcolumn values used a mix of USPS state abbreviations and full name. The next seeding program and the import program rely on the use of a full state name. You need to update all values that don’t have the full state name. The script updates only the create and seed store script data. You need to modify the script to work with your independent entries from Lab #6.
1 2 3 | UPDATE address SET state_province = 'California' WHERE state_province = 'CA'; |
- The rules for replacing
ACCOUNT_NUMBERcolumn values in theMEMBERtable are complex. A single query can’t support the complete logic. Another procedure has been written for you to use to accomplish this task. As with other provided code samples, you’re responsible for learning how and why this work to accomplish the task at hand.
Setup Script ↓
The drop down items, unfold with source code to seed the examples.
- [5 points] Create the following
TRANSACTION_UPLOADtable as per the specification, but do so understanding the business logic of the model. As a BIG PICTURE, our video store just got a consolidation of rentals from a store that’s closed for our customers. We need to import the values and ensure that our source data agrees with the other store. For example, do all customer names, addresses, account numbers match our data. If they do, the merge should go well. If they don’t, the figures will ultimately be incorrect.
Implementation Note: The TRANSACTION_UPLOAD table is in unnormalized form, which means it has repeating rows and unique rows.
Business Rule: The data has secured a valid CUSTOMER_ID and ITEM_ID before generating the CSV source file. You need to create an import table to support the upload of remote store data.
System Logic: This will be implemented in an Oracle database as an external table, and in MySQL as a memory resident table. Memory resident tables disappear when the database is rebooted. You use the ENGINE=MEMORY as opposed to the ENGINE=INNODB syntax. These rules support the business rules provided you have no FOREIGN KEY references in the TRANSACTION_UPLOAD table.
| Table Name: TRANSACTION_UPLOAD | |||||
|---|---|---|---|---|---|
| Column Name | Constraint | Data Type |
Physical Size |
||
| Type | Reference Table | Reference Column | |||
| ACCOUNT_NUMBER | String | 10 | |||
| FIRST_NAME | String | 20 | |||
| MIDDLE_NAME | String | 20 | |||
| LAST_NAME | String | 20 | |||
| CHECK_OUT_DATE | Date | Date | |||
| RETURN_DATE | Date | Date | |||
| RENTAL_ITEM_TYPE | String | 12 | |||
| TRANSACTION_TYPE | String | 14 | |||
| TRANSACTION_AMOUNT | Number | Decimal | |||
| TRANSACTION_DATE | Date | Date | |||
| ITEM_ID | Integer | Maximum | |||
| PAYMENT_METHOD_TYPE | String | 14 | |||
| PAYMENT_ACCOUNT_NUMBER | String | 19 | |||
Hints for tuning merge performance ↓
The drop down items, unfold the hints for this step.
These steps reduced the run time to 2 seconds for the first call to the merging procedure, and 9 seconds for the second call. Without the tuning hints, the first call to merging procedure is 17 seconds, and second call is 53 seconds. It is strongly recommended that you provide the correct indexes before proceeding.
Hint #1: MySQL requires a BTREE index or it will perform a full table scan of the 11,520 rows in the TRANSACTION_UPLOAD table. The following is the suggested INDEX.
1 2 | CREATE INDEX tu_rental ON transaction_upload (account_number, first_name, last_name, check_out_date, return_date); |
Hint #2: MySQL also requires a BTREE non-unique index on the RENTAL_ITEM table. The following is the suggested INDEX.
1 2 3 | ALTER TABLE rental_item ADD CONSTRAINT natural_key UNIQUE INDEX (rental_item_id, rental_id, item_id, rental_item_type, rental_item_price); |
Hint #3: MySQL also requires a BTREE index on the MEMBER table. The following is the suggested INDEX.
1 2 3 | ALTER TABLE member ADD CONSTRAINT member_u1 UNIQUE INDEX member_key (account_number, credit_card_number, credit_card_type, member_type); |
Hint #4: You should have added a UNIQUE INDEX in Lab #6. It should look like the example below. If it doesn’t you should drop and recreate it. You can see what you created with the following command:
1 | SHOW INDEXES FROM common_lookup; |
You should see the following:
+---------------+------------+-------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +---------------+------------+-------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+ | common_lookup | 0 | PRIMARY | 1 | common_lookup_id | A | 34 | NULL | NULL | | BTREE | | | common_lookup | 0 | common_lookup_key | 1 | common_lookup_table | A | 17 | NULL | NULL | | BTREE | | | common_lookup | 0 | common_lookup_key | 2 | common_lookup_column | A | 17 | NULL | NULL | | BTREE | | | common_lookup | 0 | common_lookup_key | 3 | common_lookup_type | A | 34 | NULL | NULL | | BTREE | | | common_lookup | 1 | common_lookup_fk1 | 1 | created_by | A | 2 | NULL | NULL | | BTREE | | | common_lookup | 1 | common_lookup_fk2 | 1 | last_updated_by | A | 2 | NULL | NULL | | BTREE | | +---------------+------------+-------------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+
The following is the suggested INDEX.
1 2 | ALTER TABLE common_lookup ADD CONSTRAINT common_lookup_u1 UNIQUE INDEX common_lookup_key (common_lookup_table,common_lookup_column,common_lookup_type); |
The mapping of the TRANSACTION_UPLOAD external table to the TRANSACTION table requires some translations and mappings. A translation may mean that we take a column value from the source table and use it as a filter in a join to find a key value in another table. Translation may also mean that we apply an aggregation function to a source column. Mapping may mean associating a column name in the import *.csv file, or external table definition, with a column in a target table. A second meaning of mapping may mean that you need to join tables on a natural key to find a surrogate key. Lastly, mapping may mean linking to the system access logs or local current clock.
An example of translation would be using a column value from the *.csv files to find a valid COMMON_LOOKUP_ID value. Any *.csv file column that maps to a value found in the COMMON_LOOKUP_TYPE column of the COMMON_LOOKUP table is a key to discover a proper foreign key value for an import target table.
The system analyst was pushed for time, and your manager assured him that you could build your own mapping guides on the first two MERGE statements. The system analysis did provide a mapping guide between the import *.csv file or external table for the TRANSACTION table because of its complexity. The table provides both translations and mappings. An example of how you may incorporate translation in your statements follows the table.
You should probably develop mapping tables for the first two MERGE statements before you write the code. Then, you should leverage this one while writing the third MERGE statements. This is the “Mapping Table” referred to by Hint #5 on the third query.
| Import Column Mapping Table | ||||
|---|---|---|---|---|
| Reference Table | Reference Column | Map Translate |
Source Table | Source Column |
| TRANSACTION | TRANSACTION_ID | Translate | Sequence | TRANSACTION_S1 |
| TRANSACTION | TRANSACTION_ACCOUNT | Map | TRANSACTION_UPLOAD | PAYMENT_ACCOUNT_NUMBER |
| TRANSACTION | TRANSACTION_TYPE | Translate | COMMON_LOOKUP | COMMON_LOOKUP_ID |
| TRANSACTION | TRANSACTION_DATE | Map | TRANSACTION_UPLOAD | TRANSACTION_DATE |
| TRANSACTION | TRANSACTION_AMOUNT | Map | TRANSACTION_UPLOAD | SUM(TRANSACTION_AMOUNT) |
| TRANSACTION | RENTAL_ID | Map | RENTAL | RENTAL_ID |
| TRANSACTION | PAYMENT_METHOD | Translate | COMMON_LOOKUP | COMMON_LOOKUP_ID |
| TRANSACTION | PAYMENT_ACCOUNT_NUMBER | Map | MEMBER | CREDIT_CARD_NUMBER |
| TRANSACTION | CREATED_BY | Translate | SYSTEM_USER | SYSTEM_USER_ID |
| TRANSACTION | CREATION_DATE | Translate | Internal Clock | Current Date |
| TRANSACTION | LAST_UPDATED_BY | Translate | SYSTEM_USER | SYSTEM_USER_ID |
| TRANSACTION | LAST_UPDATE_DATE | Translate | Internal Clock | Current Date |
- [15 points] Create an Upload Procedure that write correct normalized data into the
RENTAL,RENTAL_ITEM, andTRANSACTIONtables from the denormalizedTRANSACTION_UPLOADsource file. You should use the sample code provided in the Import Transaction Units page.
Implementation Note: The TRANSACTION_UPLOAD table is in unnormalized form, which means it has repeating rows and unique rows. This generally requires a logic similar to the Oracle MERGE statement, or the MySQL INSERT ... ON DUPLICATE KEY statement.
Business Rule: The source file doesn’t contain any of the surrogate keys and you’ll need to populate them. As part of your uploading and transaforming process.
System Logic: Implement consistent with the business rule and available resources on the respective platforms. You can download the *.csv files here.
This step requires you to build a merge procedure that merges a source data set into three tables. The following demonstrates the Oracle pseudo code for the procedure.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | -- Create a procedure to wrap the transaction. CREATE OR REPLACE PROCEDURE upload_transaction_log IS BEGIN -- Set save point for an all or nothing transaction. SAVEPOINT starting_point; -- Merge into RENTAL table. MERGE INTO rental ... -- Merge into RENTAL_ITEM table. MERGE INTO rental_item ... -- Merge into TRANSACTION table. MERGE INTO TRANSACTION ... -- Save the changes. COMMIT; EXCEPTION WHEN OTHERS THEN ROLLBACK TO starting_point; RETURN; END; / |
Naturally, the shell changes when you port it to MySQL or some other database. You can find the MySQL sample here.
Hints for the merge operations ↓
The drop down items, unfold the hints for this step.
Hint #1: MySQL doesn’t treat a zero length string like a null value, which is different than Oracle. When you perform this step for MySQL, you must update all zero length strings in the MIDDLE_NAME column of the TRANSACTION_UPLOAD table with a null value.
1 2 3 | UPDATE transaction_upload SET middle_name = NULL WHERE middle_name = ''; |
Hint #2: You can discover the CONTACT_ID by using the following type of query.
1 2 3 4 5 6 7 8 | -- This uses an NVL function in Oracle, you need to change that to an IFNULL function in MySQL. SELECT DISTINCT c.contact_id FROM member m INNER JOIN transaction_upload tu ON m.account_number = tu.account_number INNER JOIN contact c ON m.member_id = c.member_id WHERE c.first_name = tu.first_name AND NVL(c.middle_name,'x') = NVL(tu.middle_name,'x') AND c.last_name = tu.last_name; |
Hint #3: The first MERGE statement uses a source query that performs the following joins. There aren’t any subqueries in the statement, and the DISTINCT operator is required to make it re-runnable. The easiest way to test requires that you remove the queries from the USING clause of the MERGE statement and run them as queries.
- Join the
MEMBERtable toCONTACTtable on anINNER JOIN. - Join the foregoing result set (the foregoing result set is the result set of the prior join) to
TRANSACTION_UPLOADtable on anINNER JOIN. - Join the foregoing result set to
RENTALtable on anLEFT JOIN. - The query to support the
MERGEstatement to theRENTALtable should return 4,681 rows (or with the MySQL upload file it returns 4,372 rows), which is the same number that should go into theTRANSACTIONtable. - The natural key for the
TRANSACTIONtable is a composite key made up of the following columns:RENTAL_ID,TRANSACTION_TYPE,TRANSACTION_DATE,PAYMENT_METHOD_TYPE, andPAYMENT_ACCOUNT_NUMBER. - The query in the
MERGEstatement doesn’t need to return the who-audit columns because you can simply put them in theINSERTstatement. However, you also need theLAST_UPDATED_BYandLAST_UPDATE_DATEcolumns in theUPDATEstatement, as seen below. It is best for maintenance coding that you put them all in the query, even though they’re not required.
1 2 | UPDATE SET last_updated_by = SOURCE.last_updated_by , last_update_date = SOURCE.last_update_date |
Diagnostics: A possible data error in earlier steps can cause major heartache matching the anticipated row inserts. Here are some diagnostics that you may run when you fail to have a matching number of row inserts.
- Check the join condition between the
MEMBERandCONTACTtable, which should return 15 rows. If this is not correct, fix the foreign key values in theCONTACTtable.
1 2 3 | SELECT COUNT(*) FROM member m INNER JOIN contact c ON m.member_id = c.member_id; |
- Check the join condition between the
CONTACTandTRANSACTION_UPLOADtables. It should return 11,520 rows. If this is not correct, fix theFIRST_NAME,MIDDLE_NAME, orLAST_NAMEvalues in theCONTACTtable or check whether you have the current*.csvfile.
1 2 3 4 5 | SELECT COUNT(*) FROM contact c INNER JOIN transaction_upload tu ON c.first_name = tu.first_name AND NVL(c.middle_name,'x') = NVL(tu.middle_name,'x') AND c.last_name = tu.last_name; |
- Check the join condition between both the
MEMBERandCONTACTtables and the join of those two tables with theTRANSACTION_UPLOADtable. This should return 11,520 rows. If this is not correct, fix theACCOUNT_NUMBERcolumn values in theMEMBERtable.
1 2 3 4 5 6 7 | SELECT COUNT(*) FROM member m INNER JOIN contact c ON m.member_id = c.member_id INNER JOIN transaction_upload tu ON c.first_name = tu.first_name AND NVL(c.middle_name,'x') = NVL(tu.middle_name,'x') AND c.last_name = tu.last_name AND m.account_number = tu.account_number; |
- Did you inadvertently insert a zero length string instead of a null for a
MIDDLE_NAMEcolumn value when you Ginny and Harry Potter (only a critical failure in MySQL)? - Did you insert the correct foreign key value in the
MEMBER_IDcolumn of theCONTACTtable when you added the Potter Family? - Did you insert an address for each of the individuals in the Potter Family?
- Do the Potter’s live in Utah?
- Did you insert the correct foreign key value in the
CONTACT_IDcolumn of theADDRESStable. - If the Potter’s live someplace other than Provo or Spanish Fork, did you enter a row in the
AIRPORTtable for them? - Did you enter the correct names for the Potter’s? They should be Harry Potter, Ginny Potter, and Lily Luna Potter.
Hint #4: The second MERGE statement leverages the joins that you worked out in the first MERGE statement. Don’t try to re-invent the wheel because it isn’t profitable in this case.
You should use the first query as the basis for the query of the second MERGE statement. Remember from the earlier hint, the easiest way to test requires that you remove the queries from the USING clause of the MERGE statement and run them as queries.
- The query to support the
MERGEstatement to theRENTAL_ITEMtable should return 11,520 rows and theDISTINCToperator isn’t by theMERGEstatement to get the correct results. TheDISTINCTkeyword is requuired to verify your SQL statement returns the right number of rows before inserting it into theMERGEstatement. You’ll also need to nest the query with the distinct result set as an inline view, like the following:
1 2 3 4 5 6 7 8 9 10 11 12 | SELECT COUNT(*) FROM (SELECT DISTINCT rental_item_id , r.rental_id , tu.item_id , TRUNC(r.return_date) - TRUNC(r.check_out_date) AS rental_item_price , cl.common_lookup_id AS rental_item_type , 3 AS created_by , TRUNC(creation_date) AS creation_date , 3 AS last_updated_by , TRUNC(last_update_date) AS last_update_date FROM ... ) il |
- After you’ve created the first query for the
MERGEstatement to theRENTALtable, all you need to do for theMERGEstatement to theRENTAL_ITEMtable is the following:
- Join the foregoing result set (the foregoing result set is the result set of the prior join) to the
COMMON_LOOKUPtable on anINNER JOIN, and translate theCOMMON_LOOKUP_TYPEcolumn value to aCOMMON_LOOKUP_IDvalue. - Join the foregoing result set to the
RENTAL_ITEMtable on anLEFT JOIN. - Remove the
DISTINCToperator - Restructure the
SELECTclause list.
Hint #5: The third MERGE statement also leverages the joins that you worked out in the first MERGE statement. Don’t try to re-invent the wheel because it isn’t profitable in this case.
- The query to support the
MERGEstatement to theTRANSACTIONtable should return 4,681 rows, which is the same number that should have gone into theRENTALtable. Unlike the query for theRENTALtable, you can’t use theDISTINCToperator. You will need to use aSUMfunction and aGROUP BYclause. - Assuming that the query for the first
MERGEstatement is your starting point, change theLEFT JOINtoRENTALtable to anINNER JOINbecause rows must exist in theRENTALtable before you enter rows in theTRANSACTIONtable. - Join the foregoing result set (the foregoing result set is the result set of the prior join) to the
COMMON_LOOKUPtable on anINNER JOIN, use an alias likecl1because you’ll have another to this table. This is done to narrow limit the selection of the rows, and you should plan on the following two join statements:
1 2 3 | ON cl1.common_lookup_table = 'TRANSACTION' AND cl1.common_lookup_column = 'TRANSACTION_TYPE' AND cl1.common_lookup_type = 'Check mapping table' |
- Join the foregoing result set to the
COMMON_LOOKUPtable on anINNER JOIN, use an alias likecl2because you don’t want to conflict with what you did in the prior step. This is done to narrow limit the selection of the rows, and you should plan on the following two join statements:
1 2 3 | ON cl1.common_lookup_table = 'TRANSACTION' AND cl1.common_lookup_column = 'PAYMENT_METHOD_TYPE' AND cl1.common_lookup_type = 'Check mapping table' |
- Join the foregoing result set to the
TRANSACTIONtable on anLEFT JOIN. The trick here is figuring out the natural key, which is the compound key ofTRANSACTION_ACCOUNT,TRANSACTION_TYPE,TRANSACTION_DATE,TRANSACTION_AMOUNT,PAYMENT_METHOD, andPAYMENT_ACCOUNT_NUMBERcolumns. You should use the provided mapping table to figure out what goes in theSELECTclause. - A
GROUP BYclause has restrictions that differ across database implementations. In Oracle, you can’t include aliases that are in theSELECTclause in theGROUP BYclause. - The query in the
MERGEstatement doesn’t need to return the who-audit columns because you can simply put them in theINSERTstatement. However, you also need theLAST_UPDATED_BYandLAST_UPDATE_DATEcolumns in theUPDATEstatement, as seen below. It is best for maintenance coding that you put them all in the query, even though they’re not required.
As another reminder on aggregation and comparison to the DISTINCT operator, they both help you collapse the 11,520 record sets to 4,681 rows. The DISTINCT is slightly better as an approach than the alternative GROUP BY clause.
In Oracle, you can run this query to ensure that you’ve inserted the correct number of rows.
1 2 3 4 5 6 7 8 9 10 | COLUMN rental_count FORMAT 99,999 HEADING "Rental|Count" COLUMN rental_item_count FORMAT 99,999 HEADING "Rental|Item|Count" COLUMN transaction_count FORMAT 99,999 HEADING "Transaction|Count" SELECT il1.rental_count , il2.rental_item_count , il3.transaction_count FROM (SELECT COUNT(*) AS rental_count FROM rental) il1 CROSS JOIN (SELECT COUNT(*) AS rental_item_count FROM rental_item) il2 CROSS JOIN (SELECT COUNT(*) AS transaction_count FROM TRANSACTION) il3; |
Oracle Validation
You should see the following if everything is on track:
Rental Rental Item TRANSACTION COUNT COUNT COUNT ------- ------- ----------- 4,689 11,532 4,681
MySQL Validation
You should see the following if everything is on track:
+--------------+-------------------+-------------------+ | rental_count | rental_item_count | transaction_count | +--------------+-------------------+-------------------+ | 4380 | 11532 | 4372 | +--------------+-------------------+-------------------+
In case you’ve forgotten, there where 8 rows in the RENTAL table before you started. The total reflects the new rows plus the existing rows. Likewise, you had 12 rows in the RENTAL_ITEM table and that total will be 12 higher than what you insert.
Make sure you call the UPLOAD_TRANSACTIONS procedure twice and that it returns the same result set. If you don’t test it, you may not earn full credit. When you pass-off the assignment you may only earn 90% of the points possible in the lab.
The loss of points result because your merging operation fails to normalize the denormalized data in the *.csv file. Also, the script isn’t re-runnable, which will be a requirement when you re-use this component in Lab #8.
- [10 points] Create a query that prints the following types of data for the year 2010.
Implementation Note: This is a preliminary report that lets the business user see what their aggregate earnings would be with a 10% and 20% increase, as well as see the incremental increases by themselves.
Business Rule: This join can be made against only the TRANSACTION table.
System Logic: Implement consistent with the business rule and available resources on the respective platforms. Performing the GROUP BY in an inline view with a non-displayed sorting key that maps to the numeric value of the month is probably the easiest approach.
You’ll also may benefit from knowing about these functions, as you solve this part of the lab. They work in both Oracle and MySQL. The EXTRACT function returns a number between 1 and 12 when you extract a MONTH, and a four-digit year when you extract a YEAR. Copy the code into a terminal session or development tool and test it.
SELECT EXTRACT(MONTH FROM TO_DATE('02-FEB-2009')) , EXTRACT(YEAR FROM TO_DATE('02-FEB-2009')) FROM dual;
The TO_CHAR function lets you apply formatting masks in Oracle, like the following that converts numbers to dollars with two values to the right of the decimal point. Again, you can copy and test how this works.
SELECT TO_CHAR(9999,'$9,999,999.00') AS "Formatted" FROM dual;
The TO_CHAR function doesn’t exist in MySQL. You must combine a number of functions to get the same behavior. Here’s the equivalent of the preceding statement in MySQL:
SELECT LPAD(CONCAT('$',FORMAT(SUM(9999999.00),2)),14,' ') AS "Formatted" FROM dual;
If you changed the second call parameter to the LPAD function from 14 to 10, you’d not see any decimal points displayed because it would truncate the return value on the length of the string. You must like wise use the CONCAT function because piped concatenation never works inside function calls (you can read more on piped concatenation here).
The desired layout is noted below in the color formatted table.
| Query Elements | |||||
|---|---|---|---|---|---|
| Month Year | Base Revenue |
10 Plus Revenue |
20 Plus Revenue |
10 Plus Difference |
20 Plus Difference |
| JAN 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| FEB 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| MAR 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| APR 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| MAY 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| JUN 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| JUL 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| AUG 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| SEP 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| OCT 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| NOV 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
| DEC 2009 | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn | n,nnn.nn |
The next two sections show you the expected outcome. You should note that these results work from only the seeded items. However, they do work from customers seeded and you’re three Potter users from your Lab #6.
Oracle Validation
The actual data from the import source should show the following when you run it in an Oracle database:
MONTH BASE_REVENUE 10_PLUS 20_PLUS 10_PLUS_LESS_B 20_PLUS_LESS_B ---------- -------------- -------------- -------------- -------------- -------------- JAN-2009 $2,671.20 $2,938.32 $3,205.44 $267.12 $534.24 FEB-2009 $4,270.74 $4,697.81 $5,124.89 $427.07 $854.15 MAR-2009 $5,371.02 $5,908.12 $6,445.22 $537.10 $1,074.20 APR-2009 $4,932.18 $5,425.40 $5,918.62 $493.22 $986.44 MAY-2009 $2,216.46 $2,438.11 $2,659.75 $221.65 $443.29 JUN-2009 $1,208.40 $1,329.24 $1,450.08 $120.84 $241.68 JUL-2009 $2,404.08 $2,644.49 $2,884.90 $240.41 $480.82 AUG-2009 $2,241.90 $2,466.09 $2,690.28 $224.19 $448.38 SEP-2009 $2,197.38 $2,417.12 $2,636.86 $219.74 $439.48 OCT-2009 $3,275.40 $3,602.94 $3,930.48 $327.54 $655.08 NOV-2009 $3,125.94 $3,438.53 $3,751.13 $312.59 $625.19 DEC-2009 $2,340.48 $2,574.53 $2,808.58 $234.05 $468.10
MySQL Validation
The actual data from the import source should show the following when you run it in an MySQL database. The principal difference occurs because the data sets are different in the respective *.csv files.
+----------+------------+------------+------------+------------+------------+ | MON-YEAR | BASE | 10_PLUS | 20_PLUS | 10_DIFF | 20_DIFF | +----------+------------+------------+------------+------------+------------+ | JAN-2009 | $2,957.40 | $3,253.14 | $3,548.88 | $295.74 | $591.48 | | FEB-2009 | $4,022.70 | $4,424.97 | $4,827.24 | $402.27 | $804.54 | | MAR-2009 | $5,654.04 | $6,219.44 | $6,784.85 | $565.40 | $1,130.81 | | APR-2009 | $4,595.10 | $5,054.61 | $5,514.12 | $459.51 | $919.02 | | MAY-2009 | $2,219.64 | $2,441.60 | $2,663.57 | $221.96 | $443.93 | | JUN-2009 | $1,300.62 | $1,430.68 | $1,560.74 | $130.06 | $260.12 | | JUL-2009 | $2,413.62 | $2,654.98 | $2,896.34 | $241.36 | $482.72 | | AUG-2009 | $2,149.68 | $2,364.65 | $2,579.62 | $214.97 | $429.94 | | SEP-2009 | $2,162.40 | $2,378.64 | $2,594.88 | $216.24 | $432.48 | | OCT-2009 | $3,291.30 | $3,620.43 | $3,949.56 | $329.13 | $658.26 | | NOV-2009 | $3,246.78 | $3,571.46 | $3,896.14 | $324.68 | $649.36 | | DEC-2009 | $2,299.14 | $2,529.05 | $2,758.97 | $229.91 | $459.83 | +----------+------------+------------+------------+------------+------------+
Group Deliverables
Groups should demonstrate to the instructor or lab assistant during class time that they have developed a single re-runnable (found here by clicking on the Re-runnable query script that performs the tasks shown above.
Team members should also show the following:
- Every team member should be capable of running and explaining elements of the script running against an Oracle database.
- Every team member can confidently explain the concepts of importing and transforming data into an Oracle database.
- Every team member can confidently explain the concepts of basic aggregation into an Oracle database.
Individual Deliverables
Individuals should demonstrate to the instructor or lab assistant during class time that you have developed a single re-runnable (found here by clicking on the Re-runnable query script that performs the tasks shown above.
- Every team member should be capable of running and explaining elements of the script running against an Oracle database.
- Every team member can confidently explain the concepts of importing and transforming data into an Oracle database.
- Every team member can confidently explain the concepts of basic aggregation into an Oracle database.
Okay I found a fun error:
SQL> select * from transaction_upload;
select * from transaction_upload
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04063: unable to open log file transaction_upload.log
OS error The system cannot find the file specified.
ORA-06512: at “SYS.ORACLE_LOADER”, line 19
This one occurred because I had been using the external table tutorial and ran CREATE DIRECTORY download AS ‘C:\Downloads’;
During the lab I tried renaming C:\Downloads to C:\Download and then the command to CREATE DIRECTORY download AS ‘C:\Download’;
This caused the error and when I renamed C:\Download to C:\Downloads, it works just fine.
John G
4 Mar 10 at 4:02 pm
Brother McLaughlin,
We have found errors in your account number assignment procedure (well, either that or the CSV). Below are the inconsistencies that have been giving us grief:
Member Table CSV
Clinton SLC-000002 SLC-000001
Moss SLC-000005 SLC-000002
Royal SLC-000001 SLC-000004
Smith SLC-000004 SLC-000005
And then of course the Potters are also being created incorrectly.
We made manual changes to the member table to correspond with the CSV, and we got good results. The issue is either that the CSV has the wrong values, or the procedure being used to assign account numbers is off. This has been causing problems because of the account_number comparison being used whenever joining member-contact to transaction_upload.
We may be very off, but I think this is the issue. thanks.
Skyler
8 Mar 10 at 11:09 pm
We kinda hacked together a fix by just removing the m.account_number =tu.account_number comparison being used in our merge statement joins. This gives us 4681 WITHOUT using a GROUP BY statement. I dont know how accurate this data is; my understanding of the theory isnt sound enough to know for sure. Please advise. thanks.
Skyler
8 Mar 10 at 11:15 pm
I’ve posted some added checks. There shouldn’t be anything wrong with the
*.csvfile. Other students have this working.michaelmclaughlin
8 Mar 10 at 11:42 pm
Assuming that you’re referencing the 3rd query (or
MERGEstatement), that would not work because it’ll fail to aggregate rental_item amount values to rental amount values.michaelmclaughlin
8 Mar 10 at 11:43 pm
None of the transaction dates in the MySQL .csv are in 2009.
Wil G.
10 Mar 10 at 2:07 am
Oops! I’ll fix it. I’m not sure what happened.
michaelmclaughlin
10 Mar 10 at 9:54 am
2nd line in the objective statement at the top has an extra the “and the you should”.
Chaz
15 Jun 10 at 1:09 pm
Fixed. Thanks Chaz.
michaelmclaughlin
21 Jun 10 at 9:43 pm
Here’s a correlated update statement that will update all the rental_item_type values to correct rows after inserting the values for the rental_item table into common_lookup:
update rental_item ri
set rental_item_type =
(select cl1.common_lookup_id
from common_lookup cl1
where cl1.common_lookup_table = ‘RENTAL_ITEM’
and cl1.common_lookup_column = ‘RENTAL_ITEM_TYPE’
and cl1.common_lookup_type =
(select cl2.common_lookup_type
from common_lookup cl2 inner join price p
on p.price_type = cl2.common_lookup_id inner join rental r
on r.check_out_date between p.start_date and NVL(p.end_date, sysdate)
where p.amount = ri.rental_item_price
and ri.item_id = p.item_id
and cl2.common_lookup_id = ri.rental_item_type
and r.rental_id = ri.rental_id));
Tyler
15 Mar 11 at 1:48 pm