Database Tutorial

Course Tutorial Site

Site Admin

Lab #10: MySQL

with 14 comments

Lab #10: Oracle Assignment

You begin these steps after running the create_mysql_store_ri2.sql, the seed_mysql_store_ri2.sql, apply_mysql_lab5.sql, apply_mysql_lab6.sql, apply_mysql_lab7.sql, apply_mysql_lab8.sql, and apply_mysql_lab9.sql scripts. You should create the apply_mysql_lab10.sql script as follows:

-- This calls Lab #9, Lab #9 calls Lab #8, Lab #8 calls Lab #7, Lab #7 calls Lab #6, Lab #6 calls Lab #5, and Lab #5 calls both the Creation and Seed script files.
\. ../lab8/apply_mysql_lab9.sql
 
TEE apply_mysql_lab10.txt
 
... insert code here ...
 
NOTEE

You should embed the verification queries inside your apply_lab10_oracle.sql script. For MySQL, you need to create additional indexes or the remainder of the labs will run too slowly.

The following is a mapping and translation table. You should use the mapping and translation table to create three queries that take records from the TRANSACTION_UPLOAD table with intent of putting the information into the RENTAL, RENTAL_ITEM, and TRANSACTION tables.

The mapping of the TRANSACTION_UPLOAD external table to the TRANSACTION table requires some translations and mappings. A translation may mean that we take a column value from the source table and use it as a filter in a join to find a key value in another table. Translation may also mean that we apply an aggregation function to a source column. Mapping may mean associating a column name in the import *.csv file, or external table definition, with a column in a target table. A second meaning of mapping may mean that you need to join tables on a natural key to find a surrogate key. Lastly, mapping may mean linking to the system access logs or local current clock settings.

An example of translation would be using a COMMON_LOOKUP_TYPE column value from the *.csv files to find a valid COMMON_LOOKUP_ID value. Any *.csv file column that maps to a value found in the COMMON_LOOKUP_TYPE column of the COMMON_LOOKUP table is a key to discover a proper foreign key value (a COMMON_LOOKUP_ID column value) for an import target table.

The system analyst was pushed for time, and your manager assured him that you could build your own mapping guides on the first two MERGE statements. The system analysis did provide a mapping guide between the import *.csv file or external table for the TRANSACTION table because of its complexity. The table provides both translations and mappings. An example of how you may incorporate translation in your statements follows the table definition.

You should probably develop mapping tables for the first two MERGE statements before you write any code. Then, you should leverage this one while writing the third MERGE statements.

 Import Column Mapping Table
Reference Table Reference Column Map
Translate
Source Table Source Column
TRANSACTION TRANSACTION_ID Translate Sequence TRANSACTION_S1
TRANSACTION TRANSACTION_ACCOUNT Map TRANSACTION_UPLOAD PAYMENT_ACCOUNT_NUMBER
TRANSACTION TRANSACTION_TYPE Translate COMMON_LOOKUP COMMON_LOOKUP_ID
TRANSACTION TRANSACTION_DATE Map TRANSACTION_UPLOAD TRANSACTION_DATE
TRANSACTION TRANSACTION_AMOUNT Map TRANSACTION_UPLOAD SUM(TRANSACTION_AMOUNT)
TRANSACTION RENTAL_ID Map RENTAL RENTAL_ID
TRANSACTION PAYMENT_METHOD_TYPE Translate COMMON_LOOKUP COMMON_LOOKUP_ID
TRANSACTION PAYMENT_ACCOUNT_NUMBER Map MEMBER CREDIT_CARD_NUMBER
TRANSACTION CREATED_BY Translate SYSTEM_USER SYSTEM_USER_ID
TRANSACTION CREATION_DATE Translate Internal Clock Current Date
TRANSACTION LAST_UPDATED_BY Translate SYSTEM_USER SYSTEM_USER_ID
TRANSACTION LAST_UPDATE_DATE Translate Internal Clock Current Date

  1. [15 points] The first SELECT statement uses a query that relies on a mapping and translation table. This step requires you to create a query that take records from the TRANSACTION_UPLOAD table with intent of putting the information into the RENTAL table. You’ll use this query inside a SELECT statement in Lab #11.

MySQL doesn’t treat a zero length string like a null value, which is different than Oracle. When you perform this step for MySQL, you must update all zero length strings in the MIDDLE_NAME column of the TRANSACTION_UPLOAD table with a null value.

You can check the count on empty, or zero length, strings with the following query:

1
2
3
SELECT   COUNT(*) AS "Empty String Columns"
FROM     transaction_upload
WHERE    middle_name = '';

It should return the following:

1
2
3
4
5
6
+----------------------+
| Empty String Columns |
+----------------------+
|                 6912 |
+----------------------+
1 row in set (0.00 sec)

1
2
3
UPDATE transaction_upload
SET    middle_name = null
WHERE  middle_name = '';

Using the previous diagnostic query, you can see there aren’t any empty strings after the UPDATE statement.

1
2
3
4
5
6
+----------------------+
| Empty String Columns |
+----------------------+
|                    0 |
+----------------------+
1 row in set (0.00 sec)

You can discover the CONTACT_ID by using the following type of query.

1
2
3
4
5
6
7
8
9
10
SELECT   DISTINCT
         c.contact_id
,        c.last_name
FROM     member m INNER JOIN transaction_upload tu
ON       m.account_number = tu.account_number INNER JOIN contact c
ON       m.member_id = c.member_id
WHERE    c.first_name = tu.first_name
AND      IFNULL(c.middle_name,'x') = IFNULL(tu.middle_name,'x')
AND      c.last_name = tu.last_name
ORDER BY 1;

It should return the following rows.

+------------+-----------+
| contact_id | last_name |
+------------+-----------+
|       1001 | Winn      |
|       1002 | Winn      |
|       1003 | Vizquel   |
|       1004 | Vizquel   |
|       1005 | Sweeney   |
|       1006 | Sweeney   |
|       1007 | Sweeney   |
|       1008 | Clinton   |
|       1009 | Moss      |
|       1010 | Gretelz   |
|       1011 | Royal     |
|       1012 | Smith     |
|       1013 | Potter    |
|       1014 | Potter    |
|       1015 | Potter    |
+------------+-----------+
15 rows in set (0.15 sec)

The first MERGE statement uses a query that performs the following joins. There aren’t any subqueries in the statement, and the DISTINCT operator is required to make it re-runnable.

  • Join the MEMBER table to CONTACT table on an INNER JOIN.
  • Join the foregoing result set (the foregoing result set is the result set of the prior join) to TRANSACTION_UPLOAD table on an INNER JOIN.
  • Join the foregoing result set to RENTAL table on an LEFT JOIN.
  • The query to support the MERGE statement to the RENTAL table should return 4,681 rows (or with the MySQL upload file it returns 4,372 rows), which is the same number that should go into the TRANSACTION table.
  • The natural key for the TRANSACTION table is a composite key made up of the following columns: RENTAL_ID, TRANSACTION_TYPE, TRANSACTION_DATE, PAYMENT_METHOD_TYPE, and PAYMENT_ACCOUNT_NUMBER.

Diagnostics: A possible data error in earlier steps can cause major heartache matching the anticipated row inserts. Here are some diagnostics that you may run when you fail to have a matching number of row inserts.

  1. Check the join condition between the MEMBER and CONTACT table, which should return 15 rows. If this is not correct, fix the foreign key values in the CONTACT table.
1
2
3
SELECT   COUNT(*)
FROM     member m INNER JOIN contact c
ON       m.member_id = c.member_id;

It should return the following rows.

1
2
3
4
5
6
+----------+
| COUNT(*) |
+----------+
|       15 |
+----------+
1 row in set (0.00 sec)
  1. Check the join condition between the CONTACT and TRANSACTION_UPLOAD tables. It should return 11,520 rows. If this is not correct, fix the FIRST_NAME, MIDDLE_NAME, or LAST_NAME values in the CONTACT table or check whether you have the current *.csv file.
1
2
3
4
5
SELECT   COUNT(*)
FROM     contact c INNER JOIN transaction_upload tu
ON       c.first_name = tu.first_name
AND      IFNULL(c.middle_name,'x') = IFNULL(tu.middle_name,'x')
AND      c.last_name = tu.last_name;

It should return the following rows.

1
2
3
4
5
6
+----------+
| COUNT(*) |
+----------+
|    11520 |
+----------+
1 row in set (0.07 sec)
  1. Check the join condition between both the MEMBER and CONTACT tables and the join of those two tables with the TRANSACTION_UPLOAD table. This should return 11,520 rows. If this is not correct, fix the ACCOUNT_NUMBER column values in the MEMBER table.
1
2
3
4
5
6
7
SELECT   COUNT(*)
FROM     member m INNER JOIN contact c
ON       m.member_id = c.member_id INNER JOIN transaction_upload tu
ON       c.first_name = tu.first_name
AND      IFNULL(c.middle_name,'x') = IFNULL(tu.middle_name,'x')
AND      c.last_name = tu.last_name
AND      m.account_number = tu.account_number;

It should return the following rows.

1
2
3
4
5
6
+----------+
| COUNT(*) |
+----------+
|    11520 |
+----------+
1 row in set (0.11 sec)
  1. Did you inadvertently insert a zero length string instead of a null for a MIDDLE_NAME column value when you Ginny and Harry Potter (only a critical failure in MySQL)?
  2. Did you insert the correct foreign key value in the MEMBER_ID column of the CONTACT table when you added the Potter Family?
  3. Did you insert an address for each of the individuals in the Potter Family?
  4. Do the Potter’s live in Utah?
  5. Did you insert the correct foreign key value in the CONTACT_ID column of the ADDRESS table.
  6. If the Potter’s live someplace other than Provo or Spanish Fork, did you enter a row in the AIRPORT table for them?
  7. Did you enter the correct names for the Potter’s? They should be Harry Potter, Ginny Potter, and Lily Luna Potter.

You should see the following result set (redacted to the first 4 and last 4 rows) and the count of rows:

+-----------+----------+----------------+-------------+
| rental_id | customer | check_out_date | return_date |
+-----------+----------+----------------+-------------+
|      NULL |     1001 | 2009-11-14     | 2009-11-17  |
|      NULL |     1001 | 2009-11-21     | 2009-11-24  |
|      NULL |     1001 | 2009-12-22     | 2009-12-25  |
|      NULL |     1001 | 2009-12-19     | 2009-12-22  |
...
|      NULL |     1015 | 2009-08-20     | 2009-08-23  |
|      NULL |     1015 | 2009-11-14     | 2009-11-17  |
|      NULL |     1015 | 2009-05-19     | 2009-05-22  |
|      NULL |     1015 | 2009-07-08     | 2009-07-11  |
+-----------+----------+----------------+-------------+
4372 rows in set (0.21 sec)

After you query the rows, you query a count from the RENTAL_ITEM table before you insert the rows from your query. This SELECT statement gets you the before value:

SELECT   COUNT(*) AS "Rental before count"
FROM     rental;

It should return the following value:

+---------------------+
| Rental before count |
+---------------------+
|                   8 |
+---------------------+
1 row in set (0.00 sec)

After you insert the records from the query, you re-query a count from the RENTAL_ITEM table. The same SELECT statement gets you the after value:

SELECT   COUNT(*) AS "Rental after count"
FROM     rental;

It should return the following value:

+--------------------+
| Rental after count |
+--------------------+
|               4380 |
+--------------------+
1 row in set (0.00 sec)

  1. [15 points] The second SELECT statement requires that you inserted values into the RENTAL table in the last step. It leverages the joins that you worked out in the first SELECT statement. Don’t try to re-invent the wheel because it isn’t profitable in this case.

  • The query to support the SELECT statement to the RENTAL_ITEM table should return 11,520 rows and the DISTINCT operator isn’t by the SELECT statement to get the correct results. The DISTINCT keyword is required to verify your SQL statement returns the right number of rows before inserting it into the SELECT statement. You’ll also need to nest the query with the distinct result set as an inline view, like the following:
1
2
3
4
5
6
7
8
9
10
11
12
SELECT   COUNT(*)
FROM    (SELECT   DISTINCT
                  rental_item_id
         ,        r.rental_id
         ,        tu.item_id
         ,        DATEDIFF(r.return_date,r.check_out_date) AS rental_item_price
         ,        cl.common_lookup_id AS rental_item_type
         ,        3 AS created_by
         ,        creation_date AS creation_date
         ,        3 AS last_updated_by
         ,        last_update_date AS last_update_date
         FROM ... ) il

When you want to embed a query with a distinct operator, you may have to re-work the query like the following example in the INSERT statement. The difference is that you need to write it as a MERGE statement.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
INSERT
INTO     rental_item
SELECT   r.rental_item_id
,        r.rental_id
,        r.item_id
,        r.rental_item_price
,        r.rental_item_type
,        r.created_by
,        r. creation_date
,        r.last_update_date
,        r.last_updated_by
FROM    (SELECT   DISTINCT
                  rental_item_id
         ,        r.rental_id
         ,        tu.item_id
         ,        DATEDIFF(r.return_date,r.check_out_date) AS rental_item_price
         ,        cl.common_lookup_id AS rental_item_type
         ,        3 AS created_by
         ,        creation_date AS creation_date
         ,        3 AS last_updated_by
         ,        last_update_date AS last_update_date
         FROM ... ) r;
  • After you’ve created the first query for the MERGE statement to the RENTAL table, all you need to do for the MERGE statement to the RENTAL_ITEM table is the following:
  1. Join the foregoing result set (the foregoing result set is the result set of the prior join) to the COMMON_LOOKUP table on an INNER JOIN, and translate the COMMON_LOOKUP_TYPE column value to a COMMON_LOOKUP_ID value.
  2. Join the foregoing result set to the RENTAL_ITEM table on an LEFT JOIN.
  3. Remove the DISTINCT operator
  4. Restructure the SELECT clause list.

You should see the following result set (redacted to the first 4 and last 4 rows) and the count of rows (please note that your results may differ when the 3-Day Rental value in the COMMON_LOOKUP table has a different result):

+----------------+-----------+---------+-------------------+------------------+
| rental_item_id | rental_id | item_id | rental_item_price | rental_item_type |
+----------------+-----------+---------+-------------------+------------------+
|           NULL |      1009 |       1 |              3.00 |             1026 |
|           NULL |      1010 |       1 |              3.00 |             1026 |
|           NULL |      1011 |       1 |              3.00 |             1026 |
|           NULL |      1012 |       1 |              3.00 |             1026 |
...
|           NULL |      5098 |       1 |              3.00 |             1026 |
|           NULL |      5379 |       1 |              3.00 |             1026 |
|           NULL |      5222 |       1 |              3.00 |             1026 |
|           NULL |      5380 |       1 |              3.00 |             1026 |
+----------------+-----------+---------+-------------------+------------------+
11520 rows in set (15.95 sec)

After you query the rows, you query a count from the RENTAL_ITEM table before you insert the rows from your query. This SELECT statement gets you the before value:

SELECT   COUNT(*) AS "Rental_Item before count"
FROM     rental_item;

It should return the following value:

+--------------------------+
| Rental Item Before Count |
+--------------------------+
|                       13 |
+--------------------------+
1 row in set (0.01 sec)

After you insert the records from the query, you re-query a count from the RENTAL_ITEM table. The same SELECT statement gets you the after value:

SELECT   COUNT(*) AS "Rental after count"
FROM     rental_item;

It should return the following value:

+--------------------+
| Rental after count |
+--------------------+
|              11533 |
+--------------------+
1 row in set (0.00 sec)

  1. [15 points] The third SELECT statement requires that you inserted values into the RENTAL_ITEM table in the last step. It also leverages the joins that you worked out in the first and secondSELECT statements. Don’t try to re-invent the wheel because it isn’t profitable, especially in this case.

  • The query to support the MERGE statement to the TRANSACTION table should return 4,372 rows, which is the same number that should have gone into the RENTAL table. Unlike the query for the RENTAL table, you can’t use the DISTINCT operator. You will need to use a SUM function and a GROUP BY clause.
  • Assuming that the query for the first MERGE statement is your starting point, change the LEFT JOIN to RENTAL table to an INNER JOIN because rows must exist in the RENTAL table before you enter rows in the TRANSACTION table.
  • Join the foregoing result set (the foregoing result set is the result set of the prior join) to the COMMON_LOOKUP table on an INNER JOIN, use an alias like cl1 because you’ll have another to this table. This is done to narrow limit the selection of the rows, and you should plan on the following two join statements:
1
2
3
ON      cl1.common_lookup_table = 'TRANSACTION'
AND     cl1.common_lookup_column = 'TRANSACTION_TYPE'
AND     cl1.common_lookup_type = 'Check mapping table to find the correct type value'
  • Join the foregoing result set to the COMMON_LOOKUP table on an INNER JOIN, use an alias like cl2 because you don’t want to conflict with what you did in the prior step. This is done to narrow limit the selection of the rows, and you should plan on the following two join statements:
1
2
3
ON      cl2.common_lookup_table = 'TRANSACTION'
AND     cl2.common_lookup_column = 'PAYMENT_METHOD_TYPE'
AND     cl2.common_lookup_type = 'Check mapping table to find the correct type value'
  • Join the foregoing result set to the TRANSACTION table on an LEFT JOIN. The trick here is figuring out the natural key, which is the compound key of TRANSACTION_ACCOUNT, TRANSACTION_TYPE, TRANSACTION_DATE, TRANSACTION_AMOUNT, PAYMENT_METHOD, and PAYMENT_ACCOUNT_NUMBER columns. You should use the provided mapping table to figure out what goes in the SELECT clause.
  • A GROUP BY clause has restrictions that differ across database implementations. In Oracle, you can’t include aliases that are in the SELECT clause in the GROUP BY clause.
+----------------+---------------------+------------------+------------------+--------------------+-----------+---------------------+------------------------+
| transaction_id | transaction_account | transaction_type | transaction_date | transaction_amount | rental_id | payment_method_type | payment_account_number |
+----------------+---------------------+------------------+------------------+--------------------+-----------+---------------------+------------------------+
|           NULL | 111-111-111-111     |             1028 | 2009-01-03       |               3.00 |      1593 |                1030 | 2222-3333-4444-5555    |
|           NULL | 111-111-111-111     |             1028 | 2009-01-03       |               3.00 |      2099 |                1030 | 2222-3333-4444-5555    |
|           NULL | 111-111-111-111     |             1028 | 2009-01-04       |               6.00 |      1259 |                1030 | 1111-2222-3333-4444    |
|           NULL | 111-111-111-111     |             1028 | 2009-01-04       |              12.00 |      1774 |                1030 | 2222-3333-4444-5555    |
...
|           NULL | 111-111-111-111     |             1028 | 2010-01-23       |               3.00 |      5088 |                1030 | 6011 0000 0000 0078    |
|           NULL | 111-111-111-111     |             1028 | 2010-01-24       |               3.00 |      1565 |                1030 | 2222-3333-4444-5555    |
|           NULL | 111-111-111-111     |             1028 | 2010-01-26       |               3.00 |      2889 |                1030 | 3333-4444-5555-6666    |
|           NULL | 111-111-111-111     |             1028 | 2010-01-28       |               3.00 |      3000 |                1030 | 3333-4444-5555-6666    |
+----------------+---------------------+------------------+------------------+--------------------+-----------+---------------------+------------------------+
4372 rows in set (17.52 sec)

After you query the rows, you query a count from the TRANSACTION table before you insert the rows from your query. This SELECT statement gets you the before value:

SELECT   COUNT(*) AS "Transaction before count"
FROM     transaction;

It should return the following value:

+--------------------------+
| Transaction before count |
+--------------------------+
|                        0 |
+--------------------------+
1 row in set (0.00 sec)

After you insert the records from the query, you re-query a count from the TRANSACTION table. The same SELECT statement gets you the after value:

SELECT   COUNT(*) AS "Transaction after count"
FROM     transaction;

It should return the following value:

+-------------------------+
| Transaction after count |
+-------------------------+
|                    4372 |
+-------------------------+
1 row in set (0.00 sec)

Written by michaelmclaughlin

June 23rd, 2014 at 1:21 am

Posted in

14 Responses to 'Lab #10: MySQL'

Subscribe to comments with RSS or TrackBack to 'Lab #10: MySQL'.

  1. The code at the top to start the file is still the code for oracle. fyi.

    Bryce Franzen

    24 Jun 14 at 4:48 pm

  2. Bryce, Thanks. It’s fixed now.

    michaelmclaughlin

    25 Jun 14 at 12:33 am

  3. still has @ instead of \.

    Ryan

    1 Jul 14 at 6:30 pm

  4. Ryan, Thanks. It’s fixed now.

    michaelmclaughlin

    2 Jul 14 at 12:43 am

  5. The check statement for the Rental_item after count says Rental After count. Its a minor issue but just wanted to point it out. Its in Step 2

    Matt Mason

    10 Jul 14 at 6:43 pm

  6. Matt, Fixed.

    michaelmclaughlin

    22 Oct 14 at 1:24 am

  7. HINT 3 says “You can issue the following command to discover the indexes on the RENTAL_ITEM table” but it is suppouse to say “You can issue the following command to discover the indexes on the MEMBER table”.

    Elfre Valdes

    17 Nov 14 at 5:29 pm

  8. Elfre, Thanks, it’s fixed.

    michaelmclaughlin

    17 Nov 14 at 11:32 pm

  9. Under Hints for tuning MySQL merge performance,

    Hint #1 should be payment_account_number from the transaction_upload table in the index.

    Jeremy

    18 Mar 15 at 11:12 am

  10. All instances of merge statement should be changed to insert statement in lab 10 because we don’t do merges until lab 11.

    Jeremy

    18 Mar 15 at 11:25 am

  11. Jeremy, Good catch, it’s fixed.

    michaelmclaughlin

    24 Mar 15 at 8:51 pm

  12. Jeremy, I believe that the INSERT statement example led to an inaccurate conclusion. I’ve added a qualifying statement to explain that they must solve the problem with MERGE statements.

    michaelmclaughlin

    24 Mar 15 at 9:31 pm

  13. TRUNC() is not a MySQL function. Use DATE() or if subtracting dates use DATEDIFF()

    Jonathan

    26 Mar 15 at 6:11 pm

  14. Jonathan, Thanks, I’ve updated it.

    michaelmclaughlin

    27 Mar 15 at 10:44 pm

Leave a Reply