Database Tutorial

Course Tutorial Site

Site Admin

MySQL CSV Upload

with 4 comments

This section covers how you can stage information in comma separated value files and upload the data directly into a MySQL database. This is extremely useful when seeding data into relation data models and is a critical part of ETL (Extract, Transform, and Load) for data warehousing applications.

You have one approach to uploading CSV file data into a MySQL database. You use the LOAD DATA or LOAD DATA LOCAL statements. There two caveats about using these commands, and they are:

  • You must convert dates to the default MySQL format – YYYYMMDD, or the four digit year, two digit month, and two digit day.
  • You can’t use AUTO_INCREMENT from the source to the loading table.

It’s untrue that you must input dates as strings and convert them in the database. However, you can do that if you like. I’d suggest you open the file in Microsoft Excel, then create a column with the correct data format. You can do that by creating a custom format, or with a formula. I figure the custom format is easiest but sometimes a formula is necessary for applications.

You should note that loading a date at the end of a line has problems in MySQL. The easiest fix that I’ve found is to place a comma at the end of each line. In a Microsoft world, that eliminates the need for the \r from the LINES TERMINATED BY clause.

While this shows a trivial amount of data, here’s a CSV file to use in the test case. Create a directory like Data off your base logical drive or mount point. Then create a file named transaction_upload.csv and put the following data set in it.

20090102,5
20091231,5
20091128,5
20090616,5

Create an importing.sql file in the C:\Data folder with the contents below (please note the line returns are only for Windows). It’s only a recommendation but generally import tables aren’t useful in between uploads, and that’s why I create the table as an in-memory table. In-memory tables are dropped when you recycle (boot) the MySQL instance.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Conditionally drop the table.
DROP TABLE IF EXISTS transaction_upload;
 
-- Create the new upload target table.
CREATE TABLE transaction_upload
( transaction_date   DATE
, transaction_amount FLOAT ) ENGINE=MEMORY;
 
-- Load the data from a file, don't forget the \n after the \r on Windows or
-- it won't work in Windows. You should use only \n in a Linux instance.
LOAD DATA INFILE 'c:/Data/mysql/transaction_upload.csv'
INTO TABLE transaction_upload
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';
 
-- Select the uploaded records.
SELECT * FROM transaction_upload;

Before you connect as anything other than the root account, you’ll need to grant the global FILE privilege to the targeted user. This is true because you’ve not qualified a LOCAL file in the import statement. Users that have all privileges on a given database also have read-write access to LOCAL tables, which is read write to client-side files. When the LOCAL key word is omitted, you’re instructing a read from the server by the client tool. You must therefore grant a global permission to enable a connected user can call a server-side file.

If you want to import without granting any additional global permissions, you can rewrite the statement like this:

1
2
3
4
5
6
LOAD DATA LOCAL INFILE 'c:/Data/mysql/transaction_upload.csv'
INTO TABLE transaction_upload
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r\n';

Or, you can use this in the course’s Fedora Linux instance:

1
2
3
4
5
6
LOAD DATA LOCAL INFILE '/u01/app/mysql/upload/transaction_upload.csv'
INTO TABLE transaction_upload
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\n';

If you don’t want to rewrite the statement, you’ll get the following error:

ERROR 1045 (28000): Access denied FOR USER 'student'@'localhost' (USING password: YES)
Empty SET (0.01 sec)

Run the script:

mysql> \. importing.sql

Then, you should see the following:

+------------------+--------------------+
| transaction_date | transaction_amount |
+------------------+--------------------+
| 2009-01-02       |                  5 |
| 2009-12-31       |                  5 |
| 2009-11-28       |                  5 |
| 2009-06-16       |                  5 |
+------------------+--------------------+
4 rows in set (0.00 sec)

Another quick tidbit, dates can be tricky when they’re the last column in the file. The best solution is to put a comma at the end of each string. If you put the comma at the end of each line, you can also dispense with the \r on the Windows platform.

Written by michaelmclaughlin

October 8th, 2009 at 9:38 pm

Posted in

4 Responses to 'MySQL CSV Upload'

Subscribe to comments with RSS or TrackBack to 'MySQL CSV Upload'.

  1. The “IF(LEN(MONTH(F1))” phrase is missing “=1” and the else portion of the same IF statemet “,CONCATENATE(“”,MONTH(F1))”

    This formula works:

    =CONCATENATE(CONCATENATE(“”,YEAR(E1)),IF(LEN(MONTH(E1))=1,CONCATENATE(“0”,MONTH(E1)),CONCATENATE(“”,MONTH(E1))),IF(LEN(DAY(E1)) = 1,CONCATENATE(“0”,DAY(E1)),CONCATENATE(“”,DAY(E1))))

    Wil G.

    9 Mar 10 at 10:48 pm

  2. The clarification on this line

    LINES TERMINATED BY 'rn';

    needs to be clarified. THere is a good explanation here.

    Since it is \r on Mac and like OSes and \n on Linux and like OSes and then \r\n in Windows. I think an explanation outside code comment would be better.

    Matt

    15 Nov 14 at 12:59 pm

  3. Matt, That’s a good idea. I’ll add the reference.

    michaelmclaughlin

    15 Nov 14 at 4:06 pm

  4. -- Conditionally drop the table.
    DROP TABLE IF EXISTS transaction_upload;
     
    -- Create the new upload target table.
    CREATE TABLE transaction_upload
    ( transaction_date   DATE
    , transaction_amount FLOAT ) ENGINE=MEMORY;
     
    -- Load the data from a file, don't forget the n after the r on Windows or
    -- it won't work in Windows. You should use only n in a Linux instance.
    LOAD DATA INFILE 'c:/Data/mysql/transaction_upload.csv'
    INTO TABLE transaction_upload
    FIELDS TERMINATED BY ','
    ENCLOSED BY '"'
    ESCAPED BY ''
    LINES TERMINATED BY 'rn';
     
    -- Select the uploaded records.
    SELECT * FROM transaction_upload;

    It is missing LOCAL:

    LOAD DATA LOCAL INFILE 'c:/Data/mysql/transaction_upload.csv'
    INTO TABLE transaction_upload
    FIELDS TERMINATED BY ','
    ENCLOSED BY '"'
    ESCAPED BY ''
    LINES TERMINATED BY 'rn';

    TA Matt

    22 Nov 16 at 8:59 pm

Leave a Reply