Writing SQL for both MySQL and DB2

04 May 2006 »

Update: A new version of the MySQL to DB2 Conversion Guide Redbook from IBM has just been published in December 2009. It now includes some of the tips listed here.

The MySQL to DB2 UDB Conversion Guide is the definitive resource for migrating data and applications from MySQL to DB2. Developing PHP Applications for IBM Data Servers provides a few pointers specifically for migrating from MySQL 5.

Neither book really covers another common scenario involving these two database systems, which is to write PHP and SQL which can support either regardless of which is chosen at runtime. The following tips will help you to smoothly transition your code by supporting both. They are grouped by the following topics:

A hat tip to Tim Dietrich for discovering and working through some of these issues with me. I intend to clarify and improve this page over time, so please add your additions or improvements below.

Columns, joins and grouping

  • The following SQL syntax is legal in either database:
    SELECT * FROM table_name;

    The following is legal in MySQL, but will not work in DB2

    SELECT *, column_name AS name FROM table_name;

    To make it work in both databases, you must qualify the wildcard with a table name:

    SELECT table_name.*, column_name AS name FROM table_name;
  • MySQL allows you to use a GROUP BY clause without specifying each of the column names not operated by on an aggregate function in the column list.

    The following will work in MySQL, but not DB2 because it can return an unexpected result:

    SELECT manager, location, AVG(salary)
    FROM employee
    GROUP BY manager;

    You need to explicitly instruct DB2 to order by all column names to remove any ambiguity.

    SELECT manager, location, AVG(salary)
    FROM employee
    GROUP BY manager, location;
  • When performing an outer join on a pair of tables while also retrieving data from a third table, you need to place the tables used for the outer join adjacent to the JOIN keywords.

    MySQL will let you put the tables after the FROM clause in any order, but DB2 will not.

    SELECT manager, location
    FROM project, employee
    LEFT JOIN org ON project.id = org.id;

    Rewrite the query like this.

    SELECT manager, location
    FROM employee, project
    LEFT JOIN org ON project.id = org.id;
  • MySQL provides a pair of constructs that allow you to issue a single query that becomes an INSERT or and UPDATE depending on whether the values you are inserting into a row with a key already exist. REPLACE INTO deletes the existing row and performs a new insertion (resulting in a new primary key value). ON DUPLICATE KEY UPDATE replaces only the values for that key.

    DB2 provides MERGE for a similar, but not equivalent purpose.

    To make both databases happy, you will need to perform the check for an existing row manually with a SELECT query, then perform an INSERT or UPDATE depending on the result.

Data types

  • 64K Character Large Objects (CLOBs) are DB2′s equivalent to MySQL’s TEXT data type. Unfortunately, CLOBs cannot be used in conjunction with a SELECT DISTINCT query (SQL0134N). You should instead consider using a large VARCHAR() column to store similar data in DB2 to get around this restriction.

    VARCHARs are limited to 32K, and will require a tablespace buffer large enough to contain them in result sets. The default tablespace buffer is 4K. You can also opt for 8K, 16K and 32K. Following is a sample of how to create a bufferpool and temporary tablespace of 32K in DB2.

    CREATE DATABASE SAMPLE USING CODESET UTF-8 TERRITORY US;

    CONNECT TO SAMPLE;

    CREATE BUFFERPOOL smp_bp_32k SIZE 250 PAGESIZE 32 K;

    CREATE TABLESPACE smp_ts_32k PAGESIZE 32 K MANAGED BY SYSTEM USING (‘SMP_TS_32K’) BUFFERPOOL smp_bp_32k;

    CREATE SYSTEM TEMPORARY TABLESPACE smp_tempts_32k PAGESIZE 32 K MANAGED BY SYSTEM USING (‘SMP_TEMPTS_32K’) EXTENTSIZE 32 PREFETCHSIZE 32 BUFFERPOOL smp_bp_32k;

    CONNECT RESET;

    CONNECT TO SAMPLE;

  • Following on the above, you will only be able to join data from large VARCHAR columns until you have reached the tablespace bufferpool limit, for example, if you have a 32K tablespace, you will only be able to join about 3 VARCHAR(9000) columns, allowing space for other data type columns.
  • When performing an INSERT into a table which has NOT NULL columns, DB2 won’t let you issue an update which does not provide a value for the NOT NULL fields which don’t have a default. MySQL will silently insert a blank string in the following case.
    – MySQL
    CREATE TABLE test (
        id      INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
        name    VARCHAR(64) NOT NULL,
        value   VARCHAR(64),
        PRIMARY KEY(id)
    );

    INSERT INTO test (value) VALUES (‘Dan’);

    – DB2
    CREATE TABLE test (
        id      INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1),
        name    VARCHAR(64) NOT NULL,
        value   VARCHAR(64),
        PRIMARY KEY(id)
    );

    INSERT INTO test (value) VALUES (‘Dan’);

Date and time

  • DB2 users often employ the CURRENT TIMESTAMP special register. MySQL developers use the NOW() function. However, CURRENT_TIMESTAMP works in both.
  • As a caveat to the above, MySQL’s CURRENT_TIMESTAMP and DB2′s CURRENT_TIMESTAMP differ in their degree of accuracy. DB2 reports microseconds, while MySQL only reports to the second. For example:
    mysql> SELECT CURRENT_TIMESTAMP;
    2006-05-04 19:18:53
    db2 => SELECT CURRENT_TIMESTAMP FROM SYSIBM.SYSDUMMY1;
    2006-05-04-19.18.53.583000

    You can work around this by lopping off the milliseconds from the DB2 result:

    VARCHAR_FORMAT(CURRENT_TIMESTAMP, ‘YYYY-MM-DD HH24:MI:SS’)

Functions

  • If you’re using a function provided by MySQL but has no equivalent in DB2, first see if there’s a PHP function that does what you need, then look into a DB2 SQL UDF (DB2 makes UDFs easy).

    For example, DB2 has no equivalent to the UNIX_TIMESTAMP() function in MySQL. You can emulate that by defining one or more DB2 SQL UDFs:

    CREATE FUNCTION UNIX_TIMESTAMP ()
    RETURNS INTEGER
    RETURN
    SELECT TIMESTAMPDIFF(2,CHAR(CURRENT_TIMESTAMP - TIMESTAMP(’1970-01-01-00.00.00.000000′)))
    FROM SYSIBM.SYSDUMMY1;
    CREATE FUNCTION UNIX_TIMESTAMP (ts VARCHAR(255))
    RETURNS INTEGER
    RETURN
    SELECT TIMESTAMPDIFF(2,CHAR(CAST(ts AS TIMESTAMP) - TIMESTAMP(’1970-01-01-00.00.00.000000′)))
    FROM SYSIBM.SYSDUMMY1;
  • Use the SQL keyword OR instead of || as the double pipes are the concatenation operator in DB2.

Predicates

  • DB2 and MySQL use different syntax to get the id of the last row inserted. The snippet must be provided at runtime by your application code.
    mysql> SELECT LAST_INSERT_ID();
    db2 => SELECT SYSIBM.IDENTITY_VAL_LOCAL() FROM SYSIBM.DUAL;

PHP

  • If you are using prepared statement placeholders in PEAR DB, you will need to explicitly cast any integer or floating point numbers passed as parameters to the query function. MySQL doesn’t care if you quote your numbers, DB2 does.

    This code will work in MySQL, but will not work in DB2:

    $id = ’1′;
    $result =& $db->query(‘SELECT * FROM users WHERE id = ?’, $id);

    This code will work for both databases:

    $id = ’1′;
    $id = (int) $id;
    $result =& $db->query(‘SELECT * FROM users WHERE id = ?’, $id);
  

16 Responses to 'Writing SQL for both MySQL and DB2'

Subscribe to comments with RSS

  1. Jason said,

    05 May 2006 at 12:18 pm

    “Following on the above, you will only be able to join data from large VARCHAR columns until you have reached the tablespace bufferpool limit”

    Exactly.

  2. Archiuyero said,

    23 August 2006 at 3:39 am

    Really helpful :)

  3. George Frick said,

    24 October 2006 at 11:21 am

    Very Helpful. Thanks for the info on merge Vs. on duplicate. What a shame….

  4. Charles Forsythe said,

    15 November 2006 at 12:16 pm

    Great article.

    One nitpick, DB2 returns the timestamp with microsecond, not millisecond accuracy. This might be of significance to Java developers who typically have millisecond-accurate times.

  5. Mike said,

    15 December 2006 at 6:10 am

    Dan,
    great article/presentation, thanks!

    One small finding (I might be wrong?), on slide 11 where you describe the insert with NULL values, I think the example for DB2 is wrong?

    INSERT INTO test (value) VALUES (‘Dan’);

    ?

    cu,
    Mike

  6. krook said,

    15 December 2006 at 12:07 pm

    Hi Mike,

    Yes, that DB2 insert will fail, whereas the MySQL one will succeed with unexpected results. :)

  7. fakhre said,

    05 September 2007 at 4:08 am

    i want to write something like below in php
    but infact getting nothing…
    $rsFnd=mysql_query(“SELECT distinct c.category_id, c.category FROM stbl_category c, stbl_company com WHERE (com.company LIKE ‘%$txtFind%’ or com.address like ‘%$txtFind%’ or com.comp_id like or ‘%$txtFind%’ com.keywords like ‘%$txtFind%’ or c.category like ‘%$txtFind%’) and (com.category_id=c.category_id)”);

    can someone please explain me where the problem actually is?
    it looks some php syntax error; is it?

  8. Daniel Krook said,

    05 September 2007 at 9:59 am

    Hi Fakhre,

    You have an error in your SQL syntax. You’re missing a condition after the third LIKE clause: “com.comp_id like [missing ‘%$txtFind%’ here] or”

  9. Augustine said,

    10 November 2007 at 3:07 pm

    hi

    This tips on mysql and db2 are very helpful
    keep the good job
    Thanks.

  10. Ray said,

    05 December 2007 at 6:59 pm

    Attempting to insert or update a row in a table for credit cards that has the card number as varchar and expiration date as DATE. Both may be NULL. When db2_execute is passed NULL as the parameter as in “expdate = NULL”, I get an SQL0180N error “The syntax of the string representation of a datetime value is incorrect.” The SQL statement can be cut and pasted into the Command Center and it is valid. Setting both columns to actual values works so that leaves me wuth the question, How do you set a nullable column to NULL using db2_* functions from PHP ? I know I can change the string and parameters being passed, but NULL is a valid keyword and it should be processed as such.

    Thanks in advance.

  11. Daniel Krook said,

    11 January 2008 at 9:26 am

    Ray,

    Sorry for the delayed reply, have you contacted the IBM maintainers of the ibm_db2 PECL extension about this?

    http://pecl.php.net/package/ibm_db2

  12. Brock Dulaney said,

    31 January 2008 at 3:05 pm

    Great site. Will be bookmarking this as a resource!

    -B

  13. Ray said,

    31 January 2008 at 7:11 pm

    Daniel: That’s OK, I was busy too.

    No, I have not contacted IBM. My thought was that being so new to PHP, that I was making a rookie mistake. I will send this question on to them now.

    Thanks.

  14. Mini said,

    06 December 2008 at 9:56 am

    hi,
    thnx a lot Daniel. this post was really very helpful…thnx a ton! :)

  15. Jim said,

    13 August 2009 at 11:50 am

    When working with dates, the odds are pretty good that you will need access to today’s date. I have several books and have looked all over the IBM site, finally I have found the correct statement to access the current date. Thank you for publishing what should be so much easier to find.

  16. Wayne said,

    01 April 2010 at 12:58 pm

    Hi Dan,

    I found there are a little bit incorrect DB2 queries.
    db2 => SELECT SYSIBM.IDENTITY_VAL_LOCAL() FROM table_name;
    should be :SELECT SYSIBM.IDENTITY_VAL_LOCAL() FROM SYSIBM.DUAL;
    “FROM TABLE_NAME’ will introduce performance issue.
    BTW: IDENTITY_VAL_LOCAL() will return a new id to you even if the last insert SQL isn’t executed successfully.

    For “group by”, we should really take care.
    if data is as follow:
    manager location salary
    Dan NY 10K
    Martina NY 12K
    Martina NJ 10K

    then, in MySQL:
    SELECT manager, location, AVG(salary)
    FROM employee
    GROUP BY manager;
    you will get 2 rows, that is what you want to get.

    but in DB2:
    SELECT manager, location, AVG(salary)
    FROM employee
    GROUP BY manager, location;
    you will get 3 rows.

    Therefore, if you really want to group by ‘manager’ only, be care of the group by clause.

    We follow your blog on our migration activity, it is really helpful for us. That is all what I found during migration, hope it is helpful for you.

    Thanks