Solved TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

General discussion forum about all databases other than SqlBase.
olvin

TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by olvin » 12 Feb 2009, 03:54

TD 5.1 SP4
MySQL Enterprise 5.1.30
MySQL ODBC Driver 5.1.5

All the String data are stored in UTF8.

We specify the charset=utf8 in the odbc configuration :

Code: Select all

remotedbname=my56088,Driver={MySQL ODBC 5.1 Driver};Server=localhost;Port=3306;Database=SAC56088;Charset=UTF8
CHAR and VARCHAR :

When we try to read data stored in utf8 from a CHAR or VARCHAR column => no problem with SQLDescribeCol, SQLBindCol and SQLFetch in sqlodb32.dll => TD 5.1 convert the text to UTF16
But when we try to insert or update data or to bind a variable in the where clause, I think that TD5.1 sends data in LATIN1 to MySQL (LATIN1 is the default enconding on my computer)... And when we try to convert to UTF8 before sending the data, it replaces all the characters with '?'... So, For me, it's a bug in sqlodb32.dll when calling the functions SQLDescribeParam and SQLBindParameter
For example, when we put the string 'éè' in a variable called gsTest, the following query give us the following error message :
Query : "update errors_list set erl_val_d = :gsTest where erl_code = '00001'"
Error message : "MySQL:1366[MySQL][ODBC 5.1 Driver][mysqld-5.1.30-enterprise-gpl-pro-log]Incorrect string value: '\xE9\xE8' for column 'ERL_VAL_D' at row 1"
But when we execute the following query, there's no problem : "update errors_list set erl_val_d = '" || gsTest || "' where erl_code = '00001'"
So, please, when we specify the encoding in the ODBC configuration, use it to send the data in this encoding when there's binding to do.

TEXT :

With the data types TINYTEXT, TEXT and MEDIUMTEXT => no problem to read the data stored with charset utf8 => TD 5.1 convert the text to UTF16 with no problem in a SELECT
But we must convert UTF16 to UTF8 when we do an insert or an update (and so, we can't use SalTblDoInserts and SalTblDoUpdates !) :

Code: Select all

Call SalStrToMultiByte( gsTest, gsTest, ENC_UTF8)
Call SalSetBufferLength(gsTest, SalGetBufferLength( gsTest) + 1)
So, one improvement would be to convert UTF16 to the charset specified into the ODBC connection (or the default charset of the database) when we are doing an INSERT or an UPDATE.
Or "the must" would be a property associated to a column component (in a table or in a grid) to indicate the charset used in the database and that property would be used by TD 5.1 to convert the encoding.


And the data type LONGTEXT can't be use because the data are truncated and can't be read properly.

This refers to my original post here :
https://support.guptatechnologies.com/s ... =18&t=2493

Olvin
Last edited by olvin on 16 Feb 2009, 04:56, edited 1 time in total.

olvin

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by olvin » 16 Feb 2009, 03:19

Please, somebody from Unify can answer to my post to see if my analysis of the problem is good ?

Thanks.

olvin

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by olvin » 10 Mar 2009, 08:59

From this topic https://support.guptatechnologies.com/su ... =20&t=2239, we can read that you suggest to store the Unicode data in binary column when the database engine is NON-UNICODE.
It works well with MySQL and BLOB datatype.

But to avoid this and to avoid the using of "SqlSetLongBindDatatype", could you please admit that there's a bug and the data is converting to wrong encoding (LATIN-1) when there are binding string variables to store in UTF-8 CHAR and VARCHAR columns through an ODBC connection ?
And fix it rapidly ?

Because we have bought a licence of TD 5.1 only to enable Unicode data in our application and it doesn't work completely...

I attach the ODBC.LOG generated when I try to update a UTF-8 VARCHAR column with unicode data, so you can see this error :

Code: Select all

[MySQL][ODBC 5.1 Driver][mysqld-5.1.31-enterprise-gpl-pro-log]Incorrect string value: '\xE9?\xF9\xA9' for column 'ERL_VAL_D' at row 1 (1366)
So, could you at least tell me if you will plan to test it and correct it ?
If no, we will apply another solution... (e.g. storing data in binary column)

Thanks,

Olvin
You do not have the required permissions to view the files attached to this post.

olvin

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by olvin » 11 Mar 2009, 01:20

I think I have found the error in the ODBC-TD5.1.LOG.

Here is the log of executing a query with a binding variable (trying to update a column with UTF-8 VARCHAR data type) :

Code: Select all

test_mysql      1b30-1e88   ENTER SQLPrepareW 
      HSTMT               02C929F0
      WCHAR *             0x01DFE608 [      -3] "Update errors_list set erl_val_d = ?          where erl_code = '00001'\ 0"
      SDWORD                    -3

test_mysql      1b30-1e88   EXIT  SQLPrepareW  with return code 0 (SQL_SUCCESS)
      HSTMT               02C929F0
      WCHAR *             0x01DFE608 [      -3] "Update errors_list set erl_val_d = ?          where erl_code = '00001'\ 0"
      SDWORD                    -3

test_mysql      1b30-1e88   ENTER SQLDescribeParam 
      HSTMT               02C929F0
      UWORD                        1 
      SWORD *             0x0012E448
      SQLULEN *           0x0012E434
      SWORD *             0x0012E44C
      SWORD *             0x00000000

test_mysql      1b30-1e88   EXIT  SQLDescribeParam  with return code 0 (SQL_SUCCESS)
      HSTMT               02C929F0
      UWORD                        1 
      SWORD *             0x0012E448 (12)
      SQLULEN *           0x0012E434 (255)
      SWORD *             0x0012E44C (0)
      SWORD *             0x00000000

test_mysql      1b30-1e88   ENTER SQLBindParameter 
      HSTMT               02C929F0
      UWORD                        1 
      SWORD                        1 <SQL_PARAM_INPUT>
      SWORD                        1 <SQL_C_CHAR>
      SWORD                        1 <SQL_CHAR>
      SQLULEN                  255
      SWORD                        0 
      PTR                0x01E009F0
      SQLLEN                   255
      SQLLEN *            0x01E00974

test_mysql      1b30-1e88   EXIT  SQLBindParameter  with return code 0 (SQL_SUCCESS)
      HSTMT               02C929F0
      UWORD                        1 
      SWORD                        1 <SQL_PARAM_INPUT>
      SWORD                        1 <SQL_C_CHAR>
      SWORD                        1 <SQL_CHAR>
      SQLULEN                  255
      SWORD                        0 
      PTR                0x01E009F0
      SQLLEN                   255
      SQLLEN *            0x01E00974 (4)

test_mysql      1b30-1e88   ENTER SQLExecute 
      HSTMT               02C929F0

test_mysql      1b30-1e88   EXIT  SQLExecute  with return code -1 (SQL_ERROR)
      HSTMT               02C929F0

      DIAG [S1000] [MySQL][ODBC 5.1 Driver][mysqld-5.1.31-enterprise-gpl-pro-log]Incorrect string value: '\xE9?\xF9\xA9' for column 'ERL_VAL_D' at row 1 (1366) 
You can see that the specified C data type is SQL_C_CHAR and the SQL data type is SQL_CHAR.
But as you can see here : http://msdn.microsoft.com/en-us/library ... S.85).aspx# and here : http://msdn.microsoft.com/en-us/library ... S.85).aspx, this is for ANSI string.
For Unicode string, the C data type must be SQL_C_WCHAR and the SQL data type must be SQL_WCHAR or SQL_WVARCHAR.

And then, the bug is this one :
TD 5.1 sends Unicode string to the MySQL ODBC Driver but TD tell it that it's ANSI string !

So, I think that you must only change the data types when using the function "SQLBindParameter" in the DLL "Sqlodb32.dll".

olvin

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by olvin » 12 Mar 2009, 02:57

I have made a little program in C++ to test it and to find the solution.
Here are the results :

I put the code that I think you have in "Sqlodb32.dll" (you must just replace the password data) :

Code: Select all

/*******************
  GOOD FOR ANSI ONLY
*******************/
#define NEW_VALUE_LEN 6 // 5+1(null-terminated char.)

SQLHENV henv = NULL;
SQLHDBC hdbc = NULL;
SQLRETURN retcode;
SQLHSTMT hstmt = NULL;

SQLCHAR newValue[NEW_VALUE_LEN];
SQLINTEGER cbnewValue = SQL_NTS;

SQLWCHAR* InConnStr = L"Driver={MySQL ODBC 5.1 Driver};Server=localhost;Port=3310;Database=sal56088cp;Charset=UTF8;user=SYSADM;password=*******";


   retcode = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &henv);
   retcode = SQLSetEnvAttr(henv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER*)SQL_OV_ODBC3, 0);
   retcode = SQLAllocHandle(SQL_HANDLE_DBC, henv, &hdbc);

   retcode = SQLDriverConnect( 
               hdbc, 
               NULL, 
            InConnStr,
               SQL_NTS,
               NULL,
               0, 
               NULL,
               SQL_DRIVER_NOPROMPT );

   retcode = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);

   retcode = SQLPrepare(hstmt, (SQLWCHAR*)L"Update errors_list set erl_val_d = ?          where erl_code = '00001'", SQL_NTS);
   strcpy_s((char*)newValue, _countof(newValue), "étèùé");
   retcode = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR, NEW_VALUE_LEN, 0, newValue, 0, &cbnewValue);
   
   retcode = SQLExecute(hstmt);
   SQLFreeStmt(hstmt, SQL_DROP);

   SQLDisconnect(hdbc);
   SQLFreeHandle(SQL_HANDLE_DBC, hdbc);
   SQLFreeHandle(SQL_HANDLE_ENV, henv);
So, it produces the same error posted above (MySQL try to insert ANSI data to an UTF column and produce an error)

But if I modify the code like this :

Code: Select all

/*****************
  GOOD FOR UNICODE
*******************/
#define NEW_VALUE_LEN 6 // 5+1(null-terminated char.)

SQLHENV henv = NULL;
SQLHDBC hdbc = NULL;
SQLRETURN retcode;
SQLHSTMT hstmt = NULL;

SQLWCHAR newValue[NEW_VALUE_LEN];
SQLINTEGER cbnewValue = SQL_NTS;

SQLWCHAR* InConnStr = L"Driver={MySQL ODBC 5.1 Driver};Server=10.184.10.142;Port=3310;Database=sal56088cp;Charset=UTF8;user=SYSADM;password=*******";


   retcode = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &henv);
   retcode = SQLSetEnvAttr(henv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER*)SQL_OV_ODBC3, 0);
   retcode = SQLAllocHandle(SQL_HANDLE_DBC, henv, &hdbc);

   retcode = SQLDriverConnect( 
               hdbc, 
               NULL, 
            InConnStr,
               SQL_NTS,
               NULL,
               0, 
               NULL,
               SQL_DRIVER_NOPROMPT );

   retcode = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);

   retcode = SQLPrepare(hstmt, (SQLWCHAR*)L"Update errors_list set erl_val_d = ?          where erl_code = '00001'", SQL_NTS);
   wcscpy_s(newValue, L"étèùé");
   retcode = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_WCHAR, SQL_WCHAR, NEW_VALUE_LEN, 0, newValue, 0, &cbnewValue);
   
   retcode = SQLExecute(hstmt);
   SQLFreeStmt(hstmt, SQL_DROP);

   SQLDisconnect(hdbc);
   SQLFreeHandle(SQL_HANDLE_DBC, hdbc);
   SQLFreeHandle(SQL_HANDLE_ENV, henv);
the string is well sent in Unicode to MySQL !

Here are the changes :

...
SQLWCHAR newValue[NEW_VALUE_LEN];
...
wcscpy_s(newValue, L"étèùé");
...
retcode = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_WCHAR, SQL_WCHAR, NEW_VALUE_LEN, 0, newValue, 0, &cbnewValue);
...

I have also tested it with an ODBC connection to an Oracle database and the results are the same.

So, I give you the changes to make in your "Sqlodb32.dll" to work with Unicode through ODBC connections.
Could you make this changes in the SP5 refresh ?

Thanks,

Olvin

vg

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by vg » 13 Mar 2009, 04:54

On man, I think your are on to something, confirming my old suspictions.

Unify, please check all your routers, including OLEDB :!:

caa

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by caa » 13 Mar 2009, 14:31

I'm looking at this now...

Can I get the full SQL to create this database for MySQL, please? I'm especially interested in the encoding.

Also, what code page is being used for the database itself?

olvin

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by olvin » 16 Mar 2009, 02:06

Good to know that you are interested :D

In the following attachment, you can see the values of the charsets of our MySQL Server and the database "sal56088cp" for our tests.
MySQL-server-charsets.jpg
All is utf8 (database, client connection, ...) except the character set of the server which is latin1.
The charset of the table and the columns is also utf8.

As you ask, I have also posted a SQL script which is generated with MySQL Administrator (only one table for the tests).
sal56088cp-TD51.zip
The database name is sal56088cp.
The table name is errors_list and the varchar column used for the test is erl_val_d;

Thanks,

Olvin
You do not have the required permissions to view the files attached to this post.

caa

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by caa » 17 Mar 2009, 14:19

What you've described above appears to be very specific to MySQL.

Normally we query the driver to see if it can handle converting data to Unicode. If they driver says it can, we let it do so. However, the MySQL driver says it cannot... so we modified our code to look at the connection string in the case of MySQL only. If it says Charset=UTF8 or UCS2, we treat it as though the driver can translate Unicode data.

This passes our tests here. I'll include it in SP6.

This is defect TD-6167.

olvin

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by olvin » 18 Mar 2009, 00:31

Ok, thanks.

But would you like to tell me how you query the MySQL ODBC driver ? => To make it return the good value and like that we will forward the correction to do to MySQL.
Because we have "special" support with the ODBC team of MySQL.

Because, in the next few years, we will perhaps use the UTF-16 encoding (not UCS-2) that will be able in MySQL 6.0 and so, there will be "charset=UTF16" in the ODBC configuration.

Thanks,

Olvin

olvin

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by olvin » 20 Apr 2009, 07:51

The problem has been fixed in SP6.

Thanks !

olvin

Re: TD-6167: TD5.1SP4 : CHAR, VARCHAR and TEXT encoding in MySQL [ODBC]

Post by olvin » 05 Feb 2010, 14:23

Hi,

Sun tells us that this problem is corrected in a hot fix of MySQL ODBC Driver. It will be available in the 5.1.7 version of the driver.
So, you don't need to look again at the connection string for the next release of TD.

Olvin

Return to “General Discussion”

Who is online

Users browsing this forum: [Ccbot] and 0 guests