[Mono-dev] Possible multiple errors/bugs in Odbc stack, regarding handling of strings with non-ascii characters

Mads Bondo Dydensborg mbd at dbc.dk
Wed Jul 4 10:45:47 EDT 2007


Hi there

I believe I have spotted an issue in the Mono Odbc stack regarding the 
handling of strings containing characters that have multibyte 
representations in UTF-8.

What I have seen, when talking to an Odbc database, is that strings passed to 
the database lose their tail when they contain e.g. the Danish letter æ, 
which has the multibyte representation 'c3 a6' in UTF-8. 

An example: the following request is sent to the Odbc stack:

"UPDATE PublishingJob SET Name = 'foo' WHERE JobId = 2.000000"

This appears in the odbctrace identical to this string.

However, when changing foo to foæ, the string appears as:

"UPDATE PublishingJob SET Name = 'fo' WHERE JobId = 2.00000"

"foæ" appears as "fo" - I believe this is due to a limitation in the log 
mechanism, as the value is correctly set in the database. Note, however, the 
change from "2.000000" to "2.00000". This is no problem in this query, but 
this query:

"UPDATE PublishingJob SET Name = 'ææææææææ' WHERE JobId = 2.000000"

is traced to this:

"UPDATE PublishingJob SET Name = '' WHERE JobId = "

and the Odbc driver/database won't accept that as valid SQL...

I believe the issue is in OdbcCommand.cs, in the method ExecSQL, and quite 
possibly in other places:

libodbc.cs:
		[DllImport("odbc32.dll")]
		internal static extern OdbcReturn SQLExecDirect (IntPtr StatementHandle,
			string StatementText, int TextLength);

OdbcCommand.cs:

		private void ExecSQL(string sql)
		{
			OdbcReturn ret;

...
				
				ret=libodbc.SQLExecDirect(hstmt, sql, sql.Length);


The issue here is that the sql string is marshalled by 
System.Runtime.InteropServices, eventually into a char*, which may contain 
multibyte representations of the chars of sql. However, sql.Length returns 
the number of Chars in sql, which is only a lower bound on the byte length of 
the char* that sql is eventually transformed into. For a string containing 
multibyte characters, TextLength is therefore too small, and the driver sees 
a truncated statement.
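
To check that this explanation fits the traces, here is a small model of the 
suspected behaviour. It is written in Python rather than C#, it is not the 
Mono code, and the function name as_seen_by_driver is made up for 
illustration. It assumes the statement reaches the driver as UTF-8 bytes 
while TextLength holds the character count, and that the driver reads exactly 
TextLength bytes.

```python
def as_seen_by_driver(sql: str) -> str:
    """Model the suspected bug: UTF-8 bytes go out, but the length passed
    along is the character count (the stand-in for sql.Length)."""
    utf8 = sql.encode("utf-8")   # assumed result of marshalling the string
    # Python's len() counts code points; for the characters used here this
    # equals the number of UTF-16 Chars that C#'s Length would report.
    text_length = len(sql)
    # Assumption: the driver reads exactly text_length bytes.
    return utf8[:text_length].decode("utf-8", errors="replace")


queries = [
    "UPDATE PublishingJob SET Name = 'foo' WHERE JobId = 2.000000",
    "UPDATE PublishingJob SET Name = 'foæ' WHERE JobId = 2.000000",
    "UPDATE PublishingJob SET Name = 'ææææææææ' WHERE JobId = 2.000000",
]

for q in queries:
    lost = len(q.encode("utf-8")) - len(q)
    print(f"{lost} byte(s) lost: {as_seen_by_driver(q)!r}")
```

Under these assumptions one æ costs the statement its last byte, and eight æ 
cost it the whole "2.000000", which matches the tails seen in the odbctrace.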

FYI: AFAICT, the sql is transformed in 
marshal.c:

mono_string_to_lpstr (MonoString *s) 
   mono_string_to_utf8 (s)
     g_utf16_to_utf8
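
The effect of a UTF-16 to UTF-8 conversion on length can be checked outside 
Mono. The following Python sketch does not call the runtime's C functions; 
the helper names utf16_units and utf8_bytes are hypothetical. It only 
compares, per character, the number of UTF-16 code units (what a Char count 
measures) with the number of UTF-8 bytes.

```python
def utf16_units(s: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding of s."""
    return len(s.encode("utf-16-le")) // 2


def utf8_bytes(s: str) -> int:
    """Number of bytes in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))


# ASCII, the Danish letter from the example, a 3-byte character, and a
# character outside the Basic Multilingual Plane (a surrogate pair in UTF-16).
for ch in ["o", "æ", "€", "😀"]:
    print(repr(ch), utf16_units(ch), utf8_bytes(ch), ch.encode("utf-8").hex(" "))
```

In every case the byte count is at least the code unit count, and it is 
strictly larger for anything that is not plain ASCII.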

My question is then: Can anybody confirm this is an issue?
Any suggestions for a fix?

I need this to work quite badly, so any help is appreciated.

Regards,

Mads

-- 
Med venlig hilsen/Regards

Systemudvikler/Systemsdeveloper cand.scient.dat, Ph.d., Mads Bondo Dydensborg
Dansk BiblioteksCenter A/S, Tempovej 7-11, 2750 Ballerup, Tlf. +45 44 86 77 34



