[Mono-dev] Handling UTF8 strings containing nul
Rob Wilkens
robwilkens at gmail.com
Sun Jun 24 23:51:56 UTC 2012
I am not an expert; I just have a suggestion, and I don't know that my
suggestion is any better than your solution. But I figure it couldn't
hurt to share.
From what I saw, someone replied to your message here about how to do it:
https://mail.gnome.org/archives/gtk-list/2012-June/msg00023.html
I agree the reallocs may be bad. Not knowing anything else, I wonder
whether you could pre-allocate a buffer up front of length x 2 bytes
(going from 8-bit to 16-bit units should in theory at most double the
size, presuming that's the difference between UTF-8 and UTF-16; I don't
know for sure).
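
To make the size estimate concrete, here is a small sketch that counts how
many 16-bit units a UTF-8 buffer needs. The function name is made up for
illustration and is not part of glib or eglib, and it assumes the input is
well-formed UTF-8. Each 1-, 2- or 3-byte sequence becomes one unit and each
4-byte sequence becomes two, so the count should never exceed the byte
length, which is why length x 2 bytes looks like a safe allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a glib/eglib function).
 * Counts the UTF-16 code units needed for `len` bytes of well-formed UTF-8.
 * An embedded nul byte is counted like any other one-byte character. */
size_t utf16_units_for_utf8(const unsigned char *s, size_t len)
{
    size_t units = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char b = s[i];

        if ((b & 0xC0) == 0x80)
            continue;               /* continuation byte: no new unit */
        if (b >= 0xF0)
            units += 2;             /* 4-byte sequence: surrogate pair */
        else
            units += 1;             /* 1-, 2- or 3-byte sequence */
    }

    /* Holds for well-formed input: at most one unit per input byte. */
    assert(units <= len);
    return units;
}
```

For the 3-byte input "a", nul, "b" this returns 3, so the embedded nul takes
one unit in the output just like the other characters.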
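Something like this (an untested sketch, so treat it as a guide rather
than working code):

    /* buf is a gunichar2 *. Worst case is one 16-bit unit per input byte. */
    buf = g_malloc0 (length * sizeof (gunichar2));
    inpos = outpos = 0;
    while (inpos < length) {
        /* Converts up to the next nul or the end of the input. */
        ut = g_utf8_to_utf16 (text + inpos, length - inpos,
                              &bytes_read, &words_written, &error);
        if (ut == NULL)
            break;    /* error has been set */
        memcpy (buf + outpos, ut, words_written * sizeof (gunichar2));
        g_free (ut);
        inpos += bytes_read;
        outpos += words_written;
        if (inpos < length) {
            /* Stopped at a nul: skip it; buf is already zeroed there. */
            inpos++;
            outpos++;
        }
    }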
That was pulled out of my head, and I am not familiar enough with UTF
strings to know if it would work. I'm just guessing you're converting
from something that's 8 bits to something that's 16 bits, so it would be
length*2 to alloc.
Use my code above more as a guide to what _I_ have in mind, whether or
not it is right; someone else should feel free to correct me.
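
For comparison, the approach suggested in the pull request quoted below (a
conversion function that takes an explicit length and does not stop at nul)
could look roughly like the following. This is only a sketch under my own
assumptions: the name utf8_to_utf16_len and its signature are invented here
and are not the eglib API, and it uses plain malloc-family calls rather than
glib's allocators. It allocates once and makes a single pass, so there are no
reallocs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of glib or eglib.
 * Converts `len` bytes of UTF-8 (embedded nuls allowed) to UTF-16.
 * Returns a calloc'd buffer and stores the unit count in *out_units;
 * returns NULL on malformed input or allocation failure. */
uint16_t *utf8_to_utf16_len(const unsigned char *s, size_t len,
                            size_t *out_units)
{
    /* Smallest code point allowed for each sequence length (no overlongs). */
    static const uint32_t min_cp[4] = { 0x0, 0x80, 0x800, 0x10000 };
    /* One unit per input byte is the worst case; +1 for a terminator. */
    uint16_t *out = calloc(len + 1, sizeof *out);
    size_t i = 0, n = 0;

    if (out == NULL)
        return NULL;

    while (i < len) {
        unsigned char b = s[i];
        uint32_t cp;
        size_t extra;

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else goto fail;              /* stray continuation or bad lead byte */

        if (extra > len - i - 1)
            goto fail;               /* sequence runs past the end */

        for (size_t k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                goto fail;           /* expected a continuation byte */
            cp = (cp << 6) | (uint32_t)(s[i + k] & 0x3F);
        }
        i += extra + 1;

        if (cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            goto fail;               /* overlong, out of range, or surrogate */

        if (cp < 0x10000) {
            out[n++] = (uint16_t)cp; /* includes U+0000 for an embedded nul */
        } else {
            cp -= 0x10000;           /* encode as a surrogate pair */
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }

    assert(n <= len);                /* the pre-allocated size was enough */
    out[n] = 0;
    *out_units = n;
    return out;

fail:
    free(out);
    return NULL;
}
```

The caller has to use the returned unit count rather than scanning for a
terminator, since the result may itself contain zero units.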
I am _not_ an expert, just a newbie with a little bit of C programming
experience in my very distant past.
-Rob
On 06/24/2012 07:03 PM, Weeble wrote:
> Having diagnosed this bug (when an attribute has a string argument and
> the string contains nul, it gets truncated), I've been trying to find
> a way to fix it: https://bugzilla.xamarin.com/show_bug.cgi?id=5732
>
> My first attempt just tried to use the available functions in glib,
> but it wasn't acceptable because it involved potentially a great many
> inefficient reallocs: https://github.com/mono/mono/pull/346
>
> In that pull request, Rodrigo Kumpera recommends that since mono has
> its own implementation of glib, it would be better to introduce a new
> version of g_utf8_to_utf16 that can handle embedded nuls, which will
> probably be useful in other places as well.
>
> Perhaps naively, I have had a go at implementing this. However, when I
> tried to add tests for my new function in the eglib test suite, I
> realised that the tests are compiled and built against the native glib
> as well, so introducing new tests against a new API results in build
> failures. You can see what I've tried to do here:
> https://github.com/weeble/mono/commit/f545596052125b90ebdd0a302fa3473d768f9d52
>
> I'm willing to keep trying at this if anyone is able to give me some
> pointers. Does eglib's API already diverge from glib? If so, are there
> any conditional #defines to allow the tests for eglib-specific
> functions to run only against eglib and not glib? If not, is it
> definitely okay to introduce divergence?
>
> Regards,
>
> Weeble.