[Mono-dev] ASCII Strings Proposal

Jon Purdy jopur at microsoft.com
Wed Jul 27 18:35:06 UTC 2016

I have written a description of my prototype implementation of adaptive ASCII/UTF-16 strings in Mono:



> For historical reasons, System.String uses the UCS-2 character encoding, that is, UTF-16 without surrogate pairs.

> However, most strings in typical .NET applications consist solely of ASCII characters, leading to wasted space: half of the bytes in a string are likely to be null bytes!

> Since strings are immutable, we can scan the character data when the string is constructed, then dynamically select an encoding, thereby saving 50% of string memory in most cases.
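
To make the construction-time scan concrete, here is a minimal sketch in C. It is not the prototype's actual code: the function names `is_all_ascii` and `payload_bytes` are hypothetical, and the sketch assumes the incoming character data is already available as an array of UTF-16 code units.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the prototype): returns true when every
 * UTF-16 code unit is in the ASCII range 0x00..0x7F. OR-ing the units
 * together keeps the loop branch-free; any bit at position 7 or above
 * pushes the accumulated value to 0x80 or more. */
static bool is_all_ascii(const uint16_t *chars, size_t length)
{
    uint16_t seen = 0;
    for (size_t i = 0; i < length; i++)
        seen |= chars[i];
    return seen < 0x80;
}

/* Hypothetical helper: bytes of character data needed once the encoding
 * has been selected: one byte per character for ASCII-only strings,
 * two bytes per code unit otherwise. */
static size_t payload_bytes(const uint16_t *chars, size_t length)
{
    return is_all_ascii(chars, length) ? length : length * sizeof(uint16_t);
}
```

Because the string is immutable, the result of the scan never changes after construction, so the check only has to be paid once per string.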

I would like to solicit feedback on this proposal from runtime developers and users alike. In particular:

- Specific objections regarding performance characteristics, compatibility issues, etc.

- Questions about unclear or underspecified parts of the proposal

- Real-world use cases that would benefit from this optimization

- Suggestions for suitable real-world benchmarks

Thank you!
