[Mono-dev] ASCII Strings Proposal

Jon Purdy jopur at microsoft.com
Wed Jul 27 18:35:06 UTC 2016


I have written a description of my prototype implementation of adaptive ASCII/UTF-16 strings in Mono:


http://www.mono-project.com/docs/advanced/runtime/docs/ascii-strings/


Introduction:


> For historical reasons, System.String uses the UCS-2 character encoding, that is, UTF-16 without surrogate pairs.


> However, most strings in typical .NET applications consist solely of ASCII characters, leading to wasted space: half of the bytes in a string are likely to be null bytes!


> Since strings are immutable, we can scan the character data when the string is constructed, then dynamically select an encoding, thereby saving 50% of string memory in most cases.
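
As a rough illustration of the idea in the quoted introduction, here is a minimal Python sketch. It is not taken from the prototype; the helper names `encode_adaptive` and `decode_adaptive` are hypothetical, and the real runtime would make this choice inside the string allocator rather than in library code.

```python
from typing import Tuple


def encode_adaptive(text: str) -> Tuple[str, bytes]:
    """Hypothetical helper: choose a compact encoding for an immutable string."""
    # Scan the character data once, at construction time.
    if all(ord(ch) < 0x80 for ch in text):
        # Every character is ASCII: store 1 byte per character.
        return "ascii", text.encode("ascii")
    # Otherwise fall back to 2 bytes per UTF-16 code unit.
    return "utf-16", text.encode("utf-16-le")


def decode_adaptive(encoding: str, data: bytes) -> str:
    """Hypothetical helper: recover the original string from the tagged bytes."""
    codec = "ascii" if encoding == "ascii" else "utf-16-le"
    return data.decode(codec)
```

Under these assumptions, `encode_adaptive("hello")` stores 5 bytes where the UTF-16 form needs 10, while a string such as `"héllo"` keeps the 2-byte representation. Because the string never changes after construction, the encoding tag chosen by the scan stays valid for the string's lifetime.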


I would like to solicit feedback on this proposal from runtime developers and users alike. In particular:


- Specific objections regarding performance characteristics, compatibility issues, etc.

- Questions about unclear or underspecified parts of the proposal

- Real-world use cases that would benefit from this optimization

- Suggestions for suitable real-world benchmarks


Thank you!
